官术网_书友最值得收藏!

NoSQL data

The Not Only Structured Query Language (NoSQL) database is not a relational database; instead, data can be stored in key-value, JSON, document, columnar, or graph formats. They are frequently used in big data and real-time applications. We will learn here how to access NoSQL data using MongoDB, and we assume you have the MongoDB server configured properly and on:

  1. We will need to establish a connection with the Mongo daemon using the MongoClient object. The following code establishes the connection to the default host, localhost , and port (27017). And it gives us access to the database:
from pymongo import MongoClient
client = MongoClient()
db = client.test
  1. In this example, we try to load the cancer dataset available in scikit-learn to the Mongo database. So, we first get the breast cancer dataset and convert it to a pandas DataFrame:
from sklearn.datasets import load_breast_cancer
import pandas as pd

cancer = load_breast_cancer()
data = pd.DataFrame(cancer.data, columns=[cancer.feature_names])

data.head()
  1. Next, we convert this into the JSON format, use the json.loads() function to decode it, and insert the decoded data into the open database:
import json
data_in_json = data.to_json(orient='split')
rows = json.loads(data_in_json)
db.cancer_data.insert(rows)

  1. This will create a collection named cancer_data that contains the data. We can query the document we just created, using the cursor object:
cursor = db['cancer_data'].find({})
df = pd.DataFrame(list(cursor))
print(df)

When it comes to distributed data on the IoT, Hadoop Distributed File System (HDFS) is another popular method for providing distributed data storage and access in IoT systems. In the next section, we study how to access and store data in HDFS.

主站蜘蛛池模板: 宁南县| 竹北市| 蓝山县| 桃园市| 北流市| 昆山市| 贵溪市| 彭阳县| 安西县| 凤翔县| 甘肃省| 清水河县| 泌阳县| 鄂托克旗| 苏尼特右旗| 饶河县| 美姑县| 许昌市| 阿克陶县| 平定县| 图木舒克市| 平度市| 邹城市| 平顶山市| 天祝| 扎鲁特旗| 双江| 观塘区| 平凉市| 高台县| 买车| 内乡县| 修水县| 信丰县| 七台河市| 饶平县| 丹棱县| 水城县| 广州市| 安丘市| 东安县|