Unstructured

Unstructured data is any dataset that lacks a predefined organizational schema, unlike the table in the prior section. Spoken words, music, videos, and even books, including this one, are considered unstructured. This by no means implies that the content lacks organization. Indeed, a book has a table of contents, chapters, subchapters, and an index; in that sense, it follows a definite organization.

However, it would be futile to force every word and sentence to conform to a strict set of rules. A sentence can consist of words, numbers, punctuation marks, and so on, and it has no predefined data type the way a spreadsheet column does. To be structured, the book would need to have an exact set of characteristics in every sentence, which would be both unreasonable and impractical.

Data from social media, such as posts on Twitter, messages from friends on Facebook, and photos on Instagram, are all examples of unstructured data.

Unstructured data can be stored in various formats: as blobs (binary large objects) or, in the case of textual data, as freeform text held in a data storage medium. For textual data, technologies such as Lucene/Solr, Elasticsearch, and others are generally used to index and query the content, among other operations.
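
To make the idea of indexing and querying freeform text more concrete, here is a minimal sketch of an inverted index, the general kind of data structure that search engines such as Lucene build on. It is a toy illustration only, not how any of those products are implemented; the function names (tokenize, build_index, search) and the sample documents are hypothetical and chosen purely for this example.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """Map each token to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in tokenize(text):
            index[token].add(doc_id)
    return index


def search(index, query):
    """Return the sorted ids of documents containing every query token."""
    tokens = tokenize(query)
    if not tokens:
        return []
    # Use .get so that looking up an unknown token does not add it to the index.
    matches = set.intersection(*(index.get(t, set()) for t in tokens))
    return sorted(matches)


# Hypothetical freeform documents; none of them follows a fixed schema.
docs = {
    1: "Photos from the beach, posted on Instagram",
    2: "A message from a friend about beach photos",
    3: "Notes on structured data and spreadsheets",
}

index = build_index(docs)
print(search(index, "beach photos"))
print(search(index, "spreadsheets"))
```

The text itself stays unstructured; only the index imposes a lookup structure on top of it, which is what makes querying practical.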
