- Practical Big Data Analytics
- Nataraj Dasgupta
- 2021-07-02 19:26:18
Unstructured
Unstructured data is any dataset that lacks a predefined organizational schema, unlike the table in the prior section. Spoken words, music, videos, and even books, including this one, are considered unstructured. This by no means implies that the content lacks organization. Indeed, a book has a table of contents, chapters, subchapters, and an index; in that sense, it follows a definite organization.
However, it would be futile to force every word and sentence to follow a strict set of rules. A sentence can consist of words, numbers, punctuation marks, and so on, and it does not have a predefined data type the way a spreadsheet column does. To be structured, the book would need every sentence to have exactly the same set of characteristics, which would be both unreasonable and impractical.
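The contrast can be illustrated with a minimal Python sketch. The record fields, the sample sentences, and the `token_kinds` helper are hypothetical and chosen only for illustration. A structured record has the same named, typed fields every time, whereas the mix of token kinds in a sentence changes from one sentence to the next.

```python
import re

# A structured record: every field has a fixed name and type (hypothetical example).
structured_row = {"order_id": 1001, "quantity": 3, "unit_price": 19.99}

def token_kinds(sentence):
    """Label each token in a sentence as a word, number, or punctuation mark."""
    tokens = re.findall(r"\d+(?:\.\d+)?|\w+|[^\w\s]", sentence)
    kinds = []
    for tok in tokens:
        if re.fullmatch(r"\d+(?:\.\d+)?", tok):
            kinds.append((tok, "number"))
        elif re.fullmatch(r"\w+", tok):
            kinds.append((tok, "word"))
        else:
            kinds.append((tok, "punctuation"))
    return kinds

# Two sentences yield different sequences of token kinds: there is no fixed schema.
for sentence in ["I bought 3 books.", "Wow!"]:
    print([kind for _, kind in token_kinds(sentence)])
```

Each sentence produces its own sequence of kinds and its own length, so no single column layout would fit both.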
Data from social media, such as posts on Twitter, messages from friends on Facebook, and photos on Instagram, are all examples of unstructured data.
Unstructured data can be stored in various formats. It can be held as blobs (binary large objects) or, in the case of textual data, as freeform text in a data storage medium. For textual data, technologies such as Lucene/Solr and Elasticsearch are generally used to index, query, and perform other operations on the content.
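To give a rough sense of what indexing and querying freeform text involves, here is a toy inverted index in plain Python. It is a simplified sketch and not how Lucene, Solr, or Elasticsearch are implemented or called; the function names (`tokenize`, `build_index`, `search`) and the sample documents are assumptions made for illustration.

```python
import re
from collections import defaultdict

def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())

def build_index(docs):
    """Map each term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index

def search(index, query):
    """Return the ids of documents that contain every term in the query."""
    terms = tokenize(query)
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result

# Hypothetical freeform documents keyed by id.
docs = {
    1: "Photos from the conference are on Instagram.",
    2: "New blog post: indexing freeform text",
    3: "Conference photos and freeform notes",
}
index = build_index(docs)
print(sorted(search(index, "conference photos")))  # [1, 3]
```

Production search engines add much more on top of this idea, such as relevance ranking and text analysis, but the term-to-document mapping is the piece that makes freeform text searchable without a predefined schema.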