- Learning Apache Cassandra(Second Edition)
- Sandeep Yarabarla
- 2021-07-03 00:19:21
What is big data?
Big data is a relatively new term that has been gathering steam over the past few years. It refers to datasets that are too large to be stored in a traditional database system or processed by traditional data-processing pipelines. The data can be structured, semi-structured, or unstructured, and datasets in this category usually scale to terabytes or petabytes. Big data usually involves one or more of the following:
- Velocity: Data arrives at an unprecedented speed and must be dealt with in a timely manner. Typical sources include online systems, sensors, social media, and web clickstreams.
- Volume: Organizations collect data from a variety of sources, including business transactions, social media, and sensor or machine-to-machine data. This can amount to terabytes or petabytes of data. In the past, storing it would have been a problem, but new technologies have eased the burden.
- Variety: Data comes in all sorts of formats, ranging from structured data that fits in traditional databases to unstructured data (blobs) such as images, audio files, and text files.
These are known as the 3Vs of big data.
In addition to these, we tend to associate another term with big data:
- Complexity: Today's data comes from multiple sources, which makes it difficult to link, match, cleanse, and transform across systems. However, it is necessary to connect and correlate relationships, hierarchies, and multiple data linkages, or the data can quickly spiral out of control. The data must also be able to move across multiple data centers, clouds, and geographical zones.