- Practical Big Data Analytics
- Nataraj Dasgupta
The fundamental premise of Hadoop
The fundamental premise of Hadoop is that instead of attempting to perform a task on a single large machine, the task can be subdivided into smaller segments that can then be delegated to multiple smaller machines. Each of these smaller machines performs the task on its own portion of the data. Once the smaller machines have completed the work allocated to them, the individual results are aggregated to produce the final result.
Although this may appear relatively simple in theory, there are various technical considerations to bear in mind. For example:
- Is the network fast enough to collect the results from each individual server?
- Can each individual server read data fast enough from the disk?
- If one or more of the servers fail, do we have to start all over?
- If there are multiple large tasks, how should they be prioritized?
Many more such factors must be taken into account when working with a distributed architecture of this nature.
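The split, delegate, and aggregate pattern described above can be sketched in miniature with a word count, the canonical Hadoop example. This is not Hadoop itself: here the "smaller machines" are simulated by worker threads on one host, and `count_words` and `distributed_word_count` are hypothetical helper names chosen for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from collections import Counter

def count_words(chunk):
    # Each worker performs the task on its own portion of the data.
    return Counter(chunk.split())

def distributed_word_count(text, n_workers=4):
    # 1. Subdivide the task into smaller segments, one per worker.
    words = text.split()
    size = max(1, len(words) // n_workers)
    chunks = [" ".join(words[i:i + size])
              for i in range(0, len(words), size)]

    # 2. Delegate each segment; in Hadoop these would run on
    #    separate machines rather than local threads.
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partial_results = list(pool.map(count_words, chunks))

    # 3. Aggregate the individual results into the final result.
    total = Counter()
    for partial in partial_results:
        total += partial
    return total
```

Each of the technical considerations listed above maps onto a real concern in this sketch: step 2 is where network and disk throughput matter, and a failed worker in step 2 is what a framework like Hadoop handles by re-running only that segment rather than starting over.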