Table of Contents (120 chapters)
- Cover
- Copyright Information
- Credits
- About the Author
- Acknowledgements
- About the Reviewer
- www.PacktPub.com
- Preface
- Chapter 1. Getting Started with Hadoop 2.X
- Introduction
- Installing a single-node Hadoop Cluster
- Installing a multi-node Hadoop cluster
- Adding new nodes to existing Hadoop clusters
- Executing the balancer command for uniform data distribution
- Entering and exiting from the safe mode in a Hadoop cluster
- Decommissioning DataNodes
- Performing benchmarking on a Hadoop cluster
- Chapter 2. Exploring HDFS
- Introduction
- Loading data from a local machine to HDFS
- Exporting HDFS data to a local machine
- Changing the replication factor of an existing file in HDFS
- Setting the HDFS block size for all the files in a cluster
- Setting the HDFS block size for a specific file in a cluster
- Enabling transparent encryption for HDFS
- Importing data from another Hadoop cluster
- Recycling deleted data from trash to HDFS
- Saving compressed data in HDFS
- Chapter 3. Mastering Map Reduce Programs
- Introduction
- Writing the Map Reduce program in Java to analyze web log data
- Executing the Map Reduce program in a Hadoop cluster
- Adding support for a new writable data type in Hadoop
- Implementing a user-defined counter in a Map Reduce program
- Map Reduce program to find the top X
- Map Reduce program to find distinct values
- Map Reduce program to partition data using a custom partitioner
- Writing Map Reduce results to multiple output files
- Performing Reduce side Joins using Map Reduce
- Unit testing the Map Reduce code using MRUnit
- Chapter 4. Data Analysis Using Hive, Pig, and Hbase
- Introduction
- Storing and processing Hive data in a sequential file format
- Storing and processing Hive data in the RC file format
- Storing and processing Hive data in the ORC file format
- Storing and processing Hive data in the Parquet file format
- Performing FILTER By queries in Pig
- Performing Group By queries in Pig
- Performing Order By queries in Pig
- Performing JOINS in Pig
- Writing a user-defined function in Pig
- Analyzing web log data using Pig
- Performing the Hbase operation in CLI
- Performing Hbase operations in Java
- Executing the MapReduce programming with an Hbase Table
- Chapter 5. Advanced Data Analysis Using Hive
- Introduction
- Processing JSON data in Hive using JSON SerDe
- Processing XML data in Hive using XML SerDe
- Processing Hive data in the Avro format
- Writing a user-defined function in Hive
- Performing table joins in Hive
- Executing map side joins in Hive
- Performing context Ngram in Hive
- Call Data Record Analytics using Hive
- Twitter sentiment analysis using Hive
- Implementing Change Data Capture using Hive
- Multiple table inserting using Hive
- Chapter 6. Data Import/Export Using Sqoop and Flume
- Introduction
- Importing data from RDBMS to HDFS using Sqoop
- Exporting data from HDFS to RDBMS
- Using query operator in Sqoop import
- Importing data using Sqoop in compressed format
- Performing Atomic export using Sqoop
- Importing data into Hive tables using Sqoop
- Importing data into HDFS from Mainframes
- Incremental import using Sqoop
- Creating and executing Sqoop job
- Importing data from RDBMS to Hbase using Sqoop
- Importing Twitter data into HDFS using Flume
- Importing data from Kafka into HDFS using Flume
- Importing web logs data into HDFS using Flume
- Chapter 7. Automation of Hadoop Tasks Using Oozie
- Introduction
- Implementing a Sqoop action job using Oozie
- Implementing a Map Reduce action job using Oozie
- Implementing a Java action job using Oozie
- Implementing a Hive action job using Oozie
- Implementing a Pig action job using Oozie
- Implementing an e-mail action job using Oozie
- Executing parallel jobs using Oozie (fork)
- Scheduling a job in Oozie
- Chapter 8. Machine Learning and Predictive Analytics Using Mahout and R
- Introduction
- Setting up the Mahout development environment
- Creating an item-based recommendation engine using Mahout
- Creating a user-based recommendation engine using Mahout
- Predictive analytics on Bank Data using Mahout
- Text data clustering using K-Means using Mahout
- Population Data Analytics using R
- Twitter Sentiment Analytics using R
- Performing Predictive Analytics using R
- Chapter 9. Integration with Apache Spark
- Introduction
- Running Spark standalone
- Running Spark on YARN
- Performing Olympics Athletes analytics using the Spark Shell
- Creating Twitter trending topics using Spark Streaming
- Twitter trending topics using Spark streaming
- Analyzing Parquet files using Spark
- Analyzing JSON data using Spark
- Processing graphs using GraphX
- Conducting predictive analytics using Spark MLlib
- Chapter 10. Hadoop Use Cases
- Introduction
- Call Data Record analytics
- Web log analytics
- Sensitive data masking and encryption using Hadoop
- Index

Last updated: 2021-07-09 20:03:08