Big Data Analysis with Python
Processing big data in real time is challenging due to scalability, information inconsistency, and fault tolerance. Big Data Analysis with Python teaches you how to use tools that can control this data avalanche for you. With this book, you'll learn practical techniques to aggregate data into useful dimensions for posterior analysis, extract statistical measurements, and transform datasets into features for other systems. The book begins with an introduction to data manipulation in Python using pandas. You'll then get familiar with statistical analysis and plotting techniques. With multiple hands-on activities in store, you'll be able to analyze data that is distributed on several computers by using Dask. As you progress, you'll study how to aggregate data for plots when the entire data cannot be accommodated in memory. You'll also explore Hadoop (HDFS and YARN), which will help you tackle larger datasets. The book also covers Spark and explains how it interacts with other tools. By the end of this book, you'll be able to bootstrap your own Python environment, process large files, and manipulate data to generate statistics, metrics, and graphs.
Table of Contents (73 chapters)
- Cover
- Copyright Page
- Preface
- Chapter 1 The Python Data Science Stack
- Introduction
- Python Libraries and Packages
- Using Pandas
- Data Type Conversion
- Aggregation and Grouping
- Exporting Data from Pandas
- Visualization with Pandas
- Summary
- Chapter 2 Statistical Visualizations
- Introduction
- Types of Graphs and When to Use Them
- Components of a Graph
- Seaborn
- Which Tool Should Be Used?
- Types of Graphs
- Pandas DataFrames and Grouped Data
- Changing Plot Design: Modifying Graph Components
- Exporting Graphs
- Summary
- Chapter 3 Working with Big Data Frameworks
- Introduction
- Hadoop
- Spark
- Writing Parquet Files
- Handling Unstructured Data
- Summary
- Chapter 4 Diving Deeper with Spark
- Introduction
- Getting Started with Spark DataFrames
- Writing Output from Spark DataFrames
- Exploring Spark DataFrames
- Data Manipulation with Spark DataFrames
- Graphs in Spark
- Summary
- Chapter 5 Handling Missing Values and Correlation Analysis
- Introduction
- Setting up the Jupyter Notebook
- Missing Values
- Handling Missing Values in Spark DataFrames
- Correlation
- Summary
- Chapter 6 Exploratory Data Analysis
- Introduction
- Defining a Business Problem
- Translating a Business Problem into Measurable Metrics and Exploratory Data Analysis (EDA)
- Structured Approach to the Data Science Project Life Cycle
- Summary
- Chapter 7 Reproducibility in Big Data Analysis
- Introduction
- Reproducibility with Jupyter Notebooks
- Gathering Data in a Reproducible Way
- Code Practices and Standards
- Avoiding Repetition
- Summary
- Chapter 8 Creating a Full Analysis Report
- Introduction
- Reading Data in Spark from Different Data Sources
- SQL Operations on a Spark DataFrame
- Generating Statistical Measurements
- Summary
- Appendix
- Chapter 01: The Python Data Science Stack
- Chapter 02: Statistical Visualizations Using Matplotlib and Seaborn
- Chapter 03: Working with Big Data Frameworks
- Chapter 04: Diving Deeper with Spark
- Chapter 05: Missing Value Handling and Correlation Analysis in Spark
- Chapter 06: Business Process Definition and Exploratory Data Analysis
- Chapter 07: Reproducibility in Big Data Analysis
- Chapter 08: Creating a Full Analysis Report

Updated: 2021-06-11 13:46:55