Table of Contents (209 chapters)
- Cover
- Copyright page
- Credits
- About the Author
- About the Reviewers
- www.PacktPub.com
- eBooks, discount offers, and more
- Preface
- Why do you need this book?
- Data analysis, data science, big data – what is the big deal?
- A brief history of data analysis with Python
- A conjecture about the future
- What this book covers
- What you need for this book
- Who this book is for
- Sections
- Conventions
- Reader feedback
- Customer support
- Chapter 1. Laying the Foundation for Reproducible Data Analysis
- Introduction
- Setting up Anaconda
- Installing the Data Science Toolbox
- Creating a virtual environment with virtualenv and virtualenvwrapper
- Sandboxing Python applications with Docker images
- Keeping track of package versions and history in IPython Notebook
- Configuring IPython
- Learning to log for robust error checking
- Unit testing your code
- Configuring pandas
- Configuring matplotlib
- Seeding random number generators and NumPy print options
- Standardizing reports, code style, and data access
- Chapter 2. Creating Attractive Data Visualizations
- Introduction
- Graphing Anscombe's quartet
- Choosing seaborn color palettes
- Choosing matplotlib color maps
- Interacting with IPython Notebook widgets
- Viewing a matrix of scatterplots
- Visualizing with d3.js via mpld3
- Creating heatmaps
- Combining box plots and kernel density plots with violin plots
- Visualizing network graphs with hive plots
- Displaying geographical maps
- Using ggplot2-like plots
- Highlighting data points with influence plots
- Chapter 3. Statistical Data Analysis and Probability
- Introduction
- Fitting data to the exponential distribution
- Fitting aggregated data to the gamma distribution
- Fitting aggregated counts to the Poisson distribution
- Determining bias
- Estimating kernel density
- Determining confidence intervals for mean, variance, and standard deviation
- Sampling with probability weights
- Exploring extreme values
- Correlating variables with Pearson's correlation
- Correlating variables with the Spearman rank correlation
- Correlating a binary and a continuous variable with the point biserial correlation
- Evaluating relations between variables with ANOVA
- Chapter 4. Dealing with Data and Numerical Issues
- Introduction
- Clipping and filtering outliers
- Winsorizing data
- Measuring central tendency of noisy data
- Normalizing with the Box-Cox transformation
- Transforming data with the power ladder
- Transforming data with logarithms
- Rebinning data
- Applying logit() to transform proportions
- Fitting a robust linear model
- Taking variance into account with weighted least squares
- Using arbitrary precision for optimization
- Using arbitrary precision for linear algebra
- Chapter 5. Web Mining, Databases, and Big Data
- Introduction
- Simulating web browsing
- Scraping the Web
- Dealing with non-ASCII text and HTML entities
- Implementing association tables
- Setting up database migration scripts
- Adding a table column to an existing table
- Adding indices after table creation
- Setting up a test web server
- Implementing a star schema with fact and dimension tables
- Using HDFS
- Setting up Spark
- Clustering data with Spark
- Chapter 6. Signal Processing and Timeseries
- Introduction
- Spectral analysis with periodograms
- Estimating power spectral density with the Welch method
- Analyzing peaks
- Measuring phase synchronization
- Exponential smoothing
- Evaluating smoothing
- Using the Lomb-Scargle periodogram
- Analyzing the frequency spectrum of audio
- Analyzing signals with the discrete cosine transform
- Block bootstrapping time series data
- Moving block bootstrapping time series data
- Applying the discrete wavelet transform
- Chapter 7. Selecting Stocks with Financial Data Analysis
- Introduction
- Computing simple and log returns
- Ranking stocks with the Sharpe ratio and liquidity
- Ranking stocks with the Calmar and Sortino ratios
- Analyzing returns statistics
- Correlating individual stocks with the broader market
- Exploring risk and return
- Examining the market with the non-parametric runs test
- Testing for random walks
- Determining market efficiency with autoregressive models
- Creating tables for a stock prices database
- Populating the stock prices database
- Optimizing an equal weights two-asset portfolio
- Chapter 8. Text Mining and Social Network Analysis
- Introduction
- Creating a categorized corpus
- Tokenizing news articles in sentences and words
- Stemming, lemmatizing, filtering, and TF-IDF scores
- Recognizing named entities
- Extracting topics with non-negative matrix factorization
- Implementing a basic terms database
- Computing social network density
- Calculating social network closeness centrality
- Determining the betweenness centrality
- Estimating the average clustering coefficient
- Calculating the assortativity coefficient of a graph
- Getting the clique number of a graph
- Creating a document graph with cosine similarity
- Chapter 9. Ensemble Learning and Dimensionality Reduction
- Introduction
- Recursively eliminating features
- Applying principal component analysis for dimension reduction
- Applying linear discriminant analysis for dimension reduction
- Stacking and majority voting for multiple models
- Learning with random forests
- Fitting noisy data with the RANSAC algorithm
- Bagging to improve results
- Boosting for better learning
- Nesting cross-validation
- Reusing models with joblib
- Hierarchically clustering data
- Taking a Theano tour
- Chapter 10. Evaluating Classifiers, Regressors, and Clusters
- Introduction
- Getting classification straight with the confusion matrix
- Computing precision, recall, and F1-score
- Examining a receiver operating characteristic and the area under a curve
- Visualizing the goodness of fit
- Computing MSE and median absolute error
- Evaluating clusters with the mean silhouette coefficient
- Comparing results with a dummy classifier
- Determining MAPE and MPE
- Comparing with a dummy regressor
- Calculating the mean absolute error and the residual sum of squares
- Examining the kappa of classification
- Taking a look at the Matthews correlation coefficient
- Chapter 11. Analyzing Images
- Introduction
- Setting up OpenCV
- Applying Scale-Invariant Feature Transform (SIFT)
- Detecting features with SURF
- Quantizing colors
- Denoising images
- Extracting patches from an image
- Detecting faces with Haar cascades
- Searching for bright stars
- Extracting metadata from images
- Extracting texture features from images
- Applying hierarchical clustering on images
- Segmenting images with spectral clustering
- Chapter 12. Parallelism and Performance
- Introduction
- Just-in-time compiling with Numba
- Speeding up numerical expressions with Numexpr
- Running multiple threads with the threading module
- Launching multiple tasks with the concurrent.futures module
- Accessing resources asynchronously with the asyncio module
- Distributed processing with execnet
- Profiling memory usage
- Calculating the mean, variance, skewness, and kurtosis on the fly
- Caching with a least recently used cache
- Caching HTTP requests
- Streaming counting with the Count-min sketch
- Harnessing the power of the GPU with OpenCL
- Appendix A. Glossary
- Appendix B. Function Reference
- IPython
- Matplotlib
- NumPy
- pandas
- Scikit-learn
- SciPy
- Seaborn
- Statsmodels
- Appendix C. Online Resources
- IPython notebooks and open data
- Mathematics and statistics
- Appendix D. Tips and Tricks for Command-Line and Miscellaneous Tools
- IPython notebooks
- Command-line tools
- The alias command
- Command-line history
- Reproducible sessions
- Docker tips
- Index
Updated: 2021-07-14 11:06:29