- coverpage
- Mastering Python for Data Science
- Credits
- About the Author
- About the Reviewers
- www.PacktPub.com
- Support files, eBooks, discount offers, and more
- Preface
- What this book covers
- What you need for this book
- Who this book is for
- Conventions
- Reader feedback
- Customer support
- Chapter 1. Getting Started with Raw Data
- The world of arrays with NumPy
- Empowering data analysis with pandas
- Data cleansing
- Data operations
- Summary
- Chapter 2. Inferential Statistics
- Various forms of distribution
- A z-score
- A p-value
- One-tailed and two-tailed tests
- Type 1 and Type 2 errors
- A confidence interval
- Correlation
- Z-test vs T-test
- The F distribution
- The chi-square distribution
- The chi-square test of independence
- ANOVA
- Summary
- Chapter 3. Finding a Needle in a Haystack
- What is data mining?
- Presenting an analysis
- Studying the Titanic
- Summary
- Chapter 4. Making Sense of Data through Advanced Visualization
- Controlling the line properties of a chart
- Creating multiple plots
- Playing with text
- Styling your plots
- Box plots
- Heatmaps
- Scatter plots with histograms
- A scatter plot matrix
- Area plots
- Bubble charts
- Hexagon bin plots
- Trellis plots
- A 3D plot of a surface
- Summary
- Chapter 5. Uncovering Machine Learning
- Different types of machine learning
- Decision trees
- Linear regression
- Logistic regression
- The naive Bayes classifier
- The k-means clustering
- Hierarchical clustering
- Summary
- Chapter 6. Performing Predictions with a Linear Regression
- Simple linear regression
- Multiple regression
- Training and testing a model
- Summary
- Chapter 7. Estimating the Likelihood of Events
- Logistic regression
- Summary
- Chapter 8. Generating Recommendations with Collaborative Filtering
- Recommendation data
- User-based collaborative filtering
- Item-based collaborative filtering
- Summary
- Chapter 9. Pushing Boundaries with Ensemble Models
- The census income dataset
- Decision trees
- Random forests
- Summary
- Chapter 10. Applying Segmentation with k-means Clustering
- The k-means algorithm and its working
- The k-means clustering with countries
- Clustering the countries
- Summary
- Chapter 11. Analyzing Unstructured Data with Text Mining
- Preprocessing data
- Creating a wordcloud
- Word and sentence tokenization
- Parts of speech tagging
- Stemming and lemmatization
- The Stanford Named Entity Recognizer
- Performing sentiment analysis on world leaders using Twitter
- Summary
- Chapter 12. Leveraging Python in the World of Big Data
- What is Hadoop?
- Python MapReduce
- File handling with Hadoopy
- Pig
- Python with Apache Spark
- Summary
- Index (updated: 2021-07-16 20:14:41)