Learning Data Mining with Python (Second Edition)
If you are a Python programmer who wants to get started with data mining, then this book is for you. If you are a data analyst who wants to leverage the power of Python to perform data mining efficiently, this book will also help you. No previous experience with data mining is expected.
Contents (268 chapters)
- Cover
- Copyright information
- Credits
- About the Author
- About the Reviewer
- www.PacktPub.com
- Customer Feedback
- Preface
- What this book covers
- What you need for this book
- Who this book is for
- Conventions
- Reader feedback
- Customer support
- Downloading the example code
- Errata
- Piracy
- Questions
- Getting Started with Data Mining
- Introducing data mining
- Using Python and the Jupyter Notebook
- Installing Python
- Installing Jupyter Notebook
- Installing scikit-learn
- A simple affinity analysis example
- What is affinity analysis?
- Product recommendations
- Loading the dataset with NumPy
- Downloading the example code
- Implementing a simple ranking of rules
- Ranking to find the best rules
- A simple classification example
- What is classification?
- Loading and preparing the dataset
- Implementing the OneR algorithm
- Testing the algorithm
- Summary
- Classifying with scikit-learn Estimators
- scikit-learn estimators
- Nearest neighbors
- Distance metrics
- Loading the dataset
- Moving towards a standard workflow
- Running the algorithm
- Setting parameters
- Preprocessing
- Standard pre-processing
- Putting it all together
- Pipelines
- Summary
- Predicting Sports Winners with Decision Trees
- Loading the dataset
- Collecting the data
- Using pandas to load the dataset
- Cleaning up the dataset
- Extracting new features
- Decision trees
- Parameters in decision trees
- Using decision trees
- Sports outcome prediction
- Putting it all together
- Random forests
- How do ensembles work?
- Setting parameters in Random Forests
- Applying random forests
- Engineering new features
- Summary
- Recommending Movies Using Affinity Analysis
- Affinity analysis
- Algorithms for affinity analysis
- Overall methodology
- Dealing with the movie recommendation problem
- Obtaining the dataset
- Loading with pandas
- Sparse data formats
- Understanding the Apriori algorithm and its implementation
- Looking into the basics of the Apriori algorithm
- Implementing the Apriori algorithm
- Extracting association rules
- Evaluating the association rules
- Summary
- Features and scikit-learn Transformers
- Feature extraction
- Representing reality in models
- Common feature patterns
- Creating good features
- Feature selection
- Selecting the best individual features
- Feature creation
- Principal Component Analysis
- Creating your own transformer
- The transformer API
- Implementing a Transformer
- Unit testing
- Putting it all together
- Summary
- Social Media Insight using Naive Bayes
- Disambiguation
- Downloading data from a social network
- Loading and classifying the dataset
- Creating a replicable dataset from Twitter
- Text transformers
- Bag-of-words models
- n-gram features
- Other text features
- Naive Bayes
- Understanding Bayes' theorem
- Naive Bayes algorithm
- How it works
- Applying of Naive Bayes
- Extracting word counts
- Converting dictionaries to a matrix
- Putting it all together
- Evaluation using the F1-score
- Getting useful features from models
- Summary
- Follow Recommendations Using Graph Mining
- Loading the dataset
- Classifying with an existing model
- Getting follower information from Twitter
- Building the network
- Creating a graph
- Creating a similarity graph
- Finding subgraphs
- Connected components
- Optimizing criteria
- Summary
- Beating CAPTCHAs with Neural Networks
- Artificial neural networks
- An introduction to neural networks
- Creating the dataset
- Drawing basic CAPTCHAs
- Splitting the image into individual letters
- Creating a training dataset
- Training and classifying
- Back-propagation
- Predicting words
- Improving accuracy using a dictionary
- Ranking mechanisms for word similarity
- Putting it all together
- Summary
- Authorship Attribution
- Attributing documents to authors
- Applications and use cases
- Authorship attribution
- Getting the data
- Using function words
- Counting function words
- Classifying with function words
- Support Vector Machines
- Classifying with SVMs
- Kernels
- Character n-grams
- Extracting character n-grams
- The Enron dataset
- Accessing the Enron dataset
- Creating a dataset loader
- Putting it all together
- Evaluation
- Summary
- Clustering News Articles
- Trending topic discovery
- Using a web API to get data
- Reddit as a data source
- Getting the data
- Extracting text from arbitrary websites
- Finding the stories in arbitrary websites
- Extracting the content
- Grouping news articles
- The k-means algorithm
- Evaluating the results
- Extracting topic information from clusters
- Using clustering algorithms as transformers
- Clustering ensembles
- Evidence accumulation
- How it works
- Implementation
- Online learning
- Implementation
- Summary
- Object Detection in Images using Deep Neural Networks
- Object classification
- Use cases
- Application scenario
- Deep neural networks
- Intuition
- Implementing deep neural networks
- An Introduction to TensorFlow
- Using Keras
- Convolutional Neural Networks
- GPU optimization
- When to use GPUs for computation
- Running our code on a GPU
- Setting up the environment
- Application
- Getting the data
- Creating the neural network
- Putting it all together
- Summary
- Working with Big Data
- Big data
- Applications of big data
- MapReduce
- The intuition behind MapReduce
- A word count example
- Hadoop MapReduce
- Applying MapReduce
- Getting the data
- Naive Bayes prediction
- The mrjob package
- Extracting the blog posts
- Training Naive Bayes
- Putting it all together
- Training on Amazon's EMR infrastructure
- Summary
- Next Steps...
- Getting Started with Data Mining
- Scikit-learn tutorials
- Extending the Jupyter Notebook
- More datasets
- Other Evaluation Metrics
- More application ideas
- Classifying with scikit-learn Estimators
- Scalability with the nearest neighbor
- More complex pipelines
- Comparing classifiers
- Automated Learning
- Predicting Sports Winners with Decision Trees
- More complex features
- Dask
- Research
- Recommending Movies Using Affinity Analysis
- New datasets
- The Eclat algorithm
- Collaborative Filtering
- Extracting Features with Transformers
- Adding noise
- Vowpal Wabbit
- word2vec
- Social Media Insight Using Naive Bayes
- Spam detection
- Natural language processing and part-of-speech tagging
- Discovering Accounts to Follow Using Graph Mining
- More complex algorithms
- NetworkX
- Beating CAPTCHAs with Neural Networks
- Better (worse?) CAPTCHAs
- Deeper networks
- Reinforcement learning
- Authorship Attribution
- Increasing the sample size
- Blogs dataset
- Local n-grams
- Clustering News Articles
- Clustering Evaluation
- Temporal analysis
- Real-time clusterings
- Classifying Objects in Images Using Deep Learning
- Mahotas
- Magenta
- Working with Big Data
- Courses on Hadoop
- Pydoop
- Recommendation engine
- W.I.L.L
- More resources
- Kaggle competitions
- Coursera
Updated: 2021-07-02 23:40:49