Natural Language Processing with Python Quick Start Guide
NLP in Python is among the most sought after skills among data scientists. With code and relevant case studies, this book will show how you can use industry-grade tools to implement NLP programs capable of learning from relevant data. We will explore many modern methods ranging from spaCy to word vectors that have reinvented NLP. The book takes you from the basics of NLP to building text processing applications. We start with an introduction to the basic vocabulary along with a workflow for building NLP applications. We use industry-grade NLP tools for cleaning and pre-processing text, automatic question and answer generation using linguistics, text embedding, text classifier, and building a chatbot. With each project, you will learn a new concept of NLP. You will learn about entity recognition, part of speech tagging and dependency parsing for Q and A. We use text embedding for both clustering documents and making chatbots, and then build classifiers using scikit-learn. We conclude by deploying these models as REST APIs with Flask. By the end, you will be confident building NLP applications, and know exactly what to look for when approaching new challenges.
Table of Contents (176 chapters)
- Cover page
- Title Page
- About Packt
- Why subscribe?
- Packt.com
- Contributors
- About the author
- About the reviewer
- Packt is searching for authors like you
- Preface
- Who this book is for
- What this book covers
- To get the most out of this book
- Download the example code files
- Download the color images
- Conventions used
- Get in touch
- Reviews
- Getting Started with Text Classification
- What is NLP?
- Why learn about NLP?
- You have a problem in mind
- Technical achievement
- Do something new
- Is this book for you?
- NLP workflow template
- Understanding the problem
- Understanding and preparing the data
- Quick wins – proof of concept
- Iterating and improving
- Algorithms
- Pre-processing
- Evaluation and deployment
- Evaluation
- Deployment
- Example – text classification workflow
- Launchpad – programming environment setup
- Text classification in 30 lines of code
- Getting the data
- Text to numbers
- Machine learning
- Summary
- Tidying your Text
- Bread and butter – most common tasks
- Loading the data
- Exploring the loaded data
- Tokenization
- Intuitive – split by whitespace
- The hack – splitting by word extraction
- Introducing Regexes
- spaCy for tokenization
- How does the spaCy tokenizer work?
- Sentence tokenization
- Stop words removal and case change
- Stemming and lemmatization
- spaCy for lemmatization
- -PRON-
- Case-insensitive
- Conversion – meeting to meet
- spaCy compared with NLTK and CoreNLP
- Correcting spelling
- FuzzyWuzzy
- Jellyfish
- Phonetic word similarity
- What is a phonetic encoding?
- Runtime complexity
- Cleaning a corpus with FlashText
- Summary
- Leveraging Linguistics
- Linguistics and NLP
- Getting started
- Introducing textacy
- Redacting names with named entity recognition
- Entity types
- Automatic question generation
- Part-of-speech tagging
- Creating a ruleset
- Question and answer generation using dependency parsing
- Visualizing the relationship
- Introducing textacy
- Leveling up – question and answer
- Putting it together and the end
- Summary
- Text Representations - Words to Numbers
- Vectorizing a specific dataset
- Word representations
- How do we use pre-trained embeddings?
- KeyedVectors API
- What is missing in both word2vec and GloVe?
- How do we handle Out Of Vocabulary words?
- Getting the dataset
- Training fastText embeddings
- Training word2vec embeddings
- fastText versus word2vec
- Document embedding
- Understanding the doc2vec API
- Negative sampling
- Hierarchical softmax
- Data exploration and model evaluation
- Summary
- Modern Methods for Classification
- Machine learning for text
- Sentiment analysis as text classification
- Simple classifiers
- Optimizing simple classifiers
- Ensemble methods
- Getting the data
- Reading data
- Simple classifiers
- Logistic regression
- Removing stop words
- Increasing ngram range
- Multinomial Naive Bayes
- Adding TF-IDF
- Removing stop words
- Changing fit prior to false
- Support vector machines
- Decision trees
- Random forest classifier
- Extra trees classifier
- Optimizing our classifiers
- Parameter tuning using RandomizedSearch
- GridSearch
- Ensembling models
- Voting ensembles – Simple majority (aka hard voting)
- Voting ensembles – soft voting
- Weighted classifiers
- Removing correlated classifiers
- Summary
- Deep Learning for NLP
- What is deep learning?
- Differences between modern machine learning methods
- Understanding deep learning
- Puzzle pieces
- Model
- Loss function
- Optimizer
- Putting it all together – the training loop
- Kaggle – text categorization challenge
- Getting the data
- Exploring the data
- Multiple target dataset
- Why PyTorch?
- PyTorch and torchtext
- Data loaders with torchtext
- Conventions and style
- Knowing the field
- Exploring the dataset objects
- Iterators
- BucketIterator
- BatchWrapper
- Training a text classifier
- Initializing the model
- Putting the pieces together again
- Training loop
- Prediction mode
- Converting predictions into a pandas DataFrame
- Summary
- Building your Own Chatbot
- Why chatbots as a learning example?
- Why build a chatbot?
- Quick code means word vectors and heuristics
- Figuring out the right user intent
- Use case – food order bot
- Classifying user intent
- Bot responses
- Better response personalization
- Summary
- Web Deployments
- Web deployments
- Model persistence
- Model loading and prediction
- Flask for web deployments
- Summary
- Other Books You May Enjoy
- Leave a review - let other readers know what you think

Updated: 2021-06-10 18:37:01