- Natural Language Processing with Python Quick Start Guide
- Nirant Kasliwal
- 2021-06-10 18:36:38
Bread and butter – most common tasks
There are several well-known text cleaning ideas. They have all made their way into today's most popular tools, such as NLTK, Stanford CoreNLP, and spaCy. I like spaCy for two main reasons:
- It's an industry-grade NLP library, unlike NLTK, which is mainly meant for teaching.
- It has a good speed-to-performance trade-off. spaCy is written in Cython, which gives it C-like performance while you still write Python code.
spaCy is actively maintained and developed, and incorporates the best methods available for most challenges.
By the end of this section, you will be able to do the following:
- Understand tokenization and perform it yourself using spaCy
- Understand why stop word removal and case standardization work, with spaCy examples
- Differentiate between stemming and lemmatization, with spaCy lemmatization examples
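As a preview of the first two goals, here is a minimal sketch of tokenization, case standardization, and stop word removal. It assumes spaCy v3 is installed and uses a blank English pipeline (`spacy.blank("en")`), which contains only the rule-based tokenizer; the sample sentence and variable names are illustrative and not taken from the book. Lemmatization is left out here because, in spaCy, it generally needs a trained pipeline or extra lookup data.

```python
import spacy

# A blank English pipeline contains only the rule-based tokenizer,
# so no trained model has to be downloaded for this sketch.
nlp = spacy.blank("en")

# Illustrative sample sentence (an assumption, not from the book).
text = "The quick brown foxes are jumping over the lazy dogs."
doc = nlp(text)

# Tokenization: each Token keeps its original text.
tokens = [token.text for token in doc]

# Case standardization and stop word removal: lowercase every token,
# then drop stop words and punctuation.
content_words = [
    token.lower_
    for token in doc
    if not token.is_stop and not token.is_punct
]

print(tokens)
print(content_words)
```

The stop word check relies on spaCy's built-in English stop word list, so words such as "the", "are", and "over" are dropped, while the trailing period is split into its own token and then filtered out as punctuation.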