- Natural Language Processing Fundamentals
- Sohom Ghosh, Dwight Gunning
- 2021-06-11 13:42:31
Summary
In this chapter, you learned about various types of data and ways to deal with unstructured text data. Text data is usually untidy and needs to be cleaned and pre-processed. Pre-processing mainly consists of tokenization, stemming, lemmatization, and stop-word removal. After pre-processing, features are extracted from the text using methods such as bag-of-words (BoW) and term frequency-inverse document frequency (TF-IDF), which convert unstructured text data into structured numeric data. Feature engineering then creates new features from existing ones. In the last part of the chapter, we explored ways of visualizing text data, such as word clouds.
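As a compact recap, the pipeline summarized above can be sketched in plain Python. This is a minimal illustration and not the chapter's own code: the stop-word list, the suffix rules in `crude_stem`, the two-sentence corpus, and all function names are assumptions made for this example, and the TF-IDF weighting uses the simple form count × log(N / document frequency) without smoothing. Libraries such as NLTK and scikit-learn provide more complete versions of each step.

```python
import math
import re
from collections import Counter

# Toy stop-word list and suffix rules, chosen for illustration only.
STOP_WORDS = {"a", "an", "and", "are", "is", "of", "the", "to", "in"}
SUFFIXES = ("ing", "ed", "s")


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def crude_stem(token):
    """Strip one common suffix; a stand-in for a real stemmer."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(text):
    """Tokenize, drop stop words, then stem what remains."""
    return [crude_stem(t) for t in tokenize(text) if t not in STOP_WORDS]


def bag_of_words(docs):
    """Return the sorted vocabulary and one count vector per document."""
    vocab = sorted({term for doc in docs for term in doc})
    vectors = []
    for doc in docs:
        term_counts = Counter(doc)
        vectors.append([term_counts[term] for term in vocab])
    return vocab, vectors


def tf_idf(docs):
    """Weight each raw count by log(N / document frequency)."""
    vocab, count_vectors = bag_of_words(docs)
    n_docs = len(docs)
    doc_freq = [
        sum(1 for row in count_vectors if row[j] > 0) for j in range(len(vocab))
    ]
    weights = [
        [row[j] * math.log(n_docs / doc_freq[j]) for j in range(len(vocab))]
        for row in count_vectors
    ]
    return vocab, weights


corpus = ["The cats are chasing the mice.", "A cat chased a mouse."]
docs = [preprocess(text) for text in corpus]
vocab, counts = bag_of_words(docs)
_, weights = tf_idf(docs)

print(vocab)   # shared vocabulary across both documents
print(counts)  # BoW count vectors, one row per document
print(weights) # TF-IDF weights; terms present in every document get 0.0
```

In this sketch, the crude stemmer maps "chasing" and "chased" to the same stem, but "mice" and "mouse" stay distinct, which is the kind of case lemmatization is meant to handle. Terms that occur in every document receive a TF-IDF weight of zero under this unsmoothed formula, so only the distinguishing terms carry weight.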
In the next chapter, you will learn how to develop machine learning models that classify texts using the features you learned to extract in this chapter. The next chapter also introduces different sampling techniques and model evaluation metrics.