Title: Natural Language Processing with Python Quick Start Guide
Author: Nirant Kasliwal
Last updated: 2021-06-10 18:36:36

Example – text classification workflow

The preceding process is fairly generic. What would it look like for one of the most common natural language applications – text classification?

The following flow diagram was built by Microsoft Azure, and is used here to show how their own technology fits directly into our workflow template. Its feature engineering step uses several terms that may be new to you, such as unigrams, n-grams, TF (term frequency), and TF-IDF (term frequency-inverse document frequency).

The main steps in their flow diagram are as follows:

Step 1: Data preparation
Step 2: Text pre-processing
Step 3: Feature engineering:
- Unigrams TF-IDF extraction
- N-grams TF extraction
Step 4: Train and evaluate models
Step 5: Deploy trained models as web services
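
To make the feature engineering terms in Step 3 concrete, here is a minimal sketch in plain Python. It is not Azure's implementation: the toy corpus, the whitespace tokenizer, and the function names are all assumptions made for illustration, and the IDF formula used is the textbook log(N / df) form. Libraries often apply smoothed or normalized variants, so their numbers will differ.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def ngrams(tokens, n):
    """Return every run of n consecutive tokens as a tuple."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def term_frequencies(tokens, n=1):
    """TF: raw count of each n-gram within a single document."""
    return Counter(ngrams(tokens, n))


def unigram_tfidf(docs_tokens):
    """TF-IDF per document for single words, with idf = log(N / df)."""
    n_docs = len(docs_tokens)
    # Document frequency: in how many documents does each word appear?
    doc_freq = Counter()
    for tokens in docs_tokens:
        doc_freq.update(set(tokens))
    scores = []
    for tokens in docs_tokens:
        tf = Counter(tokens)
        scores.append(
            {word: count * math.log(n_docs / doc_freq[word])
             for word, count in tf.items()}
        )
    return scores


# Hypothetical toy corpus, invented for this example.
corpus = [
    "the movie was great",
    "the movie was terrible",
    "great acting and a great plot",
]
docs = [tokenize(text) for text in corpus]

bigram_tf = term_frequencies(docs[2], n=2)   # N-grams TF extraction (n = 2)
tfidf_scores = unigram_tfidf(docs)           # Unigrams TF-IDF extraction
```

In this sketch, a word that appears in only one document (such as "terrible") scores higher within that document than a word shared across documents (such as "the"), which is the intuition behind weighting term counts by inverse document frequency.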

It's time to stop talking and start programming. Let's quickly set up the environment first, and then we will build our first text classification system in 30 lines of code or less.
