Example – text classification workflow

The preceding process is fairly generic. What would it look like for one of the most common natural language applications: text classification?

The following flow diagram was built by Microsoft Azure, and is used here to explain how their own technology fits directly into our workflow template. It introduces several feature engineering terms that are new to us, such as unigrams, n-grams, TF (term frequency), and TF-IDF (term frequency-inverse document frequency):

The main steps in their flow diagram are as follows:

  1. Step 1: Data preparation
  2. Step 2: Text pre-processing
  3. Step 3: Feature engineering:
    • Unigrams TF-IDF extraction
    • N-grams TF extraction
  4. Step 4: Train and evaluate models
  5. Step 5: Deploy trained models as web services
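
To make steps 3 and 4 more concrete, here is a minimal sketch using scikit-learn. This is an assumption made for illustration: the flow diagram describes Azure's own tooling, not this library. The toy corpus and labels are invented, and logistic regression is simply one reasonable choice of classifier. The sketch extracts unigram TF-IDF features and bigram term counts side by side, and then trains a model on the combined features:

```python
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline, make_union

# Toy corpus, invented for illustration only
texts = [
    "the movie was great",
    "a great and moving film",
    "the movie was terrible",
    "a terrible and boring film",
]
labels = [1, 1, 0, 0]  # 1 = positive, 0 = negative

# Step 3: feature engineering with two extractors joined side by side
features = make_union(
    TfidfVectorizer(ngram_range=(1, 1)),  # unigram TF-IDF weights
    CountVectorizer(ngram_range=(2, 2)),  # bigram raw term counts (TF)
)

# Inspect the combined feature matrix: one row per document
X = features.fit_transform(texts)
print(X.shape)

# Step 4: train a classifier on the combined features
model = make_pipeline(features, LogisticRegression())
model.fit(texts, labels)

# Predict labels for unseen text
predictions = model.predict(["a great movie", "a boring movie"])
print(predictions)
```

Wrapping the feature extractors and the classifier in a single pipeline means that new text passes through exactly the same transformations at prediction time as the training data did.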

With the workflow laid out, it's time to stop talking and start programming. Let's quickly set up the environment first, and then we will build our first text classification system in 30 lines of code or less.