- Natural Language Processing with Java and LingPipe Cookbook
- Breck Baldwin Krishna Dayanidhi
- 2021-08-05 17:12:51
Introduction
An important part of building NLP systems is to work with the appropriate unit for processing. This chapter addresses the abstraction layer associated with the word level of processing. This is called tokenization, which amounts to grouping adjacent characters into meaningful chunks in support of classification, entity finding, and the rest of NLP.
LingPipe provides tokenizers for a broad range of needs, not all of which are covered in this book. Look at the Javadoc for tokenizers that do stemming, Soundex (tokens based on what English words sound like), and more.
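To make the idea of grouping adjacent characters into chunks concrete, here is a minimal sketch in plain Java. It is not LingPipe's API and not the tokenizer the book uses; the class name `SimpleTokenizer` and its rule (runs of letters or digits form one token, every other non-whitespace character is a token of its own, whitespace is discarded) are assumptions chosen only for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration class; not part of LingPipe.
public class SimpleTokenizer {

    // Groups adjacent letters/digits into one token. Any other
    // non-whitespace character becomes a single-character token.
    public static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (Character.isLetterOrDigit(c)) {
                // Still inside a word-like chunk: keep accumulating.
                current.append(c);
            } else {
                // A boundary: emit the chunk collected so far, if any.
                if (current.length() > 0) {
                    tokens.add(current.toString());
                    current.setLength(0);
                }
                // Punctuation is kept as its own token; whitespace is dropped.
                if (!Character.isWhitespace(c)) {
                    tokens.add(String.valueOf(c));
                }
            }
        }
        // Emit a trailing chunk that ran to the end of the input.
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Prints: [LingPipe, ', s, tokenizers, are, fun, .]
        System.out.println(tokenize("LingPipe's tokenizers are fun."));
    }
}
```

Even this toy rule forces a decision about the possessive in "LingPipe's", which is split into three tokens here. Choices of that kind are what distinguish one tokenizer from another, and they affect downstream steps such as classification and entity finding.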