- Natural Language Processing with Java and LingPipe Cookbook
- Breck Baldwin Krishna Dayanidhi
- 2021-08-05 17:12:51
Introduction
An important part of building NLP systems is to work with the appropriate unit for processing. This chapter addresses the abstraction layer associated with the word level of processing. This is called tokenization, which amounts to grouping adjacent characters into meaningful chunks in support of classification, entity finding, and the rest of NLP.
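To make "grouping adjacent characters into meaningful chunks" concrete, here is a minimal sketch in plain Java. It does not use LingPipe's tokenizer classes; the class name `TokenizeSketch`, the method `tokenize`, and the regular expression are assumptions chosen for illustration only. It treats each run of letters or digits as one token and each remaining non-whitespace character as its own token.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration class; not part of LingPipe.
public class TokenizeSketch {

    // Assumed token definition: a run of letters/digits,
    // or a single character that is neither whitespace nor a letter/digit.
    private static final Pattern TOKEN =
        Pattern.compile("[\\p{L}\\p{N}]+|[^\\s\\p{L}\\p{N}]");

    // Scans the text left to right and collects each match as a token.
    public static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        Matcher matcher = TOKEN.matcher(text);
        while (matcher.find()) {
            tokens.add(matcher.group());
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Whitespace is dropped; punctuation becomes separate tokens.
        System.out.println(tokenize("Tokenize this, please."));
    }
}
```

With this token definition, the input `Tokenize this, please.` yields the five tokens `Tokenize`, `this`, `,`, `please`, and `.`. A different definition (for example, keeping apostrophes inside words) would produce different units, which is why choosing the tokenizer matters for downstream tasks.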
LingPipe provides tokenizers for a broad range of needs beyond those covered in this book. Look at the Javadoc for tokenizers that do stemming, Soundex (tokens based on what English words sound like), and more.