- Go Machine Learning Projects
- Xuanyi Chew
- 2021-06-10 18:46:38
The project
What we want to do is simple: given an email, decide whether it is legitimate (which we call ham) or spam. We will be using the LingSpam corpus. The emails in that corpus are a little dated, because spammers update their techniques and vocabulary all the time. However, I chose the LingSpam corpus for a good reason: it is already nicely preprocessed. The original scope of this chapter included an introduction to preprocessing emails; however, preprocessing natural language is a subject large enough to fill an entire book, so we will use a dataset that has already been preprocessed. This allows us to focus more on the mechanics of a very elegant algorithm.
Fear not, though, as I will briefly walk through the basics of preprocessing. Be warned, however, that the complexity rises very steeply, so be prepared to be sucked into a black hole of many hours spent preprocessing natural language. At the end of this chapter, I will also recommend some libraries that are useful for preprocessing.