- Hands-On Machine Learning with ML.NET
- Jarred Capellman
- 2021-06-24 16:43:25
Feature extraction and pipeline
Once your features and datasets have been obtained, the next step is to perform feature extraction. Depending on the size of your dataset and the number of features, feature extraction can be one of the most time-consuming parts of the model-building process.
For example, let's say that the aforementioned fictitious John Doe County Election Poll received 40,000 responses, each captured from a web form and stored in a SQL database. You then run a SQL query and export all of the data to a CSV file, on which your model can be trained. At a high level, this is your feature extraction and pipeline. For more complex scenarios, such as predicting malicious web content or image classification, the extraction will include reading specific bytes out of binary files. Storing the extracted data properly, so that you do not have to re-run the extraction, is crucial to iterating quickly (assuming the features did not change).
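The poll example can be sketched in a few lines. This is only an illustration under stated assumptions: it uses Python's built-in `sqlite3` module as a stand-in for the SQL database, and the table name `poll_responses`, its columns, and the helper names `build_demo_db`, `export_responses_to_csv`, and `extract_once` are all hypothetical, not taken from the book. The second helper shows one simple way to avoid re-running the extraction when the CSV file already exists.

```python
import csv
import sqlite3
from pathlib import Path


def build_demo_db():
    """Create an in-memory database with a few made-up poll responses."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE poll_responses ("
        "id INTEGER PRIMARY KEY, age INTEGER, party TEXT, likely_to_vote INTEGER)"
    )
    conn.executemany(
        "INSERT INTO poll_responses (age, party, likely_to_vote) VALUES (?, ?, ?)",
        [(34, "A", 1), (58, "B", 1), (22, "A", 0)],  # hypothetical rows
    )
    conn.commit()
    return conn


def export_responses_to_csv(conn, csv_path):
    """Run one query and write every row to csv_path; return the row count."""
    cursor = conn.execute(
        "SELECT age, party, likely_to_vote FROM poll_responses ORDER BY id"
    )
    header = [column[0] for column in cursor.description]
    rows_written = 0
    with open(csv_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(header)
        for row in cursor:
            writer.writerow(row)
            rows_written += 1
    return rows_written


def extract_once(conn, csv_path):
    """Skip the extraction if the CSV already exists (assumes unchanged features)."""
    if Path(csv_path).exists():
        return None  # reuse the stored extraction
    return export_responses_to_csv(conn, csv_path)
```

Calling `extract_once(build_demo_db(), "responses.csv")` writes the file on the first run and returns `None` on later runs, which is the "store it once, iterate quickly" idea in miniature. If the set of features changes, the stored file would need to be deleted or versioned so that it is regenerated.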
In Chapter 11, Training and Building Production Models, we will take a deep dive into ways to version your feature-extracted data and maintain control over it, especially as your dataset grows in size.