- Mastering Java for Data Science
- Alexey Grigorev
- 2021-07-02 23:44:33
Data Processing Toolbox
In the previous chapter, we discussed best practices for approaching data science problems. We looked at CRISP-DM, a methodology for data mining projects, in which one of the first steps is data preprocessing. In this chapter, we will take a closer look at how to do this in Java.
Specifically, we will cover the following topics:
- Standard Java library
- Extensions to the standard library
- Reading data from different sources such as text, HTML, JSON, and databases
- DataFrames for manipulating tabular data
Finally, we will put everything together to prepare the data for the search engine.
By the end of this chapter, you will be able to process data so that it can be used for machine learning and further analysis.
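To give a first impression of what such preprocessing can look like with nothing but the standard Java library, here is a minimal sketch that tokenizes a few lines of text and counts the tokens. The class name `PreprocessingSketch`, the method names, and the sample lines are hypothetical and chosen only for illustration; this is not code from the rest of the chapter.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

// Hypothetical illustration: text preprocessing with the standard library only.
public class PreprocessingSketch {

    // Lowercase a line, split it on runs of non-letter characters,
    // and drop the empty tokens that splitting can leave behind.
    static List<String> tokenize(String line) {
        return Arrays.stream(line.toLowerCase(Locale.ROOT).split("[^\\p{L}]+"))
                .filter(token -> !token.isEmpty())
                .collect(Collectors.toList());
    }

    // Count how often each token occurs across all lines.
    // A TreeMap keeps the result sorted by token.
    static Map<String, Long> countTokens(List<String> lines) {
        return lines.stream()
                .flatMap(line -> tokenize(line).stream())
                .collect(Collectors.groupingBy(t -> t, TreeMap::new, Collectors.counting()));
    }

    public static void main(String[] args) {
        // Made-up sample input, standing in for lines read from a text file.
        List<String> lines = Arrays.asList(
                "Data processing in Java",
                "Java for data science");
        System.out.println(countTokens(lines));
    }
}
```

The sketch only uses `java.util` and the Stream API, so it runs without any extra dependencies. Real input would usually come from a file or another source rather than a hard-coded list.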