- Hands-On Data Science with Anaconda
- Dr. Yuxing Yan, James Yan
- 2021-06-25 21:08:51
Generating Python datasets
To generate a Python dataset, we use the pandas DataFrame.to_pickle() method, which serializes a DataFrame to a pickle (.pkl) file. The dataset we plan to use is called adult.pkl, as shown in the following screenshot:

The related Python code is given here:
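
    import pandas as pd

    # Build the URL of the UCI Adult data file
    path = "http://archive.ics.uci.edu/ml/machine-learning-databases/"
    dataSet = "adult/adult.data"
    inFile = path + dataSet

    # The file has no header row, so the columns are numbered 0 to 14
    x = pd.read_csv(inFile, header=None)
    adult = pd.DataFrame(x, index=None)

    # Replace the numeric column labels with descriptive names
    adult = adult.rename(columns={
        0: 'age', 1: 'workclass', 2: 'fnlwgt', 3: 'education',
        4: 'education-num', 5: 'marital-status', 6: 'occupation',
        7: 'relationship', 8: 'race', 9: 'sex', 10: 'capital-gain',
        11: 'capital-loss', 12: 'hours-per-week', 13: 'native-country',
        14: 'class'})

    # Save the DataFrame as a pickle file (the c:/temp folder must already exist)
    adult.to_pickle("c:/temp/adult.pkl")
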
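A pickle file written with to_pickle() can be loaded back with pd.read_pickle(). The following is a minimal sketch of that round trip, not part of the original text: the DataFrame adult_sample and its two rows are made up for illustration, and a temporary directory stands in for c:/temp so that the code runs on any system.

```python
import os
import tempfile

import pandas as pd

# Hypothetical stand-in for the adult data: two made-up rows, three columns.
adult_sample = pd.DataFrame(
    {
        "age": [39, 50],
        "workclass": ["State-gov", "Private"],
        "class": ["<=50K", ">50K"],
    }
)

# Write to a temporary directory instead of c:/temp so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as tmp_dir:
    pkl_path = os.path.join(tmp_dir, "adult_sample.pkl")
    adult_sample.to_pickle(pkl_path)     # serialize the DataFrame to disk
    restored = pd.read_pickle(pkl_path)  # load it back into memory

# Compare the reloaded DataFrame with the one that was saved.
same = restored.equals(adult_sample)
print(same)
```

Unlike a CSV file, the pickle keeps the column names and data types, so no renaming or type conversion is needed after loading.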
To display the first few observations, we call x.head(), which returns the first five rows by default. The output is shown in the following screenshot:

Note that a backup copy of the dataset is available from the author's website at http://canisius.edu/~yany/data/adult.data.txt.
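The head() call described earlier can be tried without downloading anything. The sketch below is an illustration under stated assumptions: the rows are made up (they are not real records from adult.data), and each row is trimmed to three comma-separated fields to keep the example short.

```python
import io

import pandas as pd

# Made-up rows in a comma-separated layout similar to adult.data
# (illustrative only, not real records; trimmed to three fields each).
raw = io.StringIO(
    "25, Private, Bachelors\n"
    "38, Private, HS-grad\n"
    "47, Self-emp-inc, Masters\n"
    "52, State-gov, Doctorate\n"
    "31, Private, Some-college\n"
    "44, Local-gov, Bachelors\n"
    "29, Private, HS-grad\n"
)

# header=None tells pandas the data has no header row,
# so the columns are labeled 0, 1, 2.
x = pd.read_csv(raw, header=None)

first_default = x.head()  # first five rows by default
first_two = x.head(2)     # first two rows only

print(first_default)
print(first_two)
```

Because the fields are separated by a comma followed by a space, the text values keep a leading space when read this way. Passing skipinitialspace=True to pd.read_csv() is one way to strip it.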