Data cleansing

Let's create a CSV file that contains only the required fields, using the following steps:

1. Import the csv package:

import csv

2. Create a CSV file with only the required attributes:

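# newline='' lets the csv module manage line endings (avoids blank rows on Windows)
with open('mailbox.csv', 'w', newline='', encoding='utf-8') as outputfile:
    writer = csv.writer(outputfile)
    # Header row
    writer.writerow(['subject', 'from', 'date', 'to', 'label', 'thread'])

    # One row per message; mbox is the mailbox object loaded earlier
    for message in mbox:
        writer.writerow([
            message['subject'],
            message['from'],
            message['date'],
            message['to'],
            message['X-Gmail-Labels'],
            message['X-GM-THRID']
        ])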
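The export step can also be wrapped in a reusable function. The following is a minimal sketch, assuming the mailbox is opened with the standard library's mailbox.mbox; the function name mbox_to_csv, the FIELDS table, and the file paths are hypothetical names introduced here for illustration only.

```python
import csv
import mailbox

# (CSV column name, message header) pairs; the columns match the header row above.
FIELDS = [
    ("subject", "subject"),
    ("from", "from"),
    ("date", "date"),
    ("to", "to"),
    ("label", "X-Gmail-Labels"),
    ("thread", "X-GM-THRID"),
]


def mbox_to_csv(mbox_path, csv_path):
    """Write one CSV row per message and return the number of rows written."""
    mbox = mailbox.mbox(mbox_path)
    count = 0
    # newline='' lets the csv module control line endings itself.
    with open(csv_path, "w", newline="", encoding="utf-8") as outputfile:
        writer = csv.writer(outputfile)
        writer.writerow([column for column, _ in FIELDS])
        for message in mbox:
            # A missing header yields None, which csv writes as an empty field.
            writer.writerow([message[header] for _, header in FIELDS])
            count += 1
    mbox.close()
    return count
```

Keeping the column-to-header mapping in one list means that adding or dropping a field requires a change in only one place.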

The preceding code writes a CSV file named mailbox.csv. From here on, we can load this CSV file instead of the mbox file; it is smaller than the original dataset because it keeps only the selected fields.
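As a quick check, the exported file can be read back with the standard csv module. This is only a sketch: the helper name load_mailbox_csv is hypothetical, and it assumes the file was written as UTF-8 with the header row shown earlier.

```python
import csv


def load_mailbox_csv(csv_path="mailbox.csv"):
    """Return the exported rows as a list of dicts keyed by the header row."""
    # Assumption: the file is UTF-8 encoded and starts with a header row.
    with open(csv_path, newline="", encoding="utf-8") as inputfile:
        return list(csv.DictReader(inputfile))
```

Each returned dict maps a column name such as 'subject' or 'label' to that message's value, so a row can be inspected with, for example, rows[0]['subject'].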