
Loading the dataset

First, download the dataset by following the steps in the Technical requirements section. Gmail (https://takeout.google.com/settings/takeout) provides data in mbox format. For this chapter, I loaded my own personal email from Google Mail. For privacy reasons, I cannot share the dataset. However, I will show you different EDA operations that you can perform to analyze several aspects of your email behavior:

1. Let's load the required libraries:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
Note that this analysis also uses the mailbox module. It is part of the Python standard library, so it does not need to be installed separately with pip.

2. When you have loaded the libraries, load the dataset:

import mailbox

mboxfile = "PATH TO DOWNLOADED MBOX FILE"
mbox = mailbox.mbox(mboxfile)
mbox

Note that it is essential that you replace the mbox file path with your own path.

The output of the preceding code is as follows:

<mailbox.mbox at 0x7f124763f5c0>

The output shows that a mailbox object has been created. The hexadecimal address in it will differ on your machine.
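One caveat: by default, mailbox.mbox creates an empty file when the path does not exist, so a mistyped path still produces a mailbox object. The following is a minimal sketch of a stricter check, not part of the original walkthrough. It assumes a small hypothetical sample file (the addresses and subjects are made up) written to a temporary directory so that it runs on its own. With your own data, you would point mboxfile at the Takeout export instead.

```python
import mailbox
import os
import tempfile

# Hypothetical two-message mbox, written only so this sketch is self-contained.
sample = (
    "From alice@example.com Mon Jan  1 10:00:00 2024\n"
    "From: alice@example.com\n"
    "To: bob@example.com\n"
    "Subject: Hello\n"
    "\n"
    "First message body.\n"
    "\n"
    "From carol@example.com Tue Jan  2 11:30:00 2024\n"
    "From: carol@example.com\n"
    "To: bob@example.com\n"
    "Subject: Follow-up\n"
    "\n"
    "Second message body.\n"
)
tmpdir = tempfile.mkdtemp()
mboxfile = os.path.join(tmpdir, "sample.mbox")
with open(mboxfile, "w", newline="\n") as f:
    f.write(sample)

# create=False makes a wrong path raise an error instead of
# silently creating a new, empty mailbox file.
mbox = mailbox.mbox(mboxfile, create=False)

# len() reports how many messages were found in the file.
message_count = len(mbox)
print(message_count)

# A path that does not exist now fails loudly.
missing_path = os.path.join(tmpdir, "does-not-exist.mbox")
try:
    mailbox.mbox(missing_path, create=False)
    missing_detected = False
except mailbox.NoSuchMailboxError:
    missing_detected = True
print(missing_detected)
```

If the message count is 0 for a file you expected to be full, the path is the first thing to check.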

3. Next, let's see the list of available keys:

for key in mbox[0].keys():
    print(key)

The output of the preceding code is as follows:

X-GM-THRID
X-Gmail-Labels
Delivered-To
Received
X-Google-Smtp-Source
X-Received
ARC-Seal
ARC-Message-Signature
ARC-Authentication-Results
Return-Path
Received
Received-SPF
Authentication-Results
DKIM-Signature
DKIM-Signature
Subject
From
To
Reply-To
Date
MIME-Version
Content-Type
X-Mailer
X-Complaints-To
X-Feedback-ID
List-Unsubscribe
Message-ID

The preceding output shows the list of keys that are present in the extracted dataset. 
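Some keys, such as Received and DKIM-Signature, appear more than once in the list because a message can carry several headers with the same name. The sketch below shows one way to read header values with that in mind. It is a sketch under stated assumptions: the sample message and its relay names are hypothetical, written to a temporary file so that the example runs without a real export.

```python
import mailbox
import os
import tempfile

# Hypothetical single-message mbox with a repeated Received header.
sample = (
    "From alice@example.com Mon Jan  1 10:00:00 2024\n"
    "Received: by relay-one.example.com; Mon, 1 Jan 2024 10:00:00 +0000\n"
    "Received: by relay-two.example.com; Mon, 1 Jan 2024 09:59:58 +0000\n"
    "From: alice@example.com\n"
    "To: bob@example.com\n"
    "Subject: Hello\n"
    "\n"
    "Message body.\n"
)
mboxfile = os.path.join(tempfile.mkdtemp(), "sample.mbox")
with open(mboxfile, "w", newline="\n") as f:
    f.write(sample)

mbox = mailbox.mbox(mboxfile, create=False)
message = mbox[0]

# Dictionary-style access returns the first header with that name.
subject = message["Subject"]

# A header that is absent yields None rather than raising KeyError.
mailer = message["X-Mailer"]

# get_all() returns every value for a repeated header, in file order.
received_hops = message.get_all("Received")

print(subject)
print(mailer)
print(len(received_hops))
```

Because a missing header comes back as None, any later code that builds a table from these fields should expect empty values for messages that lack a given key.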
