- Hands-On Exploratory Data Analysis with Python
- Suresh Kumar Mukhiya, Usman Ahmed
- 2021-06-24 16:44:55
Loading the dataset
First, download the dataset by following the preceding steps from the Technical requirements section. Google Takeout (https://takeout.google.com/settings/takeout) exports Gmail data in mbox format. For this chapter, I loaded my own personal email from Google Mail. For privacy reasons, I cannot share the dataset. However, I will show you different EDA operations that you can perform to analyze several aspects of your own email behavior:
1. Let's load the required libraries:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
2. When you have loaded the libraries, load the dataset:
import mailbox
mboxfile = "PATH TO DOWNLOADED MBOX FILE"
mbox = mailbox.mbox(mboxfile)
mbox
Note that it is essential that you replace the mbox file path with your own path.
The output of the preceding code is as follows:
<mailbox.mbox at 0x7f124763f5c0>
The output indicates that the mailbox object has been successfully created.
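One caveat is worth a quick check: by default, mailbox.mbox creates an empty file when the given path does not exist, so seeing the object printed does not by itself confirm that your export was found. The following is a minimal sketch of a stricter loader, not part of the original walkthrough. The helper name open_existing_mbox and the throwaway demo file are assumptions made only for illustration.

```python
import mailbox
import os
import tempfile


def open_existing_mbox(path):
    """Open an mbox file and fail loudly if the path does not exist.

    Hypothetical helper for illustration only.
    """
    # create=False makes mailbox.mbox raise mailbox.NoSuchMailboxError
    # instead of silently creating a new, empty mailbox file.
    return mailbox.mbox(path, create=False)


# Build a tiny throwaway mbox file so the sketch runs without real email.
demo_dir = tempfile.mkdtemp()
demo_path = os.path.join(demo_dir, "demo.mbox")
with open(demo_path, "w") as handle:
    handle.write(
        "From sender@example.com Thu Jun 24 16:44:55 2021\n"
        "Subject: hello\n"
        "From: sender@example.com\n"
        "To: me@example.com\n"
        "\n"
        "Body text.\n"
    )

demo_mbox = open_existing_mbox(demo_path)

# The number of messages is a quick sanity check that the file was read.
print(len(demo_mbox))
```

With your own export, you would pass your mbox path instead of demo_path. A count of zero usually means the path points to the wrong file.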
3. Next, let's see the list of available keys:
for key in mbox[0].keys():
    print(key)
The output of the preceding code is as follows:
X-GM-THRID
X-Gmail-Labels
Delivered-To
Received
X-Google-Smtp-Source
X-Received
ARC-Seal
ARC-Message-Signature
ARC-Authentication-Results
Return-Path
Received
Received-SPF
Authentication-Results
DKIM-Signature
DKIM-Signature
Subject
From
To
Reply-To
Date
MIME-Version
Content-Type
X-Mailer
X-Complaints-To
X-Feedback-ID
List-Unsubscribe
Message-ID
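The preceding output shows the header keys present in the first message of the mailbox. Some keys, such as Received and DKIM-Signature, appear more than once because an email can carry the same header several times.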
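Because Received and DKIM-Signature are listed twice in the output, it can help to see how repeated headers are read. The sketch below is an illustration under stated assumptions, not the book's code: it builds a small made-up message with the standard library's email package (the addresses and relay names are invented) and uses get_all to collect every value of a repeated header.

```python
from collections import Counter
from email import message_from_string

# A made-up message with a repeated Received header (illustrative data only).
raw_message = (
    "Received: by relay-a.example.com; Thu, 24 Jun 2021 16:44:55 +0000\n"
    "Received: by relay-b.example.com; Thu, 24 Jun 2021 16:44:50 +0000\n"
    "Subject: hello\n"
    "From: sender@example.com\n"
    "\n"
    "Body text.\n"
)
message = message_from_string(raw_message)

# keys() lists one entry per header line, so repeated headers show up twice.
key_counts = Counter(message.keys())
repeated_keys = [key for key, count in key_counts.items() if count > 1]
print(repeated_keys)

# get_all() returns every value stored under a header name as a list.
all_received = message.get_all("Received")
print(len(all_received))
```

Messages returned by mbox[0] support the same keys() and get_all() methods, so the same approach should apply to your own mailbox.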