- Python: Advanced Predictive Analytics
- Ashish Kumar, Joseph Babcock
- 2021-07-02 20:09:21
Case 3 – reading data from a URL
Often, we need to read data directly from a web URL. The URL might serve the data itself or point to a file that contains it. For example, navigate to http://winterolympicsmedals.com/, which lists the medals won by various countries in different sports during the Winter Olympics. Now type the following address in the URL address bar: http://winterolympicsmedals.com/medals.csv.
A CSV file will be downloaded automatically. Downloading it manually, saving it, and then specifying the directory path for the read_csv method is a time-consuming process. Instead, Python allows us to read such files directly from the URL. Apart from the significant saving in time, this also makes it easy to loop over the files when there are many of them to download and read in.
A simple read_csv statement is all that is required to read the data directly from the URL:
    import pandas as pd
    medal_data = pd.read_csv('http://winterolympicsmedals.com/medals.csv')
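The remark about looping over many files can be sketched roughly as follows. This is only an illustration: the helper name read_many, the extra source column, the placeholder example.com addresses, and the two-column demo data are all assumptions made here, not part of the original text or of the medals dataset. Since read_csv accepts URLs, file paths, and file-like objects alike, the demonstration uses in-memory text so that it runs without a network connection.

```python
import io
import pandas as pd

def read_many(sources):
    """Read each CSV source into a DataFrame and stack them into one."""
    frames = []
    for name, source in sources.items():
        frame = pd.read_csv(source)   # source may be a URL, a path, or a file-like object
        frame["source"] = name        # remember which file each row came from
        frames.append(frame)
    return pd.concat(frames, ignore_index=True)

# Hypothetical usage with URLs (placeholder addresses, not real files):
# medals = read_many({
#     "part1": "http://example.com/medals_part1.csv",
#     "part2": "http://example.com/medals_part2.csv",
# })

# Offline demonstration with made-up, in-memory stand-ins for two downloaded files.
demo = read_many({
    "a": io.StringIO("Year,Medal\n1924,Gold\n"),
    "b": io.StringIO("Year,Medal\n1928,Silver\n1928,Bronze\n"),
})
print(demo.shape)  # (3, 3): three rows; Year, Medal and the added source column
```

Tagging each row with its source is one possible design choice; it keeps the combined table traceable back to the individual files.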
Alternatively, to get data from URLs, one can use a couple of Python packages that we have not used so far: csv and urllib2 (urllib.request in Python 3). Readers can consult the documentation of these packages to learn more. For now, it is sufficient to know that csv provides a range of methods to handle CSV files, while urllib2 is used to open URLs and read what they return. Here is how it can be done:
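    import csv
    import io
    from urllib.request import urlopen  # Python 3 home of Python 2's urllib2.urlopen

    url = 'http://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data'
    response = urlopen(url)
    # In Python 3, csv.reader expects text, so decode the byte stream first
    cr = csv.reader(io.TextIOWrapper(response, encoding='utf-8'))
    for row in cr:
        print(row)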
The working of the preceding code snippet can be explained in the following two points:
- The urlopen method (in the urllib2 library under Python 2, urllib.request under Python 3) creates a response that can be read in using the reader method of the csv library.
- The reader instance is an iterator, so its rows can be iterated over one at a time.
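The iterator behaviour described in the two points above can be seen without a network connection. In this minimal sketch, an in-memory text buffer stands in for the URL response, and the two sample lines are illustrative rows written in the same comma-separated layout as iris.data.

```python
import csv
import io

# Illustrative rows in the same comma-separated layout as iris.data.
sample = io.StringIO(
    "5.1,3.5,1.4,0.2,Iris-setosa\n"
    "7.0,3.2,4.7,1.4,Iris-versicolor\n"
)

cr = csv.reader(sample)
first = next(cr)    # pull a single row on demand
rest = list(cr)     # consume whatever rows are left
print(first)        # every field comes back as a string
print(len(rest))
print(list(cr))     # the iterator is exhausted, so this is an empty list
```

Because the reader is consumed as it is iterated, the rows need to be stored (for example, in a list) if they are required more than once.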
The csv module is very helpful in dealing with CSV files. Among other things, it can be used to read a dataset row by row, that is, to iterate over the dataset. It can be used to write to CSV files as well.
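As a brief sketch of the writing side, the following example writes a few rows with csv.writer and reads them back with csv.DictReader. The column names and the in-memory buffer are assumptions made for illustration; with a real file, the buffer would be replaced by something like open('out.csv', 'w', newline='').

```python
import csv
import io

rows = [
    ["sepal_length", "species"],   # header row (column names are made up)
    [5.1, "Iris-setosa"],
    [7.0, "Iris-versicolor"],
]

buffer = io.StringIO()             # stands in for a file opened for writing
writer = csv.writer(buffer)
writer.writerows(rows)             # write all rows in one call

buffer.seek(0)                     # rewind so the same buffer can be read
read_back = list(csv.DictReader(buffer))   # one dict per data row, keyed by header
print(read_back[0]["species"])
```

Note that numbers written this way come back as strings, so they have to be converted explicitly when read.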