Slicing and dicing datasets

Our first example selects all stocks listed on the NYSE from an R dataset called marketCap.RData. The following code loads the dataset and shows its first few rows:

> con<-url("http://canisius.edu/~yany/RData/marketCap.RData") 
> load(con) 
> head(.marketCap)

The associated output is shown here:

> head(.marketCap) 
  Symbol                       Name MarketCap Exchange 
1      A Agilent Technologies, Inc. $12,852.3     NYSE 
2     AA                 Alcoa Inc. $28,234.5     NYSE 
3   AA-P                 Alcoa Inc.     $43.6     AMEX
4    AAC       Ableauctions.Com Inc      $4.3     AMEX 
5    AAI     AirTran Holdings, Inc.    $156.9     NYSE 
6    AAP     Advance Auto Parts Inc  $3,507.4     NYSE

Note the leading dot in the name .marketCap; R does not list such objects with ls() unless all.names=TRUE is set. There are several ways to choose a subset of this dataset:

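# Column names are case-sensitive and follow the head() output above
a<-.marketCap[1]        # 1st column, returned as a one-column data frame
b<-.marketCap$Symbol    # 1st column, returned as a vector
c<-.marketCap[,1:2]     # first two columns
d<-subset(.marketCap,Exchange=="NYSE")   # stocks listed on the NYSE
e<-head(.marketCap)     # first six rows
# MarketCap is printed as text such as "$12,852.3"; strip "$" and "," before comparing
cap<-as.numeric(gsub("[$,]","",.marketCap$MarketCap))
f<-subset(.marketCap,cap>200 & cap<=3000)   # market cap in (200, 3000]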
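
In case the URL cannot be reached, the same kinds of selection can be tried on a small data frame typed in by hand. The following is a minimal sketch in R, assuming the column names and the text formatting of MarketCap shown in the head() output above. The names mc, CapNum, nyse, mid, and big are made up for this illustration, and the thresholds 100 and 5000 are chosen only so that the six sample rows give a non-empty result.

```r
# Toy data frame mirroring the six rows printed by head(.marketCap);
# it is not the full dataset.
mc <- data.frame(
  Symbol    = c("A", "AA", "AA-P", "AAC", "AAI", "AAP"),
  Name      = c("Agilent Technologies, Inc.", "Alcoa Inc.", "Alcoa Inc.",
                "Ableauctions.Com Inc", "AirTran Holdings, Inc.",
                "Advance Auto Parts Inc"),
  MarketCap = c("$12,852.3", "$28,234.5", "$43.6", "$4.3", "$156.9", "$3,507.4"),
  Exchange  = c("NYSE", "NYSE", "AMEX", "AMEX", "NYSE", "NYSE"),
  stringsAsFactors = FALSE
)

# Strip "$" and "," so the market cap can be compared as a number
# (CapNum is a hypothetical helper column).
mc$CapNum <- as.numeric(gsub("[$,]", "", mc$MarketCap))

nyse <- subset(mc, Exchange == "NYSE")               # rows listed on the NYSE
mid  <- subset(mc, CapNum > 100 & CapNum <= 5000)    # rows with cap in (100, 5000]
big  <- mc[mc$CapNum > 3000, c("Symbol", "CapNum")]  # row filter plus column choice

print(nyse)
print(mid)
print(big)
```

Inside subset(), column names can be used directly, so Exchange == "NYSE" is equivalent to mc$Exchange == "NYSE". The bracket form in the last selection filters rows and picks columns in one step.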

A Python version of this dataset can be downloaded from http://canisius.edu/~yany/python/marketCap.pkl.
