
Accessing Facebook data

Social network data is another great source for users interested in exploring and analyzing social interactions. The main difference between social network data and web data is that social network platforms often provide data in a semi-structured format (mostly JSON). Thus, one can access the data without first inspecting how a web page is structured. In this recipe, we will illustrate how to use rvest and rjson to read and parse data from Facebook.

Getting ready

For this recipe, you need R installed, a computer with Internet access, and a Facebook account to log in with.

How to do it…

Perform the following steps to access data from Facebook:

  1. First, we need to log in to Facebook and access the developer page (https://developers.facebook.com/):

    Figure 18: Accessing the Facebook developer page

  2. Click on Tools & Support, and select Graph API Explorer:

    Figure 19: Selecting the Graph API Explorer

  3. Next, click on Get Token, and choose Get Access Token:

    Figure 20: Selecting the Get Access Token

  4. On the User Data Permissions pane, select user_tagged_places and then click Get Access Token:

    Figure 21: Selecting permissions

  5. Copy the generated access token to the clipboard:

    Figure 22: Copying the access token

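  6. Try to access the Facebook API by using rvest (read_html replaces the older html function, which recent rvest releases no longer provide):
    > library(rvest)
    > access_token <- '<access_token>'
    > fb_data <- read_html(sprintf("https://graph.facebook.com/me/tagged_places?access_token=%s", access_token))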
    
  7. Install and load the rjson package:
    > install.packages("rjson")
    > library(rjson)
    
  8. Extract the text from fb_data and then use fromJSON to read the JSON data:
    > fb_json <- fromJSON(fb_data %>% html_text())
    
  9. Use sapply to extract the name and ID of the place from fb_json:
    > fb_place <- sapply(fb_json$data, function(e){e$place$name})
    > fb_id <- sapply(fb_json$data, function(e){e$place$id})
    
  10. Last, use data.frame to wrap the data:
    > data.frame(place = fb_place, id = fb_id)
    

How it works…

In this recipe, we cover how to retrieve social network data through Facebook's Graph API. Unlike scraping web pages, you need to obtain a Facebook access token before making any request for insight information. There are two ways to retrieve the access token: one is to use Facebook's Graph API Explorer, and the other is to create a Facebook application. In this recipe, we illustrate how to use the Graph API Explorer to obtain the access token.

Facebook's Graph API Explorer is where you can craft request URLs to access Facebook data on your own behalf. To reach the Explorer page, we first visit Facebook's developer page (https://developers.facebook.com/); the Graph API Explorer is listed under the Tools & Support drop-down menu. On the Explorer page, we select Get Access Token from the Get Token drop-down menu. A tabbed window then appears, in which you can check the permissions you want to grant. For example, we can check user_tagged_places to access the locations in which we have previously been tagged. After selecting the permissions we require, we click on Get Access Token to allow Graph API Explorer to access our data. Once these steps are complete, you will see an access token: a temporary, short-lived token that you can use to access the Facebook API.

With the access token, we can then access the Facebook API from R. First, we need an HTTP request package. As in the web scraping recipe, we can use the rvest package to make the request. We craft a request URL by appending the access_token (copied from Graph API Explorer) as a query parameter to the Facebook API endpoint. The response should contain JSON-formatted data. To read the attributes of this data, we install and load the rjson package. We can then use the fromJSON function to parse the JSON string extracted from the response.
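
The URL-crafting step can be sketched as a small helper. This is a minimal illustration, not part of rvest or rjson: the function name build_graph_url and the token value are hypothetical, and it uses only base R. Encoding the token is a precaution, since a token may contain characters such as "|" that are not safe in a query string.

```r
# Build a Graph API request URL.
# NOTE: build_graph_url is a hypothetical helper written for this sketch;
# it is not provided by rvest, rjson, or any other package.
build_graph_url <- function(endpoint, access_token,
                            base = "https://graph.facebook.com") {
  # URLencode(reserved = TRUE) percent-encodes characters such as "|"
  # and "&" that could otherwise corrupt the query string.
  sprintf("%s/%s?access_token=%s",
          base, endpoint, URLencode(access_token, reserved = TRUE))
}

# "123|abc" is a made-up token used only to show the encoding.
request_url <- build_graph_url("me/tagged_places", "123|abc")
print(request_url)
```

The resulting string can be passed to the request function in place of the sprintf call used in the recipe.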

Lastly, we extract the place names and IDs with the sapply function, and then use data.frame to combine the extracted information into a data frame. At the end of the recipe, we should see the data presented as a data frame.
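
One caveat with the extraction step: if any record lacks a field, the extractor returns NULL for it and sapply yields a list rather than a character vector. The sketch below shows one way to guard against that with vapply. The nested list is a hand-written stand-in for what fromJSON might return (the place names and IDs are invented), and place_field is a hypothetical helper.

```r
# Hand-written stand-in for the list that fromJSON() might return.
# The place names and IDs are invented for illustration.
fb_json <- list(data = list(
  list(place = list(name = "Cafe A", id = "101")),
  list(place = list(name = "Museum B", id = "102")),
  list(place = list(id = "103"))  # name deliberately missing
))

# Return one field of a record's place entry, or NA if it is absent.
# (place_field is a hypothetical helper, not a package function.)
place_field <- function(record, field) {
  value <- record$place[[field]]
  if (is.null(value)) NA_character_ else as.character(value)
}

# vapply enforces exactly one character value per record.
fb_places <- data.frame(
  place = vapply(fb_json$data, place_field, character(1), field = "name"),
  id    = vapply(fb_json$data, place_field, character(1), field = "id"),
  stringsAsFactors = FALSE
)
print(fb_places)
```

Using vapply instead of sapply makes the expected type explicit, so a malformed record raises an error or becomes NA instead of silently changing the shape of the result.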

There's more…

To learn more about Graph API, read the following official documentation from Facebook: https://developers.facebook.com/docs/reference/api/field_expansion/.

  1. Alternatively, the Rfacebook package wraps Graph API requests in R functions. First, we need to install and load the Rfacebook package:
    > install.packages("Rfacebook")
    > library(Rfacebook)
    
  2. We can then use the package's functions, such as getUsers, to retrieve user data by supplying an access token:
    > getUsers("me", "<access_token>")
    

If you would like to scrape public fan pages without logging in to Facebook every time, you can create a Facebook app to access insight information on behalf of the app:

  1. To create an authorized app token, log in to the Facebook developer page and click on Add a New App:

    Figure 23: Creating a new app

  2. You can create a new Facebook app with any name and a valid e-mail ID, provided that the name has not already been registered:

    Figure 24: Creating a new app ID

  3. Next, copy both the app ID and the app secret, and combine them into an access token of the form <APP ID>|<APP SECRET>. You can now use this token to scrape public fan page information with Graph API:

    Figure 25: Obtaining the app ID and secret

  4. As in the earlier Rfacebook example, we can then pass <APP ID>|<APP SECRET> in place of the access_token:
    > getUsers("me", "<access_token>")
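
Assembling the <APP ID>|<APP SECRET> token can be sketched as follows. This is an illustration under stated assumptions: make_app_token is a hypothetical helper, FB_APP_ID and FB_APP_SECRET are made-up environment variable names, and the ID and secret shown are placeholders. Reading the values from the environment keeps the app secret out of the script itself.

```r
# Combine an app ID and app secret into an app access token.
# NOTE: make_app_token and the FB_APP_ID / FB_APP_SECRET variable names
# are assumptions made for this sketch, not part of Rfacebook.
make_app_token <- function(app_id = Sys.getenv("FB_APP_ID"),
                           app_secret = Sys.getenv("FB_APP_SECRET")) {
  # Fail early if either part is missing or empty.
  if (!nzchar(app_id) || !nzchar(app_secret)) {
    stop("Both the app ID and the app secret must be provided.")
  }
  paste(app_id, app_secret, sep = "|")
}

# Placeholder values for illustration only.
app_token <- make_app_token("1234567890", "0123456789abcdef")
```

The resulting string can then be supplied wherever the recipe expects an access token.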
    