Title: PySpark Cookbook
Authors: Denny Lee, Tomasz Drabas
Last updated: 2021-06-18 19:06:36

Reading data from files

In this recipe, we create an RDD by reading a local file in PySpark. Before you can create RDDs in Apache Spark, you need to install Spark as described in the previous chapter. You can run the code samples in the PySpark shell or in a Jupyter notebook. Although this recipe reads a local file, similar syntax applies to files stored in Hadoop, AWS S3, Azure WASBs, or Google Cloud Storage.
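As a minimal sketch of the idea (not the recipe's own listing), the helpers below read a local text file into an RDD with `SparkContext.textFile`. The helper names `local_file_uri` and `read_text_rdd`, the file name `airports.txt`, and the partition count are hypothetical and chosen only for illustration. The sketch assumes a SparkContext is already available as `sc`, as it is in the PySpark shell.

```python
from pathlib import Path


def local_file_uri(path):
    """Turn a local path into a file:// URI.

    An explicit file:// scheme keeps Spark from resolving the path
    against another default filesystem (for example, HDFS).
    """
    return Path(path).resolve().as_uri()


def read_text_rdd(sc, uri, min_partitions=None):
    """Create an RDD with one element per line of the text file at `uri`.

    `sc` is an existing SparkContext; `min_partitions` is passed through
    to textFile as a hint for the minimum number of partitions.
    """
    return sc.textFile(uri, minPartitions=min_partitions)
```

In the PySpark shell, usage might look like `rdd = read_text_rdd(sc, local_file_uri("airports.txt"), min_partitions=4)` followed by `rdd.take(5)` to inspect the first few lines. For other storage systems, the same `textFile` call would take a URI with a different scheme (such as `hdfs://` or `s3a://`), assuming the matching connector is configured on the cluster.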
