
Summary

In this chapter, we introduced you to Spark SQL, SparkSession (the primary entry point to Spark SQL), and the Spark SQL interfaces (RDDs, DataFrames, and Datasets). We then described some of the internals of Spark SQL, including the Catalyst and Project Tungsten-based optimizations. Finally, we explored how to use Spark SQL in streaming applications and introduced the concept of Structured Streaming. The primary goal of this chapter was to give you an overview of Spark SQL while getting you comfortable with the Spark environment through hands-on sessions using public datasets.

In the next chapter, we will get into the details of using Spark SQL to explore the structured and semi-structured data typical of big data applications.
