- Learning Spark SQL
- Aurobindo Sarkar
Executing other miscellaneous processing steps
If required, we can execute a few more steps to cleanse the data further, explore additional aggregations, convert to a type-safe data structure, and so on.
We can drop the time column and aggregate each day's readings column by column using functions such as sum and average. Here, we rename the aggregated columns with a d prefix to indicate daily values.
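A minimal sketch of this aggregation is shown below, assuming the cleansed per-minute DataFrame is named cleanedDf and that its reading columns are named globalActivePower, globalReactivePower, voltage, globalIntensity, and subMetering1 through subMetering3 (the names used in the book's own code may differ):

scala> import org.apache.spark.sql.functions._
scala> // roll the per-minute readings up to one row per day
scala> val finalDayDf1 = cleanedDf.drop("time").
         groupBy("year", "month", "day").
         agg(sum("globalActivePower").as("dGlobalActivePower"),    // daily totals
             sum("globalReactivePower").as("dGlobalReactivePower"),
             avg("voltage").as("dVoltage"),                        // daily averages
             avg("globalIntensity").as("dGlobalIntensity"),
             sum("subMetering1").as("dSubMetering1"),
             sum("subMetering2").as("dSubMetering2"),
             sum("subMetering3").as("dSubMetering3"))

Sums are a natural choice for the energy-related columns, while averages suit voltage and intensity; the aggregation used for each column can of course be adjusted.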

We display a few sample records from this DataFrame:
scala> finalDayDf1.show(5)

Next, we group the readings by year and month, count the number of readings in each month, and display the results. The count for the first month is low because the data covers only about half of that month.
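A sketch of this check is shown below, run against the cleansed per-minute DataFrame (the hypothetical cleanedDf from the earlier sketch); running it against finalDayDf1 instead would count days rather than individual readings:

scala> import org.apache.spark.sql.functions.count
scala> cleanedDf.groupBy("year", "month").
         agg(count("*").as("numReadings")).   // readings captured per month
         orderBy("year", "month").
         show()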

We can also convert our DataFrame to a Dataset using a case class.
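A minimal sketch of this conversion follows, using a hypothetical HouseholdDaily case class whose fields must match the column names and types produced by the daily aggregation (here, the d-prefixed names assumed above); in spark-shell the required encoders are available because spark.implicits._ is imported automatically:

scala> case class HouseholdDaily(year: Int, month: Int, day: Int,
         dGlobalActivePower: Double, dGlobalReactivePower: Double,
         dVoltage: Double, dGlobalIntensity: Double,
         dSubMetering1: Double, dSubMetering2: Double, dSubMetering3: Double)
scala> val finalDayDs = finalDayDf1.as[HouseholdDaily]   // typed Dataset[HouseholdDaily]
scala> finalDayDs.show(5)

Working with the typed Dataset gives compile-time checking of field names and types in subsequent transformations.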

At this stage, we have completed all the steps for pre-processing the household electric consumption Dataset. We now shift our focus to processing the weather Dataset.