- Fast Data Processing with Spark 2 (Third Edition)
- Krishna Sankar
Building Spark applications
Using Spark in interactive mode with the Spark shell is very good for quick prototyping; however, for developing applications we need an IDE. The choices for Spark IDEs have come a long way since the days of Spark 1.0. One can use an array of Spark IDEs for developing algorithms, data wrangling (that is, exploring data), and modeling analytics applications. As a general rule of thumb, IPython and Zeppelin are used as data exploration IDEs; the language of choice for IPython is Python, and Scala/Java for Zeppelin. This is a general observation; all of them can handle the major languages: Scala, Java, Python, and SQL. For developing in Scala and Java, the preferred IDEs are Eclipse and IntelliJ. We will mostly use the Spark shell (and occasionally IPython) in this book, as our focus is data wrangling and understanding the Spark APIs. Of course, deploying Spark applications requires compiling for Java and Scala.
Building Spark jobs is a bit trickier than building a normal application, as all dependencies have to be available on all the machines in your cluster.
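A common way to meet this requirement is to package the application and its dependencies into a single "fat" JAR and let `spark-submit` distribute it to the executors. The class name, JAR path, and master URL below are placeholder values, and the build step assumes a Maven project configured to produce a bundled JAR:

```shell
# Build the project into a single JAR that bundles the job's dependencies
# (assumes the Maven shade plugin, or similar, is configured in the pom.xml).
mvn package

# Submit the job; Spark ships the JAR to every worker in the cluster.
spark-submit \
  --class com.example.WordCount \
  --master spark://master-host:7077 \
  target/wordcount-1.0-jar-with-dependencies.jar
```

Shipping one self-contained JAR sidesteps having to pre-install each library on every cluster node.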
In this chapter, we will first look at IPython and Eclipse, then cover the process of building a Java and Scala Spark job with Maven, and learn to build Spark jobs with a non-Maven-aware build system. A reference website for building Spark is at http://spark.apache.org/docs/latest/building-spark.html.
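As a sketch of what such a Maven build declares, a minimal `pom.xml` for a Spark 2.x Scala job might look like the following; the `com.example` coordinates are illustrative, and marking Spark as `provided` keeps it out of the packaged JAR since the cluster supplies Spark at runtime:

```xml
<project xmlns="http://maven.apache.org/POM/4.0.0">
  <modelVersion>4.0.0</modelVersion>
  <groupId>com.example</groupId>
  <artifactId>wordcount</artifactId>
  <version>1.0</version>

  <dependencies>
    <!-- Spark core for Scala 2.11; scope 'provided' excludes it from
         the fat JAR because the cluster already ships Spark. -->
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-core_2.11</artifactId>
      <version>2.0.0</version>
      <scope>provided</scope>
    </dependency>
  </dependencies>
</project>
```

The same pattern extends to other Spark modules (for example, `spark-sql_2.11`) by adding further `provided`-scoped dependencies.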