  • Apache Oozie Essentials
  • Jagat Jasjit Singh
  • 213 words
  • 2021-07-30 09:58:20

Chapter 1. Setting up Oozie

Oozie is a workflow scheduler system for running Apache Hadoop jobs. Oozie Workflow jobs are Directed Acyclic Graphs (DAGs) of actions. More information on DAGs can be found at https://en.wikipedia.org/wiki/Directed_acyclic_graph. Each action defines one unit of work in the job. Oozie supports running jobs of various types, such as Java, MapReduce, Pig, Hive, Sqoop, Spark, and DistCp. The output of one action can be consumed by the next action, so actions can be chained into a sequence.
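To make the idea of a DAG of actions concrete, here is a minimal Python sketch. It is not Oozie's own workflow format (Oozie workflows are defined in XML), and the action names are hypothetical, chosen only for illustration. It shows how dependencies between actions determine a valid execution order, and why the graph must be acyclic.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical workflow: each action maps to the set of actions it depends on.
# These names are illustrative only and do not come from Oozie.
workflow = {
    "import_with_sqoop": set(),
    "clean_with_pig": {"import_with_sqoop"},
    "aggregate_with_hive": {"clean_with_pig"},
    "copy_with_distcp": {"aggregate_with_hive"},
}


def execution_order(dag):
    """Return the actions in an order that respects every dependency."""
    # static_order() raises graphlib.CycleError if the graph contains a cycle,
    # which is why a workflow has to be acyclic to be schedulable.
    return list(TopologicalSorter(dag).static_order())


order = execution_order(workflow)
print(order)
```

Because each action here depends only on the one before it, the sketch yields a single chain, matching the case where the output of one action feeds the next.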

Oozie has a client-server architecture: we install the server, which stores the jobs, and use the client to submit our jobs to the server.

In this chapter, we will learn how to install Oozie both for learning purposes and for production. For learning, we will build Oozie from the source code; for production, we will use the Hadoop distribution by Hortonworks. Throughout the book, we will use the Hortonworks single-node virtual machine. If you are using a different Hadoop distribution, there is no need to worry: every distribution packages the same Oozie software, which is developed by the Apache community (http://oozie.apache.org).

After reading this chapter, we will be able to:

  • Configure Oozie in the Hortonworks distribution using Ambari
  • Install Oozie from the source code provided as a tarball on the Apache Oozie website