Java applications are one of the job types that can be run as part of an Oozie workflow. Here, I focus on how to create an Oozie workflow that executes a Java action. The Oozie Java application folder has three components:
- lib folder
- workflow.xml file
- job.properties file
We shall take them one by one:
Steps to create the 'lib' folder:
The lib folder should contain all the jars/files required to compile your Java class, along with the compiled class that we create here.
- To start with, place your .java file in a directory structure that matches its package (e.g. place Fetch.java in /com/jayati/sampleapp/Fetch.java if Fetch.java belongs to package com.jayati.sampleapp) and compile it.
- Create a folder with the desired application name (assumed here to be appName) and create a lib folder inside it. Copy the directory structure created in the first step to appName/lib and remove the .java file from it. We now have /appName/lib/com/jayati/sampleapp/Fetch.class
- Place all the jars/files that were required to compile your Java class in the lib folder, alongside the com folder.
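The package-to-folder mapping used in the steps above can be sketched in a few lines of Java. This is only an illustration: the `LibLayout` class and its `classFilePath` method are hypothetical helpers, not part of Oozie, and they simply build the path where the compiled class is expected to end up.

```java
// Hypothetical helper (not part of Oozie): shows how a fully qualified
// class name maps to the location of its .class file under appName/lib.
class LibLayout {

    /** Builds the relative path of the compiled class inside the application folder. */
    static String classFilePath(String appName, String className) {
        // Dots in the package name become directory separators.
        return appName + "/lib/" + className.replace('.', '/') + ".class";
    }

    public static void main(String[] args) {
        // Prints the location described in the second step above.
        System.out.println(classFilePath("appName", "com.jayati.sampleapp.Fetch"));
    }
}
```

If the path printed here does not match where your Fetch.class actually sits, the package declaration and the folder structure are probably out of sync.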
workflow.xml
The workflow.xml file defines the sequence of actions that are executed in the workflow. In this example, we have just one Java action to execute.
For a Java action, we need to specify the job tracker, the name node and the Java main class name. Assuming the action is named 'java-node', the .xml file would look like:
<workflow-app xmlns="uri:oozie:workflow:0.1" name="appName-wf">
    <start to="java-node"/>
    <action name="java-node">
        <java>
            <job-tracker>localhost:9001</job-tracker>
            <name-node>hdfs://localhost:9000</name-node>
            <configuration>
                <property>
                    <name>mapred.job.queue.name</name>
                    <value>default</value>
                </property>
            </configuration>
            <main-class>com.jayati.sampleapp.Fetch</main-class>
            <java-opts>-Denv=stg -DPP=DB_PASSPHRASE</java-opts>
        </java>
        <ok to="end"/>
        <error to="fail"/>
    </action>
    <kill name="fail">
        <message>Java failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
    </kill>
    <end name="end"/>
</workflow-app>
You'll need to replace the jobTracker and nameNode port numbers if they differ in your Hadoop configuration, and then place this XML file in the appName/ folder.
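For reference, here is a minimal sketch of what the Fetch class named in main-class might look like. The original Fetch.java is not shown in this post, so the body below is an assumption: it only demonstrates that the -Denv and -DPP values from java-opts arrive as ordinary JVM system properties, and the `requiredProperty` helper is a hypothetical name.

```java
package com.jayati.sampleapp;

// A sketch only: the real Fetch class would do its actual work in main().
public class Fetch {

    /** Reads a JVM system property and fails loudly if it is missing or empty. */
    public static String requiredProperty(String key) {
        String value = System.getProperty(key);
        if (value == null || value.isEmpty()) {
            throw new IllegalStateException("Missing system property: " + key);
        }
        return value;
    }

    public static void main(String[] args) {
        // Values passed as -Denv=... and -DPP=... in <java-opts>.
        String env = requiredProperty("env");
        String passphrase = requiredProperty("PP"); // validated here, never printed

        // Placeholder for the real work of the application (assumption).
        System.out.println("Running Fetch in environment: " + env);

        // Returning normally leads to the "ok" transition; an uncaught
        // exception leads to the "error" transition. Avoid System.exit().
    }
}
```

Throwing an exception for a missing property means a misconfigured java-opts line sends the workflow to the 'fail' kill node instead of continuing silently.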
job.properties
This file lists the values of all the variables used in workflow.xml, such as jobTracker and nameNode. Since we have hard-coded those values in workflow.xml, our job.properties consists of a single line:
oozie.wf.application.path=hdfs://localhost:9000/hadoopfs_path/appName
where hadoopfs_path is the HDFS path of the folder in which this application folder will be placed. Copy the above file to appName/.
This completes the workflow containing a single Java action. To run this application on Oozie, you can refer to one of my previous blog posts, "Try On Oozie".
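If you want to double-check the file before submitting the job, a small sanity check along these lines may help. It is a sketch using only the standard java.util.Properties class; `JobPropertiesCheck` and `applicationPath` are hypothetical names, and the file contents are passed in as a string to keep the example self-contained.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical sanity check (not part of Oozie): parses job.properties text
// and returns the workflow application path it declares.
class JobPropertiesCheck {

    static String applicationPath(String fileContents) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(fileContents));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        String path = props.getProperty("oozie.wf.application.path");
        if (path == null) {
            throw new IllegalArgumentException("oozie.wf.application.path is not set");
        }
        return path;
    }

    public static void main(String[] args) {
        String contents =
            "oozie.wf.application.path=hdfs://localhost:9000/hadoopfs_path/appName\n";
        System.out.println(applicationPath(contents));
    }
}
```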