Oozie - Running a Coordinator Job

Posted 02 17, 2014 20:06, Filed under: BigData/Oozie


# Oozie coordinator


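> coordinator.xml : schedules periodic runs that are triggered when specific conditions are met
> workflow.xml : defines the actual work that Oozie executes
> job.properties : sets the location of the Oozie job (workflow) or coordinator, plus global variables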


# Oozie Docs
http://oozie.apache.org/docs/3.3.2/index.html

# Environment setup
To run oozie commands without passing the -oozie option and the Oozie URL every time,
add or uncomment the OOZIE_URL line in $OOZIE_HOME/conf/oozie-env.sh
[hadoop@master /oozie-3.3.2/conf]$ more oozie-env.sh
export OOZIE_URL="http://master:11000/oozie"

> Example invocation when OOZIE_URL is not set
oozie job -oozie http://master:11000/oozie -info 0000000-140217181224112-oozie-hdad-C
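
When the CLI is called from a script, the same fallback can be made explicit. The following is a minimal Python sketch, not part of Oozie itself: the helper name `oozie_command` and its `default_url` parameter are hypothetical, and the default URL is simply the one used in this post.

```python
import os


def oozie_command(args, env=None, default_url="http://master:11000/oozie"):
    """Build the argv list for an `oozie` CLI call.

    `args` is the sub-command followed by its options, for example
    ["job", "-info", "<job id>"]. If OOZIE_URL is not set in `env`,
    the URL is passed explicitly with the -oozie option.
    """
    env = os.environ if env is None else env
    argv = ["oozie"] + list(args)
    if not env.get("OOZIE_URL"):
        # Insert right after the sub-command, as in the example above.
        argv[2:2] = ["-oozie", default_url]
    return argv
```

The returned list can be handed to `subprocess.run` as-is, so the calling script does not need to care whether OOZIE_URL has been exported.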

> Job ID format
0000005-140217141522326-oozie-hdad-W (Workflow)
0000000-140217181224112-oozie-hdad-C (Coordinator)
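
The two sample IDs can be taken apart with a small parser. This is only a sketch based on the IDs shown in this post: it assumes that the first field is a 7-digit counter, that the 15-digit field is a timestamp in yyMMddHHmmssSSS form, that the middle part (here `oozie-hdad`) is an opaque system identifier, and that coordinator action IDs append `@<n>`. The names `OozieJobId` and `parse_job_id` are hypothetical.

```python
import re
from datetime import datetime
from typing import NamedTuple, Optional

_JOB_ID = re.compile(r"^(\d{7})-(\d{15})-(.+)-([WC])(?:@(\d+))?$")
_KINDS = {"W": "workflow", "C": "coordinator"}


class OozieJobId(NamedTuple):
    sequence: int           # counter at the front of the ID
    stamp: datetime         # assumed meaning of the 15-digit field
    system_id: str          # opaque middle part, e.g. "oozie-hdad"
    kind: str               # "workflow" or "coordinator"
    action: Optional[int]   # number after "@", if present


def parse_job_id(job_id):
    match = _JOB_ID.match(job_id)
    if match is None:
        raise ValueError(f"unrecognized job ID: {job_id!r}")
    seq, stamp, system_id, suffix, action = match.groups()
    # The first 12 digits are yyMMddHHmmss, the last 3 are milliseconds.
    when = datetime.strptime(stamp[:12], "%y%m%d%H%M%S")
    when = when.replace(microsecond=int(stamp[12:]) * 1000)
    return OozieJobId(
        int(seq), when, system_id, _KINDS[suffix],
        int(action) if action else None,
    )
```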


# Oozie command options
[hadoop@master /oozie-3.3.2]$ oozie help
usage:
      the env variable 'OOZIE_URL' is used as default value for the '-oozie' option
      the env variable 'OOZIE_TIMEZONE' is used as default value for the '-timezone' option
      custom headers for Oozie web services can be specified using '-Dheader:NAME=VALUE'

      oozie help : display usage for all commands or specified command

      oozie version : show client version

      oozie job <OPTIONS> : job operations
                -action <arg>          coordinator rerun on action ids (requires -rerun);
                                       coordinator log retrieval on action ids (requires -log)
                -auth <arg>            select authentication type [SIMPLE|KERBEROS]
                -change <arg>          change a coordinator job
                -config <arg>          job configuration file '.xml' or '.properties'
                -configcontent <arg>   job configuration
                -coordinator <arg>     bundle rerun on coordinator names (requires -rerun)
                -D <property=value>    set/override value for given property
                -date <arg>            coordinator/bundle rerun on action dates (requires -rerun);
                                       coordinator log retrieval on action dates (requires -log)
                -debug                 Use debug mode to see debugging statements on stdout
                -definition <arg>      job definition
                -doas <arg>            doAs user, impersonates as the specified user
                -dryrun                Dryrun a workflow (since 3.3.2) or coordinator (since 2.0)
                                       job without actually executing it
                -filter <arg>          status=<S1>[;status=<S2>]* (All Coordinator actions
                                       satisfying any one of the status filters will be retreived.
                                       Currently, only supported for Coordinator job)
                -info <arg>            info of a job
                -kill <arg>            kill a job
                -len <arg>             number of actions (default TOTAL ACTIONS, requires -info)
                -localtime             use local time (same as passing your time zone to -timezone).
                                       Overrides -timezone option
                -log <arg>             job log
                -nocleanup             do not clean up output-events of the coordiantor rerun
                                       actions (requires -rerun)
                -offset <arg>          job info offset of actions (default '1', requires -info)
                -oozie <arg>           Oozie URL
                -refresh               re-materialize the coordinator rerun actions (requires
                                       -rerun)
                -rerun <arg>           rerun a job  (coordinator requires -action or -date, bundle
                                       requires -coordinator or -date)
                -resume <arg>          resume a job
                -run                   run a job
                -start <arg>           start a job
                -submit                submit a job
                -suspend <arg>         suspend a job
                -timezone <arg>        use time zone with the specified ID (default GMT).
                                       See 'oozie info -timezones' for a list
                -value <arg>           new endtime/concurrency/pausetime value for changing a
                                       coordinator job
                -verbose               verbose mode

      oozie jobs <OPTIONS> : jobs status
                 -auth <arg>       select authentication type [SIMPLE|KERBEROS]
                 -bulk <arg>       key-value pairs to filter bulk jobs response. e.g.
                                   bundle=<B>\;coordinators=<C>\;actionstatus=<S>\;startcreatedtime=
                                   <SC>\;endcreatedtime=<EC>\;startscheduledtime=<SS>\;endscheduledt
                                   ime=<ES>\; coordinators and actionstatus can be multiple comma
                                   separated valuesbundle and coordinators are 'names' of those
                                   jobs. Bundle name is mandatory, other params are optional
                 -doas <arg>       doAs user, impersonates as the specified user
                 -filter <arg>
                                   user=<U>\;name=<N>\;group=<G>\;status=<S>\;frequency=<F>\;unit=<M
                                   > (Valid unit values are 'months', 'days', 'hours' or 'minutes'.)
                 -jobtype <arg>    job type ('Supported in Oozie-2.0 or later versions ONLY -
                                   'coordinator' or 'bundle' or 'wf'(default))
                 -len <arg>        number of jobs (default '100')
                 -localtime        use local time (same as passing your time zone to -timezone).
                                   Overrides -timezone option
                 -offset <arg>     jobs offset (default '1')
                 -oozie <arg>      Oozie URL
                 -timezone <arg>   use time zone with the specified ID (default GMT).
                                   See 'oozie info -timezones' for a list
                 -verbose          verbose mode

      oozie admin <OPTIONS> : admin operations
                  -auth <arg>         select authentication type [SIMPLE|KERBEROS]
                  -doas <arg>         doAs user, impersonates as the specified user
                  -oozie <arg>        Oozie URL
                  -queuedump          show Oozie server queue elements
                  -status             show the current system status
                  -systemmode <arg>   Supported in Oozie-2.0 or later versions ONLY. Change oozie
                                      system mode [NORMAL|NOWEBSERVICE|SAFEMODE]
                  -version            show Oozie server build version

      oozie validate <ARGS> : validate a workflow XML file

      oozie sla <OPTIONS> : sla operations (Supported in Oozie-2.0 or later)
                -auth <arg>     select authentication type [SIMPLE|KERBEROS]
                -filter <arg>   filter of SLA events
                -len <arg>      number of results (default '100', max '1000')
                -offset <arg>   start offset (default '0')
                -oozie <arg>    Oozie URL

      oozie pig <OPTIONS> -X <ARGS> : submit a pig job, everything after '-X' are pass-through parameters to pig
                -auth <arg>           select authentication type [SIMPLE|KERBEROS]
                -config <arg>         job configuration file '.properties'
                -D <property=value>   set/override value for given property
                -doas <arg>           doAs user, impersonates as the specified user
                -file <arg>           Pig script
                -oozie <arg>          Oozie URL

      oozie info <OPTIONS> : get more detailed info about specific topics
                 -timezones   display a list of available time zones

      oozie mapreduce <OPTIONS> : submit a mapreduce job
                      -auth <arg>           select authentication type [SIMPLE|KERBEROS]
                      -config <arg>         job configuration file '.properties'
                      -D <property=value>   set/override value for given property
                      -doas <arg>           doAs user, impersonates as the specified user
                      -oozie <arg>          Oozie URL


# Running a job
> Pass the path of the local job.properties file: examples/apps/http-download/job.properties
oozie job -oozie http://master:11000/oozie -config examples/apps/http-download/job.properties -run

[hadoop@master /oozie-3.3.2]$ oozie job -config examples/apps/http-download/job.properties -run
job: 0000000-140217141522326-oozie-hdad-C
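
A script that submits the job usually needs the returned ID for later `-info` or `-suspend` calls. Below is a minimal sketch that assumes the CLI prints a `job: <id>` line as in the transcript above; `extract_job_id` and `run_job` are hypothetical helper names.

```python
import re
import subprocess


def extract_job_id(stdout):
    """Return the ID from a 'job: <id>' line, or None if there is none."""
    match = re.search(r"^job:\s*(\S+)\s*$", stdout, re.MULTILINE)
    return match.group(1) if match else None


def run_job(properties_path):
    """Submit and start a job with `oozie job -config <file> -run`.

    Relies on OOZIE_URL being set, because -oozie is not passed here.
    """
    result = subprocess.run(
        ["oozie", "job", "-config", properties_path, "-run"],
        capture_output=True, text=True, check=True,
    )
    return extract_job_id(result.stdout)
```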

# Querying a submitted job
[hadoop@master /oozie-3.3.2]$ oozie job -info 0000000-140217141522326-oozie-hdad-C
Job ID : 0000000-140217141522326-oozie-hdad-C
------------------------------------------------------------------------------------------------------------------------------------
Job Name    : http-download
App Path    : hdfs://master:8020/home02/hadoop/var/examples/apps/http-download
Status      : RUNNING
Start Time  : 2014-02-16 00:00 GMT
End Time    : 2016-11-29 00:00 GMT
Pause Time  : -
Concurrency : 1
------------------------------------------------------------------------------------------------------------------------------------
ID                                         Status    Ext ID                               Err Code  Created              Nominal Time        
0000000-140217141522326-oozie-hdad-C@1     KILLED    0000001-140217141522326-oozie-hdad-W -         2014-02-17 22:33 GMT 2014-02-16 00:00 GMT
------------------------------------------------------------------------------------------------------------------------------------
0000000-140217141522326-oozie-hdad-C@2     KILLED    0000002-140217141522326-oozie-hdad-W -         2014-02-17 22:33 GMT 2014-02-16 00:05 GMT
------------------------------------------------------------------------------------------------------------------------------------
0000000-140217141522326-oozie-hdad-C@3     KILLED    0000003-140217141522326-oozie-hdad-W -         2014-02-17 22:33 GMT 2014-02-16 00:10 GMT
------------------------------------------------------------------------------------------------------------------------------------
0000000-140217141522326-oozie-hdad-C@4     KILLED    0000004-140217141522326-oozie-hdad-W -         2014-02-17 22:33 GMT 2014-02-16 00:15 GMT
------------------------------------------------------------------------------------------------------------------------------------
0000000-140217141522326-oozie-hdad-C@5     KILLED    0000005-140217141522326-oozie-hdad-W -         2014-02-17 22:33 GMT 2014-02-16 00:20 GMT
------------------------------------------------------------------------------------------------------------------------------------
0000000-140217141522326-oozie-hdad-C@6     KILLED    0000006-140217141522326-oozie-hdad-W -         2014-02-17 22:33 GMT 2014-02-16 00:25 GMT
------------------------------------------------------------------------------------------------------------------------------------
0000000-140217141522326-oozie-hdad-C@7     KILLED    0000007-140217141522326-oozie-hdad-W -         2014-02-17 22:33 GMT 2014-02-16 00:30 GMT
------------------------------------------------------------------------------------------------------------------------------------
0000000-140217141522326-oozie-hdad-C@8     KILLED    0000008-140217141522326-oozie-hdad-W -         2014-02-17 22:33 GMT 2014-02-16 00:35 GMT
------------------------------------------------------------------------------------------------------------------------------------
0000000-140217141522326-oozie-hdad-C@9     KILLED    0000009-140217141522326-oozie-hdad-W -         2014-02-17 22:33 GMT 2014-02-16 00:40 GMT
------------------------------------------------------------------------------------------------------------------------------------
0000000-140217141522326-oozie-hdad-C@10    KILLED    0000010-140217141522326-oozie-hdad-W -         2014-02-17 22:33 GMT 2014-02-16 00:45 GMT
------------------------------------------------------------------------------------------------------------------------------------
0000000-140217141522326-oozie-hdad-C@11    KILLED    0000011-140217141522326-oozie-hdad-W -         2014-02-17 22:33 GMT 2014-02-16 00:50 GMT
------------------------------------------------------------------------------------------------------------------------------------
0000000-140217141522326-oozie-hdad-C@12    KILLED    0000012-140217141522326-oozie-hdad-W -         2014-02-17 22:33 GMT 2014-02-16 00:55 GMT
------------------------------------------------------------------------------------------------------------------------------------
0000000-140217141522326-oozie-hdad-C@13    KILLED    0000013-140217141522326-oozie-hdad-W -         2014-02-17 22:35 GMT 2014-02-16 01:00 GMT
------------------------------------------------------------------------------------------------------------------------------------
0000000-140217141522326-oozie-hdad-C@14    KILLED    0000014-140217141522326-oozie-hdad-W -         2014-02-17 22:35 GMT 2014-02-16 01:05 GMT
------------------------------------------------------------------------------------------------------------------------------------
0000000-140217141522326-oozie-hdad-C@15    KILLED    0000015-140217141522326-oozie-hdad-W -         2014-02-17 22:35 GMT 2014-02-16 01:10 GMT
------------------------------------------------------------------------------------------------------------------------------------
0000000-140217141522326-oozie-hdad-C@16    KILLED    0000016-140217141522326-oozie-hdad-W -         2014-02-17 22:35 GMT 2014-02-16 01:15 GMT
------------------------------------------------------------------------------------------------------------------------------------
0000000-140217141522326-oozie-hdad-C@17    KILLED    0000017-140217141522326-oozie-hdad-W -         2014-02-17 22:35 GMT 2014-02-16 01:20 GMT
------------------------------------------------------------------------------------------------------------------------------------
0000000-140217141522326-oozie-hdad-C@18    RUNNING   0000018-140217141522326-oozie-hdad-W -         2014-02-17 22:35 GMT 2014-02-16 01:25 GMT
------------------------------------------------------------------------------------------------------------------------------------
0000000-140217141522326-oozie-hdad-C@19    READY     -                                    -         2014-02-17 22:35 GMT 2014-02-16 01:30 GMT
------------------------------------------------------------------------------------------------------------------------------------
0000000-140217141522326-oozie-hdad-C@20    READY     -                                    -         2014-02-17 22:35 GMT 2014-02-16 01:35 GMT
------------------------------------------------------------------------------------------------------------------------------------
0000000-140217141522326-oozie-hdad-C@21    READY     -                                    -         2014-02-17 22:35 GMT 2014-02-16 01:40 GMT
------------------------------------------------------------------------------------------------------------------------------------
0000000-140217141522326-oozie-hdad-C@22    READY     -                                    -         2014-02-17 22:35 GMT 2014-02-16 01:45 GMT
------------------------------------------------------------------------------------------------------------------------------------
0000000-140217141522326-oozie-hdad-C@23    READY     -                                    -         2014-02-17 22:35 GMT 2014-02-16 01:50 GMT
------------------------------------------------------------------------------------------------------------------------------------
0000000-140217141522326-oozie-hdad-C@24    READY     -                                    -         2014-02-17 22:35 GMT 2014-02-16 01:55 GMT
------------------------------------------------------------------------------------------------------------------------------------
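
With this many actions, a status summary is easier to read than the raw listing. The sketch below counts actions per status from the text printed by `oozie job -info`. It assumes the column layout shown above, where action rows start with `<coordinator id>@<n>` and the status is the second column; `count_action_statuses` is a hypothetical helper name.

```python
from collections import Counter


def count_action_statuses(info_output):
    """Count coordinator actions per status in `oozie job -info` output."""
    counts = Counter()
    for line in info_output.splitlines():
        fields = line.split()
        # Action rows start with "<coordinator id>@<n>"; header lines
        # and separator rules do not contain "@" in the first field.
        if len(fields) >= 2 and "@" in fields[0]:
            counts[fields[1]] += 1
    return counts
```

For the listing above this comes to 17 KILLED, 1 RUNNING and 6 READY actions. The help text earlier in this post also lists a `-filter status=<S1>` option for `oozie job -info`, which narrows the listing on the server side.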


# Listing running coordinators
[hadoop@master /oozie-3.3.2]$ oozie jobs -jobtype coordinator
Job ID                                   App Name       Status    Freq Unit         Started                 Next Materialized      
------------------------------------------------------------------------------------------------------------------------------------
0000000-140217181224112-oozie-hdad-C     http-download-hoursRUNNING   60   MINUTE       2014-02-17 18:00 GMT    2014-02-17 21:00 GMT   
------------------------------------------------------------------------------------------------------------------------------------
0000000-140217141522326-oozie-hdad-C     http-download  RUNNING   5    MINUTE       2014-02-16 00:00 GMT    2014-02-16 12:00 GMT   
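------------------------------------------------------------------------------------------------------------------------------------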

# Suspending a running coordinator
[hadoop@master /oozie-3.3.2]$ oozie job -suspend 0000000-140217141522326-oozie-hdad-C

[hadoop@master /oozie-3.3.2]$ oozie jobs -jobtype coordinator
Job ID                                   App Name       Status    Freq Unit         Started                 Next Materialized      
------------------------------------------------------------------------------------------------------------------------------------
0000000-140217181224112-oozie-hdad-C     http-download-hoursRUNNING   60   MINUTE       2014-02-17 18:00 GMT    2014-02-17 21:00 GMT   
------------------------------------------------------------------------------------------------------------------------------------
0000000-140217141522326-oozie-hdad-C     http-download  SUSPENDED 5    MINUTE       2014-02-16 00:00 GMT    2014-02-16 12:00 GMT   
------------------------------------------------------------------------------------------------------------------------------------
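
In the listing, a long application name runs into the Status column (`http-download-hoursRUNNING`), so splitting on whitespace is not enough. The sketch below works around that with a regular expression. The status list is an assumption covering only a few coordinator states, and `parse_coordinator_rows` is a hypothetical helper name.

```python
import re

# Assumed subset of coordinator job states; extend as needed.
_STATUSES = ("PREP", "RUNNING", "SUSPENDED", "SUCCEEDED", "KILLED", "FAILED")

_ROW = re.compile(
    r"^(?P<id>\S+-C)\s+"
    r"(?P<name>.+?)\s*"
    r"(?P<status>" + "|".join(_STATUSES) + r")\s+"
    r"(?P<freq>\d+)\s+(?P<unit>[A-Z]+)\s"
)


def parse_coordinator_rows(listing):
    """Parse rows of `oozie jobs -jobtype coordinator` output.

    The name is matched lazily and the status is anchored on the
    frequency column that follows it, so a name fused with the
    status is still split correctly.
    """
    rows = []
    for line in listing.splitlines():
        match = _ROW.match(line)
        if match:
            rows.append(match.groupdict())
    return rows
```

An application name that itself ends in one of the status words would confuse this pattern, so treat it as a convenience for ad hoc scripts only.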


# coordinator.xml
> Frequency and Time-Period Representation

EL Constant               Value      Example
${coord:minutes(int n)}   n          ${coord:minutes(45)} --> 45
${coord:hours(int n)}     n * 60     ${coord:hours(3)} --> 180
${coord:days(int n)}      variable   ${coord:days(2)} --> minutes in 2 full days from the current date
${coord:months(int n)}    variable   ${coord:months(1)} --> minutes in a 1 full month from the current date
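
The fixed-length rows of this table are plain arithmetic, and together with the start time they determine the nominal time of each action. The Python sketch below mirrors only the `minutes` and `hours` rows. The `days` and `months` values are variable and are deliberately left out, and all function names are hypothetical.

```python
from datetime import datetime, timedelta


def coord_minutes(n):
    """Value of ${coord:minutes(n)}: n minutes."""
    return n


def coord_hours(n):
    """Value of ${coord:hours(n)}: n * 60 minutes."""
    return n * 60


def nominal_time(start, frequency_minutes, action_number):
    """Nominal time of the 1-based action for a fixed-minute frequency."""
    return start + timedelta(minutes=frequency_minutes * (action_number - 1))
```

For the 5-minute `http-download` coordinator that starts at 2014-02-16 00:00 GMT, `nominal_time(datetime(2014, 2, 16), 5, 13)` gives 2014-02-16 01:00, which matches action @13 in the `-info` listing earlier in this post.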

<!-- end: end date of the job -->
<coordinator-app name="http-download-hours"
    frequency="${coord:hours(1)}"
    start="2014-02-17T18:00Z"
    end="2016-11-29T00:00Z"
    timezone="UTC"
    xmlns="uri:oozie:coordinator:0.1">

  <controls>
    <concurrency>1</concurrency><!-- number of workflows that may run at the same time -->
  </controls>

  <action>
    <workflow>
      <app-path>${nameNode}/home02/${coord:user()}/var/examples/apps/http-download</app-path>
      <configuration>
        <property>
          <name>jobTracker</name>
          <value>${jobTracker}</value>
        </property>
        <property>
          <name>nameNode</name>
          <value>${nameNode}</value>
        </property>
        <property>
          <name>queueName</name>
          <value>${queueName}</value>
        </property>
        <property>
          <name>inputData</name>
           <!-- input file name -->
          <value>${nameNode}/home02/${coord:user()}/var/examples/apps/http-download/input/input-urls.txt</value>
        </property>
        <property>
          <name>outputData</name><!-- output directory name -->
          <value>${nameNode}/home02/${coord:user()}/var/examples/apps/http-download/output/${coord:formatTime(coord:nominalTime(), "yyyy/MM/dd")}</value>
        </property>
      </configuration>
    </workflow>
  </action>
</coordinator-app>
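
To see what the `outputData` expression resolves to, the sketch below rebuilds the path in Python. It assumes that the "yyyy/MM/dd" pattern corresponds to the strftime format "%Y/%m/%d" and that `${coord:user()}` is the submitting user. `output_dir` is a hypothetical helper, not an Oozie function.

```python
from datetime import datetime


def output_dir(name_node, user, nominal):
    """Rebuild the outputData path for a given nominal time."""
    # "%Y/%m/%d" is the strftime counterpart of the "yyyy/MM/dd" pattern.
    day = nominal.strftime("%Y/%m/%d")
    return (
        f"{name_node}/home02/{user}/var/examples/apps/"
        f"http-download/output/{day}"
    )
```

One consequence worth checking in your own setup: with an hourly frequency and a per-day pattern, every run on the same day resolves to the same directory, and the workflow in the next section deletes `${outputData}` in its prepare step before each run.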


# workflow.xml
<workflow-app xmlns="uri:oozie:workflow:0.1" name="download-http">
  <start to="download-http"/>
  <action name="download-http">
    <map-reduce>
      <job-tracker>${jobTracker}</job-tracker>
      <name-node>${nameNode}</name-node>
      <prepare>
         <!-- delete the output directory before the job runs -->
        <delete path="${outputData}"/>
      </prepare>
      <configuration>
        <property>
          <name>mapred.job.queue.name</name>
          <value>${queueName}</value>
        </property>
        <property>
          <name>mapred.mapper.class</name>
          <value>com.manning.hip.ch2.HttpDownloadMap</value><!-- map class -->
        </property>
        <property>
          <name>mapred.mapoutput.key.class</name>
          <value>org.apache.hadoop.io.Text</value>
        </property>
        <property>
          <name>mapred.mapoutput.value.class</name>
          <value>org.apache.hadoop.io.Text</value>
        </property>
        <property>
          <name>mapred.map.tasks</name>
          <value>1</value>
        </property>
        <property>
          <name>mapred.reduce.tasks</name>
          <value>0</value>
        </property>
        <property>
          <name>mapred.input.dir</name>
          <value>${inputData}</value><!-- input directory of the job -->
        </property>
        <property>
          <name>mapred.output.dir</name>
          <value>${outputData}</value><!-- output directory of the job -->
        </property>
      </configuration>
    </map-reduce>
    <ok to="end"/>
    <error to="fail"/>
  </action>
  <kill name="fail">
    <message>Map/Reduce failed, error
      message[${wf:errorMessage(wf:lastErrorNode())}]
    </message>
  </kill>
  <end name="end"/>
</workflow-app>
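
The control flow of this workflow is short: start, one map-reduce action, then `end` on success or `fail` on error. As a quick check before uploading the file to HDFS, the transitions can be listed with Python's standard XML parser. This is only a sketch: `workflow_transitions` is a hypothetical helper, and `SAMPLE` is a trimmed copy of the workflow above with the action body removed.

```python
import xml.etree.ElementTree as ET

NS = {"wf": "uri:oozie:workflow:0.1"}


def workflow_transitions(xml_text):
    """Return (from, to, label) tuples for start, ok and error transitions."""
    root = ET.fromstring(xml_text)
    edges = []
    start = root.find("wf:start", NS)
    if start is not None:
        edges.append(("start", start.get("to"), "start"))
    for action in root.findall("wf:action", NS):
        for label in ("ok", "error"):
            node = action.find(f"wf:{label}", NS)
            if node is not None:
                edges.append((action.get("name"), node.get("to"), label))
    return edges


# Trimmed copy of the workflow above, for illustration only.
SAMPLE = """<workflow-app xmlns="uri:oozie:workflow:0.1" name="download-http">
  <start to="download-http"/>
  <action name="download-http">
    <map-reduce/>
    <ok to="end"/>
    <error to="fail"/>
  </action>
  <kill name="fail"><message>failed</message></kill>
  <end name="end"/>
</workflow-app>"""
```

Parsing the file this way also catches XML that is not well-formed. For schema-level checks, the help output earlier in this post lists `oozie validate`.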


# job.properties
nameNode=hdfs://master:8020
jobTracker=master:8021
queueName=default

# Use the Oozie coordinator
# HDFS location of the coordinator.xml and workflow.xml files
oozie.coord.application.path=${nameNode}/home02/hadoop/var/examples/apps/http-download

# Run the workflow directly, without a coordinator
#oozie.wf.application.path=${nameNode}/home02/hadoop/var/examples/apps/http-download
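
The `${nameNode}` reference in the application path is resolved from the properties defined above it. The sketch below imitates that substitution in Python so the final HDFS path can be checked before submitting. It is a simplification under stated assumptions: a single substitution pass, no escaping rules, and hypothetical helper names `load_properties` and `resolve`.

```python
import re

_REF = re.compile(r"\$\{([^}]+)\}")


def load_properties(text):
    """Parse simple key=value lines, skipping blanks and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def resolve(props):
    """Replace ${name} references with values from the same file.

    Unknown references are left untouched.
    """
    def expand(value):
        return _REF.sub(lambda m: props.get(m.group(1), m.group(0)), value)

    return {key: expand(value) for key, value in props.items()}
```

Applied to the file above, `oozie.coord.application.path` resolves to `hdfs://master:8020/home02/hadoop/var/examples/apps/http-download`, the same App Path that `oozie job -info` reported earlier.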



※ The notes above are compiled from various references and my own reading.
   If you spot incorrect information or something that needs more detail, a comment or an email would be a great help.