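Hive has supported ACID (Atomicity, Consistency, Isolation, Durability) transactions since version 0.13,
and DML including UPDATE and DELETE since version 0.14.

After changing the hive-site.xml settings as shown below and restarting the Hive session, DML operations can be run.
Note that this is supported only for tables stored in the ORC file format.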

hive-site.xml
hive.support.concurrency=true
hive.enforce.bucketing=true
hive.exec.dynamic.partition.mode=nonstrict
hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager
hive.compactor.initiator.on=true
hive.compactor.worker.threads=2
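
The key=value list above is shorthand; hive-site.xml itself expects <property> elements. As a minimal sketch (Python is used only for illustration, and the helper name render_properties is hypothetical, not part of Hive), the list can be converted into that XML form like this:

```python
from xml.sax.saxutils import escape

# The key=value settings exactly as listed in this post.
SETTINGS = """\
hive.support.concurrency=true
hive.enforce.bucketing=true
hive.exec.dynamic.partition.mode=nonstrict
hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager
hive.compactor.initiator.on=true
hive.compactor.worker.threads=2
"""


def render_properties(text):
    """Turn key=value lines into hive-site.xml <property> elements."""
    blocks = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # Split on the first '=' only, so values may contain '='.
        name, value = line.split("=", 1)
        blocks.append(
            "<property>\n"
            f"  <name>{escape(name.strip())}</name>\n"
            f"  <value>{escape(value.strip())}</value>\n"
            "</property>"
        )
    return "\n".join(blocks)


if __name__ == "__main__":
    print(render_properties(SETTINGS))
```

The printed elements still have to be pasted inside the existing <configuration> element of conf/hive-site.xml by hand.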

Test Table Script
-- Create the table
CREATE TABLE student
(
std_id INT,
std_name STRING,
age INT,
address STRING
)
CLUSTERED BY (address) into 3 buckets
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
STORED as ORC tblproperties('transactional'='true');

-- Insert sample data
INSERT INTO TABLE student VALUES
(101,'JAVACHAIN',30,'PAUL REVERE RD'),
(102,'ANTO',18,'29 NATHAN HALE'),
(103,'PRABU',23,'34 henry road'),
(104,'KUMAR',24,'gandhi road'),
(105,'jack',35,'Modi street');

-- Query
SELECT * FROM student;

-- Update
UPDATE student SET std_id = 110 WHERE std_id = 105;

-- Delete (the row that had std_id 105 now has std_id 110 after the UPDATE)
DELETE FROM student WHERE std_id = 110;




Using the transaction functionality, with an example.

Once you have installed and configured Hive, create a simple table:

hive>create table testTable(id int,name string)row format delimited fields terminated by ',';
Then try to insert a few rows into the test table:
hive>insert into table testTable values (1,'row1'),(2,'row2');
Now try to delete one of the records you just inserted:
hive>delete from testTable where id = 1;
Error!
FAILED: SemanticException [Error 10294]: Attempt to do update or delete using transaction manager that does not support these operations.

By default, transactions are turned off, and the default transaction manager does not support update or delete operations. To support update/delete, you must change the following configuration.

        cd  $HIVE_HOME
        vi conf/hive-site.xml

 <property>
  <name>hive.support.concurrency</name>
  <value>true</value>
 </property>
 <property>
  <name>hive.enforce.bucketing</name>
  <value>true</value>
 </property>
 <property>
  <name>hive.exec.dynamic.partition.mode</name>
  <value>nonstrict</value>
 </property>
 <property>
  <name>hive.txn.manager</name>
  <value>org.apache.hadoop.hive.ql.lockmgr.DbTxnManager</value>
 </property>
 <property>
  <name>hive.compactor.initiator.on</name>
  <value>true</value>
 </property>
 <property>
  <name>hive.compactor.worker.threads</name>
  <value>2</value>
 </property>
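
To confirm that an edited hive-site.xml really contains these values, a small check can help. The following is only a sketch under stated assumptions, not part of Hive: check_hive_site, REQUIRED, and SAMPLE are hypothetical names, the sample file content is made up for the demonstration, and values are compared exactly as written in this post (any other worker thread count would be reported as a difference).

```python
import xml.etree.ElementTree as ET

# Settings and values taken from this post.
REQUIRED = {
    "hive.support.concurrency": "true",
    "hive.enforce.bucketing": "true",
    "hive.exec.dynamic.partition.mode": "nonstrict",
    "hive.txn.manager": "org.apache.hadoop.hive.ql.lockmgr.DbTxnManager",
    "hive.compactor.initiator.on": "true",
    "hive.compactor.worker.threads": "2",
}


def check_hive_site(xml_text):
    """Return {name: (expected, found)} for settings that are missing or differ."""
    root = ET.fromstring(xml_text)
    found = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        if name is not None:
            found[name.strip()] = (prop.findtext("value") or "").strip()
    return {
        name: (expected, found.get(name))
        for name, expected in REQUIRED.items()
        if found.get(name) != expected
    }


# Made-up sample: one setting is correct, one is wrong, four are missing.
SAMPLE = """<configuration>
 <property>
  <name>hive.support.concurrency</name>
  <value>true</value>
 </property>
 <property>
  <name>hive.txn.manager</name>
  <value>org.apache.hadoop.hive.ql.lockmgr.DummyTxnManager</value>
 </property>
</configuration>"""

if __name__ == "__main__":
    for name, (expected, actual) in check_hive_site(SAMPLE).items():
        print(f"{name}: expected {expected!r}, found {actual!r}")
```

To check a real installation, pass the text of your own conf/hive-site.xml to check_hive_site instead of SAMPLE.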


Restart the service and then try the delete command again:

Error!
FAILED: LockException [Error 10280]: Error communicating with the metastore.
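There is a problem communicating with the metastore. To use insert/update/delete operations, you need to add the following setting to conf/hive-site.xml, because the feature is still under development. Note that hive.in.test is an internal, test-oriented flag, so treat it as a workaround for experimenting rather than a production setting.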
 <property>
  <name>hive.in.test</name>
  <value>true</value>
 </property>
Restart the service and then run the delete command again:
hive>delete from testTable where id = 1;
Error!
FAILED: SemanticException [Error 10297]: Attempt to do update or delete on table default.testTable that does not use an AcidOutputFormat or is not bucketed. 
Only ORC file format is supported in this first release.  The feature has been built such that transactions can be used by any storage format that can determine how updates or deletes apply to base records (basically, that has an explicit or implicit row id), but so far the integration work has only been done for ORC.

Tables must be bucketed to make use of these features.  Tables in the same system not using transactions and ACID do not need to be bucketed.
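
The three requirements behind Error 10297 (ORC storage, bucketing, and the 'transactional'='true' table property) can be checked against a CREATE TABLE statement before running it. The sketch below is a rough, regex-based heuristic and not a real HiveQL parser; acid_ddl_problems is a hypothetical helper name, and the two statements are the ones used in this tutorial.

```python
import re


def acid_ddl_problems(ddl):
    """List the ACID table requirements that a CREATE TABLE statement misses."""
    # Lower-case and collapse whitespace so line breaks do not matter.
    text = " ".join(ddl.lower().split())
    problems = []
    if not re.search(r"stored as orc\b", text):
        problems.append("not stored as ORC")
    if not re.search(r"clustered by \(.+?\) into \d+ buckets", text):
        problems.append("not bucketed")
    if not re.search(r"'transactional'\s*=\s*'true'", text):
        problems.append("missing 'transactional'='true'")
    return problems


# The plain table created at the start of the tutorial.
PLAIN = (
    "create table testTable(id int,name string)"
    "row format delimited fields terminated by ',';"
)

# A table that meets all three requirements.
ACID = """create table testTableNew(id int ,name string ) clustered by (id) into 2 buckets
         stored as orc TBLPROPERTIES('transactional'='true');"""

if __name__ == "__main__":
    print("testTable:", acid_ddl_problems(PLAIN))
    print("testTableNew:", acid_ddl_problems(ACID))
```

An empty list only means that the statement mentions all three requirements; Hive itself remains the authority on whether the table is accepted.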

The example below creates a table stored as ORC, with bucketing enabled and ('transactional'='true').
hive>create table testTableNew(id int ,name string ) clustered by (id) into 2 buckets 
         stored as orc TBLPROPERTIES('transactional'='true');
Insert:
hive>insert into table testTableNew values (1,'row1'),(2,'row2'),(3,'row3');
Update:
hive>update testTableNew set name = 'updateRow2' where id = 2;
Delete:
hive>delete from testTableNew where id = 1;
Test:
hive>select * from testTableNew;

At this point, the query should return only 2 records from testTableNew, with ids 2 and 3, and the record with id 2 should have the updated name.
If everything ran without errors, your Hive setup is up and running with transactions enabled for your experiments.
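
The expected end state can be double-checked without a Hive cluster. In the sketch below, SQLite (from Python's standard library) stands in for Hive purely to confirm the row-level result of the insert, update, and delete statements. This is an assumption for illustration only: SQLite is not Hive, the Hive-specific clauses (bucketing, ORC, table properties) are left out, and run_demo is a hypothetical name.

```python
import sqlite3


def run_demo():
    """Replay the tutorial's insert/update/delete on an in-memory SQLite table."""
    conn = sqlite3.connect(":memory:")
    # Hive-only clauses are omitted; only the columns are kept.
    conn.execute("create table testTableNew(id int, name text)")
    conn.execute(
        "insert into testTableNew values (1,'row1'),(2,'row2'),(3,'row3')"
    )
    conn.execute("update testTableNew set name = 'updateRow2' where id = 2")
    conn.execute("delete from testTableNew where id = 1")
    rows = conn.execute("select * from testTableNew order by id").fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for row in run_demo():
        print(row)
```

If the Hive query returns the same two rows, ids 2 and 3 with the updated name on id 2, the transactional table is behaving as described.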

I hope this tutorial helped you set up Hive with transactional support. Stay tuned for more tutorials on Apache Hive.

For more, refer to Hive+Transactions.


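※ The content above was compiled from various references and from my own subjective summary.
   If you find incorrect information or anything that needs improving, please let me know by comment or email; it would be a great help.
December 19, 2015 18:56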

