Make Batch count for Incremental loading in Talend (TOS_5.3)

Posted on by By Nikhilesh, in Business Intelligence, ETL, Talend | 0

This blog will talk about how to make Batch count for Incremental loading in Talend (TOS_5.3).

First all we have t_source and t_target tables
Both Tables have (t_source,t_target) data like this….

before_execute_job

Objective

INSERT into t_source

We inserted one record into t_source
Insert into t_source(id,name,city,s_date) values (111,’vic’,’del’,’2014-03-01 01:02:00′)
UPDATE from t_source
We updated from t_source
Update t_source set name=’don’,s_date=’2014-02-01 01:02:00′ where id = 109;
DELETE from t_source
We deleted from t_source
Delete from t_source where id = 108;

Finally we have records from t_source table and t_target tables

update_Tsource

We want make Batch count in TALEND(TOS)

We created one job…

test

Details of Job

Subjob (1)

We are fetched max(id) from t_target table and we updated into property variable

context.last_id = input_row.id;

Subjob (2)

We are fetching min (id) and max (id) from t_source and we updated into property variables

context.sr_max_id = input_row.max_id;

context.sr_min_id = input_row.min_id;

Subjob (3)

we are selecting from t_source

select * from t_source where  id > “+context.last_id+” order by id

and insert into t_target table by primary key is id

Subjob(4)

we need to count between primary key from t_source

select count(*) as batch_count from t_source where id between “+context.sr_min_id+” and “+context.sr_max_id+”

and updated into property variable. We want to calculate Batch count

We will define by divide count (context.MT_COUNT = 5)  . context.max_count, context.min_count is 0 before Execution of job.

context.count = input_row.count;

System.out.println(“Count of primary key from source “+context.UPLOAD_FILE_NAME+” Table : “+context.count);

 

int x = (context.count / context.MT_COUNT) + 3;

context.batch_count = x;

System.out.println(“Batch Count : “+context.batch_count);

context.max_count = context.MT_COUNT;

 

context.min_count = context.sr_min_id ;

context.max_count = context.sr_min_id + context.max_count;

SubJob (5)

We will iterate by context.batch_count. We have another job(test123) by Iterating.

1.Test123 Job

test123a.       SubJob(5.1)

We are printing Batch count min_count to max_count

System.out.println(“Batch “+Numeric.sequence(“s1″,1,1)+”: Count of “+context.min_count+” to “+context.max_count);    

b.      Subjob(5.2)

We are selecting from t_source between primary key

select * from t_source  where id >= “+context.min_count+” and id <= “+context.max_count+” order by id

and collects data into Buffer output

c.       SubJob (5.3)

We compared by inner join from Buffer input(t_source) and t_target tables in tmap. If any reject output will be there then updated into t_target.

T_target sql query: select * from t_target  where id >= “+context.min_count+” and id <= “+context.max_count+” order by id

d.      SubJob (5.4)

We compared by left outer join from t_target and Buffer input(t_source) in tmap. We filtered t_source.id == 0 and if any output is there then deleted

T_target sql query: select * from t_target  where id >= “+context.min_count+” and id <= “+context.max_count+” order by id

And we have t_javarow(Min, Max of  p_key)

In that,

context.min_count = input_row.id;

context.max_count = context.min_count + context.MT_COUNT;

Results

We Executed the job by defined (property variables)MT_COUNT = 5.

extecion

 

Finally we have records from t_source and t_target.

afterexecited

Thanks & regards

Vishwanth suraparaju

Senior ETL Developer

logo

Best Open Source Business Intelligence Software Helical Insight is Here

logo

A Business Intelligence Framework

0 0 votes
Article Rating
Subscribe
Notify of
0 Comments
Inline Feedbacks
View all comments