Wednesday 10 July 2013

Parallel Execution of Subjobs in Talend Open Studio

We often need to run a few jobs or subjobs in parallel to maximize performance and reduce the overall job execution time. However, Talend doesn't execute subjobs in parallel automatically. For example, if a Job loads two different tables from two different files and there is no dependency between the two loads, Talend will still not run them in parallel. It runs one of the subjobs first (the order is arbitrary) and starts the second subjob only after the first has finished.

Let's take the example of a simple Talend Job with two independent subjobs, as shown in the screenshot of the Job design below.



Both of these subjobs generate 5 sample records and print the following messages:


First subjob prints: Generated from 1st tRowGenerator component
Second subjob prints: Generated from 2nd tRowGenerator component

tJavaRow_1 code
System.out.println("Generated from 1st tRowGenerator component");


tJavaRow_2 code
System.out.println("Generated from 2nd tRowGenerator component");


Let's run this Job and observe the execution behaviour of these subjobs.


From the above screenshot it is clear that Talend does not execute these independent subjobs in parallel. They run one after the other, in an arbitrary order.


Now let's configure the Job to run these independent subjobs in parallel.


Step 1. From the menu, click Window, select Show View, and then select the Job view under the Talend folder.



Step 2. Navigate to the Extra tab in the Job view and check the Multi thread execution checkbox to enable these subjobs to run in parallel.



Now let's run this Job again and observe the difference in execution behaviour.


From the above screenshot it is clear that these independent subjobs are now executing in parallel.
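
To make the difference between the two runs concrete, here is a minimal sketch in plain Java. It is only an illustration and not the code that Talend generates: the class ParallelSubjobsSketch and the methods subjob, runSequentially and runInParallel are hypothetical names, and it assumes that each subjob can be modelled as a Runnable that prints five messages.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class ParallelSubjobsSketch {

    // Hypothetical stand-in for one subjob: prints five messages,
    // similar to tRowGenerator feeding a tJavaRow, and records them in a log.
    public static Runnable subjob(String label, List<String> log) {
        return () -> {
            for (int i = 0; i < 5; i++) {
                String message = "Generated from " + label + " tRowGenerator component";
                System.out.println(message);
                log.add(message);
            }
        };
    }

    // Sequential run: the second subjob starts only after the first has finished.
    // (Here the 1st subjob always goes first; in Talend the order is not under our control.)
    public static List<String> runSequentially() {
        List<String> log = Collections.synchronizedList(new ArrayList<>());
        subjob("1st", log).run();
        subjob("2nd", log).run();
        return log;
    }

    // Parallel run: each subjob gets its own thread, so the messages may interleave.
    public static List<String> runInParallel() {
        List<String> log = Collections.synchronizedList(new ArrayList<>());
        Thread first = new Thread(subjob("1st", log));
        Thread second = new Thread(subjob("2nd", log));
        first.start();
        second.start();
        try {
            // Wait for both subjobs before treating the job as finished.
            first.join();
            second.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting for subjobs", e);
        }
        return log;
    }

    public static void main(String[] args) {
        System.out.println("--- sequential ---");
        runSequentially();
        System.out.println("--- parallel ---");
        runInParallel();
    }
}
```

In the sequential part, all messages from the 1st subjob appear before any message from the 2nd. In the parallel part, the order of the messages can change from one run to the next, because neither thread waits for the other.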


Hence, whenever we need to run a few subjobs in parallel, e.g. loading different files into different tables or extracting data from different tables into different files, we can enable multi thread execution to run the subjobs in parallel.


This article is written by "Vikram Takkar" and published on www.vikramtakkar.com. Please let me know if you see this article on any other website/blog.
