Friday, 4 January 2013

Understanding the tFlowToIterate component - Talend Open Studio

Today I am going to demonstrate the tFlowToIterate component. Like tFileList, tFlowToIterate is part of the Orchestration folder in the Palette pane. tFlowToIterate reads a data flow row by row and turns it into a series of iterations: on each iteration, the column values of the current row are stored in global variables, so they can be passed to components that do not accept an input flow, such as tFileInputDelimited or tMySqlInput.
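To make the idea concrete, here is a minimal plain-Java sketch of the mechanism. It is not the code Talend generates; it only imitates the behaviour described above. The class and method names are made up for illustration, the file names are hypothetical, and the key "row1.filename" assumes the incoming link is called row1 and its column is called filename.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class FlowToIterateSketch {

    // Stand-in for Talend's globalMap: a shared key/value store.
    static final Map<String, Object> globalMap = new HashMap<>();

    // Stand-in for the subjob behind the Iterate link.
    // It takes no input flow; it reads the current value back from the map.
    static String runIteration() {
        String filename = (String) globalMap.get("row1.filename");
        return "processing " + filename;
    }

    // For every incoming row: publish the column value, then run one iteration.
    static List<String> iterate(List<String> rows) {
        List<String> log = new ArrayList<>();
        for (String filename : rows) {
            globalMap.put("row1.filename", filename);
            log.add(runIteration());
        }
        return log;
    }

    public static void main(String[] args) {
        // Hypothetical file names, for illustration only.
        List<String> rows = Arrays.asList("dept_a.csv", "dept_b.csv", "dept_c.csv");
        for (String line : iterate(rows)) {
            System.out.println(line);
        }
    }
}
```

The point of the sketch is the hand-off: the value travels through the shared map rather than through a row connection, which is why a component without an input flow can still use it.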

For demonstration, I am going to create a Job that reads a list of files from a defined input file, iterates on each of the files, selects input data and displays the output on the Run log console.



First, look at the input file, which contains the list of files.




Below is the directory structure with its files. We have three files that contain department information in the directory D:\TalendFiles\dir (refer to the screenshot below).

 

All three files have the same schema, containing departmentId and departmentName. Look at all three files below.



Our aim is to read all the files iteratively and display their contents in the Run log. To achieve this, create a new Job and perform the following steps.

1. Drag a tFileInputDelimited component from the Palette pane to the Job Designer pane.

2. Open the tFileInputDelimited component properties and enter the path of the input file that holds the list of files.

3. Click Edit schema in the component properties to provide the metadata for the input delimited file. Click the [+] (add) button to add a column, and give the column a name and type as shown below:



Click Ok.

4. Now drag a tFlowToIterate component and another tFileInputDelimited component from the Palette pane to the Job Designer.

5. Right-click the tFileInputDelimited component created in step 1, select Row > Main, and connect it to the tFlowToIterate component.

6. Right-click the tFlowToIterate component, select Row > Iterate, and connect it to the tFileInputDelimited component added in step 4. The Iterate link executes this second tFileInputDelimited component once for every row (that is, every file path) received from the tFileInputDelimited component of step 1.

7. Open the component properties of the second tFileInputDelimited component. We need to provide the name of the file dynamically, so remove the existing path from the File name/Stream text box and press Ctrl + Spacebar.

Talend will present all the global variables that we can use. Select tFlowToIterate_1.filename as shown in the screenshot below. This passes the path of the current file from the tFlowToIterate component.



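8. File name/Stream will be populated with ((String)globalMap.get("row1.filename")). Here row1 is the name of the Main link that feeds tFlowToIterate, and filename is the column of its schema. Now change the Field separator to "," because our files are comma-separated.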


9. Click Edit schema to provide the schema of the files. In the pop-up, add two columns, departmentID and departmentName, as shown in the screenshot below. Click OK.


10. Drag a tLogRow component from the Palette pane, right-click the second tFileInputDelimited component, select Row > Main, and connect it to the tLogRow component.


11. It's time to run our Job. Look at the screenshot below: tFlowToIterate received three file paths and executed the downstream components once for each file.
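For readers who prefer to think in code, the Job built in the steps above behaves roughly like the plain-Java sketch below. This is only an analogy under a few assumptions, not what Talend Open Studio generates: the class and method names are invented, the list file is assumed to hold one file path per line, and each listed file is assumed to contain comma-separated departmentID,departmentName rows with no header.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

class IterateOverFileListSketch {

    // Reads the list file (first tFileInputDelimited), then reads every
    // listed file (second tFileInputDelimited) as comma-separated rows.
    static List<String[]> readAll(Path listFile) throws IOException {
        List<String[]> rows = new ArrayList<>();
        for (String filename : Files.readAllLines(listFile)) {
            if (filename.trim().isEmpty()) {
                continue; // ignore blank lines in the list file
            }
            // One pass of this outer loop plays the role of one iteration.
            for (String line : Files.readAllLines(Paths.get(filename.trim()))) {
                if (line.trim().isEmpty()) {
                    continue; // ignore blank lines in the data file
                }
                // Split into at most two fields: departmentID and departmentName.
                rows.add(line.split(",", 2));
            }
        }
        return rows;
    }

    public static void main(String[] args) throws IOException {
        // The list file path is supplied by the caller; none is assumed here.
        if (args.length != 1) {
            System.err.println("usage: java IterateOverFileListSketch <list-file>");
            return;
        }
        // Printing each row stands in for tLogRow.
        for (String[] row : readAll(Paths.get(args[0]))) {
            System.out.println(String.join("|", row));
        }
    }
}
```

The nested loop mirrors the Job layout: the outer loop is the Iterate link, and the inner loop is the row flow from the second tFileInputDelimited to tLogRow.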



This is how we can use the tFlowToIterate component. We can also use it to pass values to database input components, which is like doing a dynamic database lookup. In the next article I will try to show how to perform a dynamic database lookup.
