Monday 25 February 2013

Read XML having Multiple Nested Loops using Talend Open Studio

Today, I am going to demonstrate how to read an XML file that has multiple nested loops.
Reading an XML file is a little more difficult than reading flat files such as Excel or CSV. Many people have questions about, or do not properly know, how to read an XML file that has multiple nested loops.

For demonstration, I have created a sample XML file where tags are repeating in nested loops.

In the XML below, the <EMPLOYEE> tag repeats for every employee, the <YEARLY> child tag appears for every year the employee has worked, and the <MONTH> tag (a child of <YEARLY>) repeats for every month and contains the monthly salary.
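To make the structure concrete, here is a minimal sketch of a file with this shape, embedded in a short Python script that prints the nesting of the three repeating tags. The root name EMPLOYEES, all the values, and the choice to store year, year_sal, month_id and month_sal as attributes are assumptions made for illustration; the actual sample file may differ.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: the root name, the values, and the choice of
# attributes vs. child elements are assumptions, not the original file.
SAMPLE_XML = """
<EMPLOYEES>
  <EMPLOYEE>
    <EMP_ID>1</EMP_ID>
    <EMP_NAME>Asha</EMP_NAME>
    <YEARLY year="2011" year_sal="2400">
      <MONTH month_id="1" month_sal="1200"/>
      <MONTH month_id="2" month_sal="1200"/>
    </YEARLY>
    <YEARLY year="2012" year_sal="1500">
      <MONTH month_id="1" month_sal="1500"/>
    </YEARLY>
  </EMPLOYEE>
  <EMPLOYEE>
    <EMP_ID>2</EMP_ID>
    <EMP_NAME>Ravi</EMP_NAME>
    <YEARLY year="2012" year_sal="1000">
      <MONTH month_id="1" month_sal="1000"/>
    </YEARLY>
  </EMPLOYEE>
</EMPLOYEES>
"""

REPEATING_TAGS = ("EMPLOYEE", "YEARLY", "MONTH")


def outline(element, depth=0):
    """Return one indented line per repeating tag to show the nesting."""
    lines = []
    if element.tag in REPEATING_TAGS:
        lines.append("  " * depth + element.tag)
        depth += 1
    for child in element:
        lines.extend(outline(child, depth))
    return lines


root = ET.fromstring(SAMPLE_XML)
structure = outline(root)
print("\n".join(structure))
```

Each <EMPLOYEE> line is followed by its indented <YEARLY> lines, and each of those by its <MONTH> lines, which is the three-level nesting described above.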

Input File screenshot 1

Input File screenshot 2: expand the <YEARLY> tag and you can see that the <MONTH> tag repeats.

Now, let's create a Job to read and process this kind of XML.

The first and most important step is to create metadata for this XML file. See the separate post on how to create metadata for XML files for detailed information.

Our XML file has three repeating tags (<EMPLOYEE>, <YEARLY> and <MONTH>), but while creating the metadata schema for the XML we can provide only one element as the loop expression. So which one should we provide for this file?

Whenever we have multiple nested loops like the ones in this file, always select the lowest-level child tag as the loop expression. Hence, in this example, we select the <MONTH> tag and drag it to the XPath loop expression section.
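One way to see why the lowest-level tag is the right choice is to count how many rows each candidate loop element would produce. The Python sketch below does this on a small hypothetical file (the root name EMPLOYEES, the values and the attribute layout are assumptions). It is not Talend code, only an illustration of the row counts.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data; root name, values and attribute layout are assumed.
SAMPLE_XML = """
<EMPLOYEES>
  <EMPLOYEE>
    <EMP_ID>1</EMP_ID>
    <EMP_NAME>Asha</EMP_NAME>
    <YEARLY year="2011" year_sal="2400">
      <MONTH month_id="1" month_sal="1200"/>
      <MONTH month_id="2" month_sal="1200"/>
    </YEARLY>
    <YEARLY year="2012" year_sal="1500">
      <MONTH month_id="1" month_sal="1500"/>
    </YEARLY>
  </EMPLOYEE>
  <EMPLOYEE>
    <EMP_ID>2</EMP_ID>
    <EMP_NAME>Ravi</EMP_NAME>
    <YEARLY year="2012" year_sal="1000">
      <MONTH month_id="1" month_sal="1000"/>
    </YEARLY>
  </EMPLOYEE>
</EMPLOYEES>
"""

root = ET.fromstring(SAMPLE_XML)

# One path per candidate loop element, relative to the root element.
candidates = {
    "EMPLOYEE": "./EMPLOYEE",
    "YEARLY": "./EMPLOYEE/YEARLY",
    "MONTH": "./EMPLOYEE/YEARLY/MONTH",
}

# A loop expression yields one row per matching element.
row_counts = {tag: len(root.findall(path)) for tag, path in candidates.items()}

for tag, count in row_counts.items():
    print(f"loop on <{tag}>: {count} rows")
```

In this sample, only the loop on <MONTH> produces one row per monthly salary, so it is the only choice that has room for every value in the file.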

Select all the other required fields and drag them to the “Fields to extract” section.

When we select the lowest-level tag as the loop element, all the fields of the parent tags repeat for every loop element. In our example, the values of EMP_ID, EMP_NAME, year and year_sal repeat for every <MONTH> tag.
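The repetition can be sketched outside Talend as well. The following Python snippet visits every <MONTH> element of a small hypothetical file (the root name EMPLOYEES, the values and the attribute layout are assumptions) and copies the parent fields into each row, which approximates what a loop on <MONTH> produces.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data; root name, values and attribute layout are assumed.
SAMPLE_XML = """
<EMPLOYEES>
  <EMPLOYEE>
    <EMP_ID>1</EMP_ID>
    <EMP_NAME>Asha</EMP_NAME>
    <YEARLY year="2011" year_sal="2400">
      <MONTH month_id="1" month_sal="1200"/>
      <MONTH month_id="2" month_sal="1200"/>
    </YEARLY>
    <YEARLY year="2012" year_sal="1500">
      <MONTH month_id="1" month_sal="1500"/>
    </YEARLY>
  </EMPLOYEE>
  <EMPLOYEE>
    <EMP_ID>2</EMP_ID>
    <EMP_NAME>Ravi</EMP_NAME>
    <YEARLY year="2012" year_sal="1000">
      <MONTH month_id="1" month_sal="1000"/>
    </YEARLY>
  </EMPLOYEE>
</EMPLOYEES>
"""

FIELDS = ("EMP_ID", "EMP_NAME", "year", "year_sal", "month_id", "month_sal")

root = ET.fromstring(SAMPLE_XML)
rows = []
for employee in root.findall("EMPLOYEE"):
    for yearly in employee.findall("YEARLY"):
        for month in yearly.findall("MONTH"):
            # One row per <MONTH>; the parent values are copied into each row.
            rows.append({
                "EMP_ID": employee.findtext("EMP_ID"),
                "EMP_NAME": employee.findtext("EMP_NAME"),
                "year": yearly.get("year"),
                "year_sal": yearly.get("year_sal"),
                "month_id": month.get("month_id"),
                "month_sal": month.get("month_sal"),
            })

print("|".join(FIELDS))
for row in rows:
    print("|".join(row[name] for name in FIELDS))
```

The employee and year columns hold the same values on consecutive rows, while month_id and month_sal change from row to row.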

Once the metadata for the XML is complete, drag it to the Job designer pane and select the tFileInputXML component. Also drag in a tLogRow component, which displays the records on the Run console.

Run the Job and check the output.

You can see from the output that for every month_id and month_sal, all the parent fields (EMP_ID, EMP_NAME, year, year_sal) repeat.

Now that you have all the data from the XML in the Job, you can use it any way you want.

In this article, I have shown how to read an XML file that has repeating tags in nested loops.

In the next post, I will show how to read an XML file that has a tag occurring multiple times at the same level.



4 comments:

  1. Great Article Vikram. I am also a big fan of Talend. Keep Sharing!!

    -Pankaj Khurana

  2. Please share a demo on loading XML data into an Oracle table.

    Also, if the XML data is available at a URL, we need to get it as an XML document. I tried the tFileFetch component, but it returned a file type and I need the XML document type.

    Please help me solve this.
    Thanks in advance
