Couple of months back, I have written an article to demonstrate "Transpose Rows to columns" using +Talend Open Studio. I have received several requests and queries from readers to know "How to transpose or split a record to multiple records". So today, I am going to demonstrate the same with simple example.
Lets take an example of simple excel file which contains scores of an student in multiple subject in single record as mentioned in the pic below of source file.
From the input file above we would like to have four records for each student for different subjects in the output. To split the record to multiple records Talend provides an in-built component tSplitRow which split the records from source data flow and output multiple records based on the configuration of the Column Mapping table in its component properties.
Lets look at the Job design:
In the tSplitRow define the metadata as following: Basically e want different record for every subject for every student. Hence added SUBJECT and MARKS fields as output to tSplitRow component.
Lets take an example of simple excel file which contains scores of an student in multiple subject in single record as mentioned in the pic below of source file.
From the input file above we would like to have four records for each student for different subjects in the output. To split the record to multiple records Talend provides an in-built component tSplitRow which split the records from source data flow and output multiple records based on the configuration of the Column Mapping table in its component properties.
Lets look at the Job design:
For those who are not aware of creating metadata of an Excel files, Click here to visit the article.
Once the output metadata for the tSplitRow is defined. Its time to configure the component to derive the new records from source data flow. In the column mapping table click the + sign to add mapping from source to target record. For every mapping record it will generate 1 record from source file. If the source has 10 records and 4 mappings has been defined in the Columns Mapping table in tSplitRow component then the output will have 40 records.
Look at the mapping below to understand, How it works. In the EMP_NO and EMP_NAME I have mapped the values fro the source flow and value will be same for every record. In the SUBJECT mapping I have hard coded the value as per the mapping in the MARKS field.
Now run the Job to check the output.
We have 4 records for every record in the source file. So you have seen, How easy it is to split a single record to multiple using tSplitRow component in +Talend . Let me know, if you have any queries and issue implementing the same.
This article is written by +Vikram Takkar and published on www.vikramtakkar.com, please let me know, if you see this article on any other website/blog.
Hi Vikram thanks for this posting.
ReplyDelete