Monday 6 May 2013

How to Execute multiple SQL queries in Talend

Lot of time we need to execute multiple SQL queries in a ETL Job. For instance, we need to setup database table on machine where we do not have direct access to the database. Also, we may need to truncate or drop temporary tables that are created as part of the ETL job for processing and temporary purpose. We can easily achieve this using Talend’s tMy<xxx>Row component. Here xxx refers to the database type. e.g. for MySQL the name of component is tMySQLRow and for oracle its tOracleRow.


We can execute multiple SQL database queries in Talend using one of the two techniques mentioned below:


1. Embedded SQL statements in Talend Job. (tMySQLRow)

2. SQL statements in file which will be read in Talend job.

1. Embedded SQL statements in Talend Job:  In this method we hard code the SQL statements in the Talend tMySQLRow component.









We have to set the additional JDBC parameters to allow multiple queries to be executed. To do this enter "allowMultiQueries=true" in the Additional JDBC parameters text box on the tMySQLConnection component.



Enter the multiple SQL separated by semicolon “;” in the Query text box.



This method is generally used when we have specific set of SQL queries that need to be executed in the Talend Job. We can not change the queries with every run.

2. SQL statements in file which will be read in Talend job. In this method we keep the SQL queries to be executed in the external file. In this we can control what all queries should run when the Job executes.



In this Talend Job, component tFileInputFullRow will read each SQL query from the file and then tMySqlRow will execute the query one by one.

To execute the queries one by one dynamically, we need to enter row1.query (query is the name of the filed to be fetched from tFileInputFullRow component) in the Query box instead of the hard coded queries. What we are doing here is that we are reading SQL queries one by one and passing it to tMySQLRow component for execution.



We can use this Job as a sub job and pass the path of the input file having multiple SQL queries as context parameter to this child job. In this way we can use this as a generic child job multiple times in a project. This will help us to execute multiple SQL queries, when we want to control the sql queries to be executed.

Lets Demonstrate both techniques using an example:

For demonstration I am going to take following emp_details table.Schema of the table is shown below:

desc talend_demo.emp_details;









Initially we do not have any data in the table.

Select * from talend_demo.emp_details;
/* 0 rows affected, 0 rows found. Duration for 1 query: 0.000 sec. */

We will run following SQL statements:

INSERT INTO talend_demo.emp_details VALUES (101,"John Miller",10,20000);
INSERT INTO talend_demo.emp_details VALUES (102,"David Kate",11,30000);
INSERT INTO talend_demo.emp_details VALUES (103,"Alex M",10,40000);
INSERT INTO talend_demo.emp_details VALUES (104,"Brad L",10,10000);
INSERT INTO talend_demo.emp_details VALUES (105,"Angel K",12,20000);
INSERT INTO talend_demo.emp_details VALUES (106,"Steve L",11,25000);
ALTER TABLE talend_demo.emp_details MODIFY emp_dept int;


Technique 1.  Embedded SQL statements in Talend Job. (tMySQLRow)

Execute the Job

Select * from talend_demo.emp_details;
/* 0 rows affected, 7 rows found. Duration for 1 query: 0.000 sec. */













You can see that 6 records has been added to the table as per following  insert statements.

INSERT INTO talend_demo.emp_details VALUES (101,"John Miller",10,20000);
INSERT INTO talend_demo.emp_details VALUES (102,"David Kate",11,30000);
INSERT INTO talend_demo.emp_details VALUES (103,"Alex M",10,40000);
INSERT INTO talend_demo.emp_details VALUES (104,"Brad L",10,10000);
INSERT INTO talend_demo.emp_details VALUES (105,"Angel K",12,20000);
INSERT INTO talend_demo.emp_details VALUES (106,"Steve L",11,25000);

desc talend_demo.emp_details;
/* 0 rows affected, 4 rows found. Duration for 1 query: 0.000 sec. */

Similarly, Table structure has been altered as per following query:

ALTER TABLE talend_demo.emp_details MODIFY emp_dept int;

Technique 2. SQL statements in file which will be read in Talend job.

Execute the Job



Select * from talend_demo.emp_details;
/* 0 rows affected, 7 rows found. Duration for 1 query: 0.000 sec. */


You can see that 6 records has been added to the table as per following  insert statements.

INSERT INTO talend_demo.emp_details VALUES (101,"John Miller",10,20000);
INSERT INTO talend_demo.emp_details VALUES (102,"David Kate",11,30000);
INSERT INTO talend_demo.emp_details VALUES (103,"Alex M",10,40000);
INSERT INTO talend_demo.emp_details VALUES (104,"Brad L",10,10000);
INSERT INTO talend_demo.emp_details VALUES (105,"Angel K",12,20000);
INSERT INTO talend_demo.emp_details VALUES (106,"Steve L",11,25000);

desc talend_demo.emp_details;
/* 0 rows affected, 4 rows found. Duration for 1 query: 0.000 sec. */


Similarly, Table structure has been altered as per following query:

ALTER TABLE talend_demo.emp_details MODIFY emp_dept int;

Let me know, if you have followed any other approach to execute multiple SQL queries in Talend. Looking forward to your comments.

16 comments:

  1. If I try to run with toraclerow component and using technique 2, Im getting error like invalid SQL statement

    ReplyDelete
    Replies
    1. My Querry was : "Select 'YourMoM' from dual;"

      Delete
    2. I guess you should run some query which doesn't expect any results. Something like DML statements such as ALTER, INSERT etc

      Delete
  2. How to execute multiple queries in tredshiftrow ??

    ReplyDelete
  3. This comment has been removed by the author.

    ReplyDelete
  4. I just recently discovered your blog and have now scrolled through the entire thing several times. I am very impressed and inspired by your skill and creativity, and your "style" is very much in line with mine. I hope you keep blogging and sharing your design idea
    data science training in chennai

    data science training in velachery

    android training in chennai

    android training in velachery

    devops training in chennai

    devops training in velachery

    artificial intelligence training in chennai

    artificial intelligence training in velachery

    ReplyDelete
  5. Great tips and very easy to understand. This will definitely be very useful for me when I get a chance to start my blog. ExcelR Data Science Course In Pune

    ReplyDelete
  6. Thanks a lot for all your valuable articles! We are really happy about your thoughts...
    Digital marketing Training in Bangalore

    ReplyDelete