Workspace paths to avoid:
c:\Open Project\Talend Open Studio\workspace
d:\My Projects\Talend\workspace
Recommended workspace paths:
c:\Talend\workspace
d:\OpenProject\Talend\workspace
c:\MyProject\repository
It is always recommended to avoid spaces anywhere in the Talend workspace path. Spaces in the path tend to cause issues, so they are best avoided.
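As a small illustration of the rule above, a path can be checked for spaces before it is chosen as a workspace. This is only a sketch in plain Java; the class and method names are hypothetical and are not part of Talend.

```java
// Hypothetical helper, not part of Talend: flags workspace paths
// that contain a space character.
final class WorkspacePathCheck {

    // Returns true when the given path contains at least one space.
    static boolean hasSpace(String path) {
        return path.indexOf(' ') >= 0;
    }

    public static void main(String[] args) {
        // Paths taken from the lists above (backslashes escaped for Java).
        System.out.println(hasSpace("c:\\Open Project\\Talend Open Studio\\workspace"));
        System.out.println(hasSpace("d:\\OpenProject\\Talend\\workspace"));
    }
}
```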
2. Never forget to perform Null Handling.
Example #1 – Bad
if(myString.length() > 0)
    System.out.println(myString.toUpperCase());
Example #2 – Good
if(!Relational.ISNULL(myString) && myString.length() > 0)
    System.out.println(myString.toUpperCase());
Always perform null handling for any field that is going to be used in an expression. Otherwise the Talend Job will throw a NullPointerException whenever that field is null.
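The same check can be tried outside Talend Studio. The sketch below is plain Java and uses a direct null comparison in place of the Relational.ISNULL routine from the example above; the class and method names are hypothetical.

```java
// Hypothetical helper, not part of Talend: a plain-Java version of the
// "Good" example that never dereferences a null string.
final class NullSafeExample {

    // Returns the upper-cased value, or an empty string when the
    // input is null or empty.
    static String toUpperOrEmpty(String myString) {
        if (myString != null && myString.length() > 0) {
            return myString.toUpperCase();
        }
        return "";
    }

    public static void main(String[] args) {
        System.out.println(toUpperOrEmpty("talend"));
        // A null input falls through to the empty-string branch
        // instead of throwing a NullPointerException.
        System.out.println(toUpperOrEmpty(null).isEmpty());
    }
}
```

Because the null check comes first in the condition, the length() call is never reached for a null value.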
3. Create Repository Metadata for DB connections and retrieve database table schema for DB tables.
It allows you to quickly retrieve the schema of database tables and helps rapid development. Creating the schema for each database table one by one takes a long time. Click here for more details on creating DB connections and retrieving schema.
4. Use Repository Schema for Files/DB and DB connections.
It allows you to change the schema in one place, without having to change it in every job. You also don't need to open every job to find out whether the changed schema is part of that Job. A change made in one place in the Repository lets you update every job that uses the schema. Click on the links below to know how to create Repository metadata:
Creating Metadata for Delimited files.
Creating Metadata for XML files.
Creating Metadata for Excel files.
5. Create Database connection using t<Vendor>Connection component and use this connection in the Job. Do not make a new connection with every component.
Most databases have a maximum connection limit. If your Talend Job uses multiple database components and each one opens its own connection, the Job may fail because the maximum number of allowed connections is exceeded. Click here to know how to share a database connection.
6. Always close the connection to database using t<Vendor>Close component.
7. Create a Repository Document corresponding to every Talend job including revision history.
This will allow you to track the changes done on any Talend Job.
Talend Job Description:
In this paragraph, write down the high level description and functionality of Talend Job
Revision History
1.0 04-10-2014 Initial Development
1.1 07-10-2014 Modification to Source and Target repository Schema
1.2 09-10-2014 Modification to transformation logic.
8. Provide Sub Job title for every sub job to describe the sub job purpose/objective.
9. Avoid Hard Coding in Talend Job component. Instead use Talend context variables.
10. Create Context Groups in Repository
A Context Group allows you to use the same context variables in any number of jobs without having to create them and assign their values again. Imagine your project requires 20 context variables and there are 10 jobs that require them. Without context groups, you would have to create those context variables again and again in every job.
You can create different context groups for different kinds of variables. For example, you can have separate context groups for database parameters, SMTP parameters and SFTP parameters.
Click on links below to know more about context variables and context
groups:
1. Understand Context Variables Part 1 (Context Variables, Context groups)
2. Understand Context Variables Part 2 (Define context variables in Repository, which can be made available to multiple jobs)
3. Understand Context Variables Part 3 (Populate the values of context variables from file. tContextLoad)
4. How to Pass Context Variables to Child Jobs.
5. How to Pass context Variables/ Parameters through command line.
Always provide the values of context variables either through a database table or through a Talend.properties file. Below is a sample of a Talend.properties file.
Click here to understand how to populate the values of context variables from a file using the tContextLoad component.
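As a hedged illustration of the idea, the sketch below parses key=value text of the kind a properties file holds, using the standard java.util.Properties class. The key names (db_host, db_port) and the class name are hypothetical and are not taken from the author's file, and this is plain Java rather than the code that tContextLoad generates.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical sketch: reads key=value pairs the way a properties
// file would supply them to a job.
final class ContextFileSketch {

    // Parses key=value text into a Properties object.
    static Properties load(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            // Reading from an in-memory string should not fail;
            // rethrow unchecked to keep the sketch simple.
            throw new UncheckedIOException(e);
        }
        return props;
    }

    public static void main(String[] args) {
        // Assumed sample content; the keys are illustrative only.
        String sample = "db_host=localhost\n" + "db_port=3306\n";
        Properties ctx = load(sample);
        System.out.println(ctx.getProperty("db_host"));
        System.out.println(Integer.parseInt(ctx.getProperty("db_port")));
    }
}
```

Keeping such values in a file means they can change between environments without editing the job itself.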
12. Create Variables in tMap and use the variables to assign the values to target fields.
When a single expression is used more than once, or the same mapping feeds multiple target fields, it is always good to create a variable in tMap and assign that variable to the target fields. The expression is then evaluated only once instead of once for every field that uses it.
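The benefit is the same as storing a result in a local variable in ordinary Java. The sketch below is not tMap itself; it only mimics the pattern, and the class, method, and field names are hypothetical.

```java
// Hypothetical sketch of the "evaluate once, reuse" pattern that a
// tMap variable provides.
final class TMapVariableSketch {

    // Stand-in for a shared expression: trims and upper-cases a name.
    static String normalize(String raw) {
        return raw == null ? "" : raw.trim().toUpperCase();
    }

    // Maps one input value to two target fields.
    static String[] mapRow(String rawName) {
        // Evaluated once and held in a variable, like a tMap Var entry.
        String cleanName = normalize(rawName);
        // Both target fields reuse the variable instead of repeating
        // the expression.
        String displayName = cleanName;
        String searchKey = cleanName.replace(' ', '_');
        return new String[] { displayName, searchKey };
    }

    public static void main(String[] args) {
        String[] row = mapRow("  john doe ");
        System.out.println(row[0]);
        System.out.println(row[1]);
    }
}
```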
13. Create user routines/functions for common transformation and validation.
Always create routines/functions for all common transformations and validation rules that can be used in multiple Talend jobs.
Click here to know how to create user routines and functions.
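A user routine is essentially a Java class with static methods that can be called from component expressions. The following is a minimal sketch of what such a routine might contain; the class and method names are hypothetical, and the package declaration that Talend Studio would normally add is left out so the example compiles on its own.

```java
// Hypothetical routine class with two reusable validation helpers.
final class StringValidationRoutines {

    // Returns true when the value is null, empty, or only whitespace.
    public static boolean isBlank(String value) {
        return value == null || value.trim().isEmpty();
    }

    // Returns the trimmed value, or the fallback when the value is blank.
    public static String defaultIfBlank(String value, String fallback) {
        return isBlank(value) ? fallback : value.trim();
    }

    public static void main(String[] args) {
        System.out.println(isBlank("   "));
        System.out.println(defaultIfBlank(null, "N/A"));
        System.out.println(defaultIfBlank(" abc ", "N/A"));
    }
}
```

Once such helpers live in one routine, every job can call them instead of repeating the same checks in each component.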
14. Develop Talend job iteratively.
Divide the Talend Job into multiple sub jobs for easy maintainability. First create a sub job, test it, and then move on to the next sub job.
15. Always exit Talend Open Studio before shutting down the PC.
The Talend workspace may sometimes get corrupted if you shut down your machine before exiting Talend Open Studio. So always exit Talend before shutting down the PC.
16. Always rename Main Flows in Talend Job to meaningful names.
Thanks to +Balázs Gunics for this point. It is always good to rename the main flows in a Talend Job to more meaningful names, so that when you refer to the fields in a tMap component or use tFlowIterate, it is easy to understand which data is coming from which flow.
17. Always design Talend jobs by keeping performance in mind.
Thanks to +Viral Patel for this point. It is recommended to design the job by keeping performance of the job in mind. Visit this link to know "How to optimize the job in order to improve the performance".
Please let me know your thoughts on these points and also let me know, if you feel I have missed something.
This article is written by +Vikram Takkar and published on www.vikramtakkar.com, please let me know, if you see this article on any other website/blog.