Talend Open Studio and EIM

I discovered an amazing Open Source ETL tool the other day called Talend Open Studio. We’re working on a mini project at the moment to bring Contact and Account data in from Excel spreadsheets and thought this a good chance to try it out.

The tool itself is very intuitive: it is based on the well-proven Eclipse platform and offers a drag-and-drop graphical interface for constructing business processes and jobs.

It’s well suited to loading EIM tables: set up source and destination metadata (data source) entries for your input data and Siebel database, then bring in the EIM schemas for the destination tables that you’re interested in. Components can then be dragged in and linked between the two to perform whatever mapping and transformation steps you need. If you’re familiar with Actuate in the Siebel context, you’ll have a real head start, as the concept of ‘rows’ and their flow through the components is key. A little sample of one of the jobs I’m currently working on is below:

Projects can be created in both Java and Perl flavours: Java is somewhat more familiar to me, but Perl sneaks ahead on performance and general scalability. A limitation in Java prevents overly complex jobs, which made Perl the only choice on this occasion. Mapping expressions can be written, in the language of your choosing, to perform all sorts of complex transformations. You can even create your own functions and components if you need to do something particularly fancy. I found the mapping component (tMap) suited pretty much everything I needed on this occasion.
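
To give a feel for the ‘rows flowing through components’ idea without opening Talend, here is a rough sketch in plain Python. It is not Talend-generated code and is not taken from the job I’m working on: each function simply stands in for a component, and the last one plays the role of a tMap-style mapping. The spreadsheet headers, the batch number and the EIM-style column names (CON_FST_NAME and friends) are assumptions for illustration only, so check them against the EIM table definition in your own Siebel repository.

```python
import csv
import io

# Hypothetical input; a real spreadsheet's headers will differ.
SAMPLE_CSV = """First Name,Last Name,Email
 ada ,Lovelace,ADA@EXAMPLE.COM
Grace,Hopper,
"""

BATCH_NUM = 100  # arbitrary batch number chosen for this sketch


def read_rows(text):
    """Input component: yield one dict per spreadsheet row."""
    yield from csv.DictReader(io.StringIO(text))


def clean(rows):
    """Transformation step: trim whitespace and lower-case e-mail addresses."""
    for row in rows:
        row = {key: (value or "").strip() for key, value in row.items()}
        row["Email"] = row["Email"].lower()
        yield row


def to_eim_contact(rows, batch_num):
    """Mapping step: source columns -> EIM-style columns.

    The target column names are assumed for illustration and are not
    checked against a real Siebel schema.
    """
    for number, row in enumerate(rows, start=1):
        yield {
            "ROW_ID": str(number),
            "IF_ROW_BATCH_NUM": batch_num,
            "IF_ROW_STAT": "FOR_IMPORT",
            "CON_FST_NAME": row["First Name"],
            "CON_LAST_NAME": row["Last Name"],
            "CON_EMAIL_ADDR": row["Email"] or None,  # empty string becomes NULL
        }


# Chain the steps together, much as components are linked in a job.
eim_rows = list(to_eim_contact(clean(read_rows(SAMPLE_CSV)), BATCH_NUM))
for eim_row in eim_rows:
    print(eim_row)
```

Each step only ever sees one row at a time, which is roughly how I think about the links between components. In a real job the final step would write to the EIM table rather than print.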

I suggest downloading a copy, setting up a local test project and having a mess about – I’d be really interested to hear from anyone who finds the tool useful. I’m considering developing a suite of Siebel components – the tool currently supports Sugar CRM, SAP and a number of others out of the box – and a community approach to this would be excellent.

Comments

  1. Hi,

    I found this post really helpful, thanks for writing this up.
    Your job design looks pretty cool :)

    I am currently using Talend Open Source for a HUGE Siebel ETL project, this seemed like the best option, considering all the easily doable transformations and complex processing Jobs and a LOT more…

    I’d be interested to hear about your experiences and challenges that you might have faced during your data project.

    Look forward to hearing from you!

    Cheers!
    Royston
    roy.goveia@gmail.com

  2. Oli says:

    Hi Royston,

    Thank you for the kind comments and thank you for taking the time to post! It’s great to see Open Source tools making themselves useful in our Siebel world – I agree that Talend is a fantastic product and really well suited to data migration and EIM-related tasks. I noticed the other day that Talend have released some Data Quality tools – I’d be most interested to see how we could integrate those into Siebel. At the end of the day, I’d much rather my clients used Open Source than paid huge amounts of money for closed source tools.

    If you’d like to contribute a post on the work you’ve done with Talend that would be great – I’d be happy to make you an honorary contributor! :)

    • gpinkham says:

      Did you end up doing any Data Quality with Talend and Siebel? I’m going to be looking into this as well and was curious.

      thanks!
      Gary

      • Oli says:

        Hi Gary,

        We didn’t do anything Data Quality related with Talend – in fact, we failed to find any robust solution that would work nicely with Siebel. We tried Oracle Data Quality Manager as part of a Siebel 8 upgrade, but found it far too complex, clunky and unreliable. Part of the problem with DQ in Siebel is trapping all of the data input events. For example, entering data in the ‘Quick Add’ applet on an entity home page would fail to trigger DQ deduplication.

        Regards,

        Oli

  3. Anne says:

    Whether open source or closed, it all comes down to a tool being suitable for the job at hand. The more flexible the licence is, the more easily I can install and apply it as part of a solution. This is often the reason I prefer an open source tool over a closed source one.

    Talend Open Studio has been a great help when I needed to pump data from Oracle into PostgreSQL and do some transformation in the same process. However, as a standalone open source application it would not be suitable for a large-scale ETL project. In that case a more scalable solution like Talend Integration, Informatica or, in the Open Source world, Pentaho would be better.

  4. Humphaz says:

    Could I ask, have you done more with Talend with Siebel?

    Would it be possible to expand this:

    “simply set up source and destination metadata (data source) entries for your input data and Siebel database then bring in the EIM schemas for the destination tables that you’re interested in.”

    Did you have to set up the mappings in EIM?

    Did you include the Siebel Java components in Talend? (I’m assuming not, but I have seen this done.)

    Would really love to hear some more on your experiences in this specific instance.

    • Oli says:

      Hi Humphaz and welcome to the blog!

      To be honest, I have done nothing more with Talend since I posted that article.

      However, I still recall it being an exceptionally powerful tool that seemed to work really well with Siebel. The ‘Metadata’ object types in Talend include all sorts of mechanisms to represent data: in this instance, an Oracle connection to a Siebel database for loading EIM tables, or a CSV file of data to be imported. It’s then possible to pull out tables from a schema and represent them in Talend jobs and component instances. Columns from the input source, be it a spreadsheet or another database, can then be manually mapped onto Siebel table columns, in this case EIM table columns. You can set up all sorts of transformations, using chained components to manipulate the data before it is loaded. Obviously, mapping from the data source (whatever that might be) to the EIM tables is something you have to do manually – Talend and Siebel have no prior knowledge of what’s stored in that input source. The EIM component then takes care of loading into the Siebel base tables.

      No Siebel java components required – ‘connectivity’ into Siebel is directly through the database, via EIM. However, you could certainly employ the Siebel Java Data Bean to manipulate data directly through the Siebel OM. I haven’t tried this though.

      Downloading Talend is free and messing around with it is a great way to see what it can do.

      Regards,

      Oli
