Having set up Siebel and Oracle Enterprise Data Quality (from now on known as EDQ), I wanted to put it through its paces.
Real Time De-duplication works like a charm, provided you kick off the Real Time jobs in Director and have your Web Service URLs set up correctly.
Batch De-duplication, however, uses a different mechanism (JMX), and the out-of-the-box installation and configuration doesn’t quite leave you in a position to run batch dedupe through the Siebel Client.
After a really useful conversation with Mike, Nick and Richard at Oracle (experts in EDQ and its integration into Siebel), I was able to make appropriate changes to the configuration to enable batch de-duplication. My heartfelt thanks go out to them all for their dedication and commitment to helping lowly developers like myself!
JMX Port Configuration
By default, EDQ on Windows configures JMX to listen on port 9005. However, by default the Siebel Connector is configured to look on port 8090.
- Modify the entry in dnd.properties to match the port specified in director.properties. For example:
jmxserver = hostname:9005
- There is no need to restart anything. The next job to use the DQ Connector will automatically re-read the configuration.
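Before submitting a job, it can be handy to confirm that the host and port in the jmxserver entry are actually reachable from the Siebel server. The following is a minimal sketch, not part of EDQ or the Siebel Connector: the helper names parse_jmxserver and port_is_open are my own, and it assumes the entry has the simple "jmxserver = host:port" shape shown above. It only checks that something accepts TCP connections on that port; it does not prove that the listener is JMX.

```python
import socket


def parse_jmxserver(properties_text):
    """Return (host, port) from a 'jmxserver = host:port' line, or None.

    Hypothetical helper: assumes the simple key = value layout shown
    in the post, with '#' marking comment lines.
    """
    for line in properties_text.splitlines():
        line = line.strip()
        if line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        if key.strip() == "jmxserver":
            host, _, port = value.strip().rpartition(":")
            return host, int(port)
    return None


def port_is_open(host, port, timeout=3.0):
    """True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False
```

To use it, read the contents of dnd.properties into a string, pass it to parse_jmxserver, and hand the resulting host and port to port_is_open. A False result suggests the port mismatch (or a firewall) is still in play.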
By default, the JRE used by EDQ publishes the JMX interface on localhost (127.0.0.1). This may be down to my setup on VirtualBox, so it may or may not cause you a problem. However, if you see errors in the connector log relating to connecting to JMX, you may be experiencing this problem.
- Create a new file called jre.properties in the same folder as the director.properties file on the EDQ server. Using the default installation, this will be in:
- Within the file, add the following configuration item:
java.rmi.server.hostname = <EDQ HostName>
- Restart the Datanomic Application Server service
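If you are unsure whether this applies to you, a quick check on the EDQ server is to see what address its own host name resolves to. The sketch below is an assumption on my part about how to diagnose the symptom, not something taken from the EDQ documentation; resolves_to_loopback is a hypothetical helper name. If the name you would put in java.rmi.server.hostname resolves to a 127.x.x.x address, remote clients such as the Siebel Connector are unlikely to be able to use it.

```python
import ipaddress
import socket


def resolves_to_loopback(hostname):
    """True if hostname resolves to a loopback IPv4 address (127.0.0.0/8)."""
    address = socket.gethostbyname(hostname)
    return ipaddress.ip_address(address).is_loopback
```

Running resolves_to_loopback(socket.gethostname()) on the EDQ server shows what the machine thinks of its own name. Running it from the Siebel server with the EDQ host name shows what the connector will see.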
Test Batch DeDuplication
Testing is now straightforward:
- From within Siebel, navigate to Site Map > Administration – Server Management > Jobs
- Create a new job, using the ‘Batch Account match’ template
- Submit the job and await completion
- Navigate to Site Map > Administration – Data Quality > Duplicate Accounts
- See your deduplicated data and merge!
Having now used EDQ alongside Siebel, I am really impressed. Previous DQ attempts have felt clunky, but EDQ fits nicely alongside Siebel. The real time deduplication works well and is very easy to configure. Batch cleansing and deduplication also work flawlessly, once the tweaks above have been applied.
I get the impression that Oracle are really committed to this software as a solution, too. Whereas SSA-NAME5 and ISS seemed like stopgap solutions, EDQ is feeling like an integrated technology and something that Oracle are building into their Fusion and Siebel roadmaps. Here’s hoping!