Does data fuel your organization?

Your decisions, solutions, arguments, approaches, resource usage and, moreover, people’s lives all depend directly on how you handle your big data.

Big Data Management (BDM) is the key to building Data Integration, Data Quality and Data Governance processes for this kind of extensive data platform.

iXperts has expert experience in the field of BDM, gained through several assignments over recent years.

This whitepaper describes a couple of customer value projects and some technical details.

 

Use cases:

*Migrating PowerCenter mappings to BDM to utilize the power of parallel data processing.

*Decommissioning internal data warehouses and moving all of them to Azure or AWS, thus minimizing maintenance cost.

*Creating a data lake for data science and analytics purposes.

*Interacting with real-time sources like Kafka and JMS and processing their data using the cluster’s parallel computing power.

*Implementing full-fledged data quality projects (Informatica big data quality offering).

In the customer assignments above, iXperts has delivered a smooth transition from data to information. The technologies supporting these transitions come from the traditional data domain, such as Informatica PowerCenter, data quality tooling, AddressDoctor and MDM solutions, with (Informatica) Big Data Management as the latest addition to gain maximum customer value.

 

A closer technical look

Informatica Big Data Management (BDM) is a GUI-based integrated development environment that organizations use. It has a built-in Smart Executor that supports various processing engines: Blaze (Informatica’s proprietary engine), Spark, and Hive on MapReduce/Tez (obsolete from BDM 10.2.2). Informatica BDM combines its intelligence for logic conversion with the processing power of the Hadoop cluster nodes, which makes it one of the strongest contenders for big data projects.

 

It fetches data from file sources, from database sources (through Sqoop if JDBC connectivity is available, or through native or ODBC connectivity) and from Hadoop sources such as Hive, HDFS and HBase. For seamless connectivity it adheres to the security features of Hadoop clusters, such as Kerberos, Sentry and Ranger.

 

Informatica mappings are processed as a YARN application, whether they run in Blaze, Spark or Hive execution mode. Each mode has some advantages over the others, depending on the use case. For example, only Blaze can be used for data quality activities such as data profiling and scorecarding.
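To make the engine choice concrete, here is a minimal sketch in Python. It is a toy lookup, not Informatica's Smart Executor: the function name `candidate_engines` and the task labels are assumptions made for illustration. The only rule taken from the text is that data quality activities such as profiling and scorecarding run on Blaze only.

```python
from typing import Tuple

# Hypothetical model for illustration; this is not an Informatica API.
ENGINES: Tuple[str, ...] = ("blaze", "spark", "hive")

# Task labels are assumed names for the data quality activities in the text.
BLAZE_ONLY_TASKS = {"data_profiling", "scorecarding"}


def candidate_engines(task: str) -> Tuple[str, ...]:
    """Return the engines a task could run on under the rule stated above."""
    if task in BLAZE_ONLY_TASKS:
        return ("blaze",)
    # For every other task the text names no restriction, so all engines qualify.
    return ENGINES
```

In a real project the choice among the remaining engines would depend on the use case, which this sketch does not try to model.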

 

Way of working:

  • When mappings run in Hive execution mode on the MR/Tez engine, they are translated to Hive queries before being submitted to the Hadoop cluster, where they are executed as part of a YARN application.
  • When mappings run in Spark execution mode on the Spark engine, they are translated to Scala code and pushed down to the Hadoop cluster, where they are executed as a YARN application.
  • When mappings use the Blaze execution engine, they are compiled and translated to traditional Informatica code and then handed to the Hadoop cluster for execution. The very first Blaze job launches the Blaze Grid Manager as a YARN application (analogous to the IDQ Data Integration Service) before the mapping is executed. As long as the Blaze Grid Manager is up and running, it handles all subsequent mapping execution requests.
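The three execution paths above can be pictured with a small simulation. This is a simplified sketch, not Informatica code: `ToyCluster`, `submit` and the mapping names are hypothetical, and the model only records which YARN applications get launched. In particular, it shows the Blaze Grid Manager being started once by the first Blaze job and reused afterwards.

```python
from dataclasses import dataclass, field
from typing import List

# What each execution mode translates a mapping into, per the list above.
TRANSLATION_TARGET = {
    "hive": "Hive queries",
    "spark": "Scala code",
    "blaze": "Informatica code",
}


@dataclass
class ToyCluster:
    """Hypothetical stand-in for a Hadoop cluster; it only records YARN apps."""
    yarn_apps: List[str] = field(default_factory=list)
    grid_manager_running: bool = False

    def submit(self, mapping: str, mode: str) -> str:
        target = TRANSLATION_TARGET[mode]  # raises KeyError for an unknown mode
        if mode == "blaze":
            # Only the first Blaze job launches the grid manager; later jobs reuse it.
            if not self.grid_manager_running:
                self.yarn_apps.append("blaze-grid-manager")
                self.grid_manager_running = True
            return f"{mapping}: {target}, handled by the grid manager"
        # Simplification: each Hive or Spark mapping is recorded as one YARN app.
        self.yarn_apps.append(f"{mode}:{mapping}")
        return f"{mapping}: {target}, run as a YARN application"


cluster = ToyCluster()
cluster.submit("m_load_customers", "blaze")  # launches the grid manager
cluster.submit("m_load_orders", "blaze")     # reuses the running grid manager
cluster.submit("m_aggregate", "spark")
print(cluster.yarn_apps)  # ['blaze-grid-manager', 'spark:m_aggregate']
```

The point of the sketch is the start-up cost: a long-running grid manager is paid for once, while the Hive and Spark paths are modelled here as one application per mapping.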

Why do our customers value Informatica BDM?

Feedback from our customers and our own experience show that:

  • Development is well supported by a great GUI
  • Informatica BDM contains best-in-class features
  • Project and end-user support is optimal:
    • Excellent support and a vast online community
    • Very well-written knowledge base articles, and much more

 

We at iXperts are already delivering this solution to many organizations and are happy to support you in your data transition.

Thanks, Suraj