
Big Data, Better Decisions
New types of data sources are constantly becoming relevant, data volumes continue to explode, and data generation never ceases. From social networks to mobile to the IoT, Big Data is everywhere and requires your organization to manage information in new and different ways. This challenge presents an incredible opportunity to acquire, blend, and enrich data using tools and techniques that can provide valuable insights. These insights allow enterprises to predict customer behavior, detect fraud, optimize spend, recommend products and services, and more. Learn not only what happened, but why it happened, and most importantly, what should be done differently in the future. That’s the amazing power of Big Data!

Our Big Data and Information Management Services Include:


  • Design and implement Big Data architectures to meet challenges related to storage, management, processing, and visualization of data. Processing enterprise and social data poses four primary challenges: volume, variety, velocity, and veracity. Centrifuge LLC has established skills in implementing Big Data solutions that address all of these areas.
  • Design, develop, test, and implement Big Data projects on premises and in the cloud using tools such as Hadoop, Spark, NoSQL databases, distributed queues, streaming technologies, and distributed search engines.
  • Uncompromised third-party evaluation: we evaluate software, personnel, and processes for system optimization and human productivity.


  • Data Integration Pipelines: Design and build data pipelines that integrate structured data from relational databases using pull-based (e.g. Sqoop) and push-based (e.g. Oracle GoldenGate Big Data Adapters) techniques with unstructured data acquired from various streaming technologies, process those data using custom transformations and security tags, and persist them in Hadoop (e.g. HDFS, HBase)
  • Build scalable, fault-tolerant streaming solutions using Kafka, Spark Streaming, Storm, and Samza (following Lambda and Kappa architectures)
  • Design and build distributed, scalable, and highly available search engines (using technologies like Elasticsearch and Solr)


  • Analytics for evidence-based support in decision making, quality care, performance management, fraud detection, and business intelligence (BI): identify recurring patterns, build metrics, and implement BI portals that improve efficiency and cost savings.
  • Explore and understand complex relationships between entities using graph databases (e.g. Titan and Neo4j)


  • Exploratory Data Analysis using tools such as Platfora, Zoomdata, Tableau, R, Python, SPSS, and SAS
  • Design and build custom web applications to visualize data in Hadoop, NoSQL databases, and search engines using tools such as D3.js, Hive, Spark SQL, HBase, Impala, Elasticsearch, and Kibana