Automated Data Integration


Washington DC, VA USA. Date of Talk: 2014-10-27


Eric Huang
Luca Ceriani


HiperFuse automates procedures for extracting, cleaning, restructuring, and provisioning data, enabling investigators to spend more of their time on the activities that make the best use of their domain expertise: choosing which hypotheses to test, which correlations to study, and which data to mine. Big data problems are typically put into three buckets: volume, velocity, and variety. HiperFuse tackles variety primarily and volume secondarily. Users only need to declare the state of the input data and the desired output state in order to have data integrated for their analyses. The input state of interest could be deduced, for example, from the data dictionary and quality-control statistics on the dataset. HiperFuse then uses a workflow planning and execution engine to discover and automate the extraction, cleaning, restructuring, and provisioning procedures in the environment that houses the data, freeing the user from specifying each step. This is valuable to any data analyst working with a variety of datasets: it not only speeds up analytical research within PARC, but also helps external clients doing their own analytics across diverse datasets.
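To make the declare-the-states idea concrete, here is a hypothetical sketch (not the actual HiperFuse API, whose internals the abstract does not describe): a minimal state-based planner in which data-preparation operators carry preconditions and effects, and a search chains them from the declared input state to the desired output state. All operator names and state facts below are illustrative assumptions.

```python
# Hypothetical sketch of declarative workflow planning, in the spirit of
# the abstract above. NOT the HiperFuse implementation: operator names,
# state facts, and the search strategy are all assumed for illustration.

from collections import deque

# Each operator lists preconditions (facts that must hold), facts it
# adds, and facts it removes. A state is a frozenset of string facts.
OPERATORS = {
    "extract_csv":    {"pre": {"raw_csv"},              "add": {"tabular"},     "rem": {"raw_csv"}},
    "clean_nulls":    {"pre": {"tabular"},              "add": {"no_nulls"},    "rem": set()},
    "normalize":      {"pre": {"tabular", "no_nulls"},  "add": {"normalized"},  "rem": set()},
    "load_warehouse": {"pre": {"normalized"},           "add": {"provisioned"}, "rem": set()},
}

def plan(input_state, goal):
    """Breadth-first search over operator applications. Returns the
    shortest sequence of operator names transforming input_state into
    a state satisfying every fact in goal, or None if unreachable."""
    start = frozenset(input_state)
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        state, steps = queue.popleft()
        if goal <= state:  # all goal facts satisfied
            return steps
        for name, op in OPERATORS.items():
            if op["pre"] <= state:  # operator is applicable
                nxt = frozenset((state - op["rem"]) | op["add"])
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append((nxt, steps + [name]))
    return None

# The user declares only the input and desired output states; the
# planner discovers the intermediate steps itself.
steps = plan({"raw_csv"}, {"provisioned"})
print(steps)  # ['extract_csv', 'clean_nulls', 'normalize', 'load_warehouse']
```

The point of the sketch is the division of labor the abstract describes: the analyst states *what* the data looks like and *what* form it should end up in, while the planner, not the analyst, decides the sequence of extraction, cleaning, restructuring, and provisioning steps.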
