Harnessing the data deluge with automated data preparation.

A week’s worth of manual data preparation in minutes.

The Problem

Data scientists are employed to obtain insights from data, but they spend far more time on data preparation than on data analysis. Data preparation is the process of discovering, cleaning and integrating data sets prior to analysis.

Data Preparation

Data preparation can take 80% of a data scientist’s time. It is so time-consuming because it involves specifying and coordinating several data selection, integration and cleaning steps.


Data preparation is essentially a software development activity that involves writing transformation scripts, configuring components, and connecting those components together.
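To make this concrete, here is a minimal sketch of what a hand-written preparation script can look like. It is illustrative only: the two data sets, their field names and the helper functions (clean_listing, select_complete, integrate, prepare) are hypothetical and are not taken from any product. Each step is written by hand, and so is the order in which the steps are connected.

```python
# Hypothetical sources: two small data sets with different schemas.
agency_listings = [
    {"addr": " 12 Oak Road ", "postcode": "m1 1aa", "price": "250,000"},
    {"addr": "3 Elm Street", "postcode": "M2 2BB", "price": ""},
]
postcode_stats = [
    {"pc": "M1 1AA", "crime_rank": 7},
    {"pc": "M2 2BB", "crime_rank": 3},
]


def clean_listing(row):
    """Cleaning step: trim text, normalise the postcode, parse the price."""
    price_text = row["price"].replace(",", "").strip()
    return {
        "address": row["addr"].strip(),
        "postcode": row["postcode"].strip().upper(),
        "price": int(price_text) if price_text else None,
    }


def select_complete(rows):
    """Selection step: keep only rows that have a price."""
    return [r for r in rows if r["price"] is not None]


def integrate(listings, stats):
    """Integration step: join listings to statistics on postcode."""
    rank_by_postcode = {s["pc"]: s["crime_rank"] for s in stats}
    return [
        {**listing, "crime_rank": rank_by_postcode.get(listing["postcode"])}
        for listing in listings
    ]


def prepare(listings, stats):
    """Connect the steps into one pipeline; the order is a manual decision."""
    cleaned = [clean_listing(r) for r in listings]
    return integrate(select_complete(cleaned), stats)


result = prepare(agency_listings, postcode_stats)
print(result)
```

Even in this toy case, the author has to decide how to normalise each field, which rows to discard, which key to join on, and in what order to run the steps. Those are the decisions that multiply as the sources grow.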

Commercial Products

Data preparation tools typically support data scientists by providing a visual programming interface and a library of components. However, both tactical and fine-grained decisions remain with the data scientist.

A New Approach

The Data Value Factory provides Data Preparer, a new approach that avoids the laborious hand-crafting of data preparation programs. With Data Preparer, you:

Describe what you need

Provide data sources, a target structure, example data and quality priorities.

Hand over to Data Preparer

Data Preparer explores how the data sources relate to each other and the target, and populates the target from the sources.

Refine the result

Provide feedback on the automatically produced result or revise priorities, and Data Preparer will derive a new result.

