Harnessing the data deluge with automated data preparation.

A week’s worth of manual data preparation in minutes.

The Problem

Data scientists are employed to obtain insights from data, but spend far less time on data analysis than on data preparation. Data preparation, also known as data wrangling, is the process of discovering, cleaning and integrating data sets prior to analysis.

Data Preparation

Preparing data can take 80% of a data scientist’s time. It is so time consuming because it involves specifying and coordinating several data selection, integration and cleaning steps.

Programming

Preparing data is often a software development activity. Programming for data preparation involves writing transformation scripts, configuring components, and connecting these components together.
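To give a feel for this kind of hand-written work, here is a minimal sketch of a transformation script in Python. The two sources, their field names and the cleaning rules are all hypothetical and chosen only for illustration; real scripts are usually far longer and need revisiting whenever a source changes.

```python
# Hypothetical example data: neither the sources nor the field names
# come from any real product or data set.
agency_listings = [
    {"address": "12 High St", "postcode": "m1 1aa", "price": "£250,000"},
    {"address": "3 Mill Lane", "postcode": "M2 2BB ", "price": "£180,500"},
]
postcode_areas = [
    {"postcode": "M1 1AA", "area": "City Centre"},
    {"postcode": "M2 2BB", "area": "Deansgate"},
]


def clean_postcode(raw):
    """Normalise case and spacing so postcodes from different sources match."""
    return " ".join(raw.upper().split())


def clean_price(raw):
    """Turn a formatted price such as '£250,000' into an integer."""
    return int(raw.replace("£", "").replace(",", ""))


def prepare(listings, areas):
    """Clean each listing, then join it to its area by postcode."""
    area_by_postcode = {clean_postcode(a["postcode"]): a["area"] for a in areas}
    prepared = []
    for row in listings:
        postcode = clean_postcode(row["postcode"])
        prepared.append({
            "address": row["address"],
            "postcode": postcode,
            "price": clean_price(row["price"]),
            # None when the postcode has no match in the second source.
            "area": area_by_postcode.get(postcode),
        })
    return prepared


result = prepare(agency_listings, postcode_areas)
```

Even in this small case, the cleaning rules and the join key were decided and coded by hand.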

Commercial Tools

Data preparation tools typically support data scientists by providing a visual programming interface and a library of components. However, both tactical and fine-grained decisions remain with the data scientist.

The Data Value Factory provides Data Preparer, a new approach that avoids the laborious hand-crafting of data preparation programs. In Data Preparer, you:

Describe what you need

Provide data sources, a target structure, quality priorities and example data. The target structure and quality priorities make explicit what you need, and the example data provides evidence that is used by Data Preparer to clean and integrate the data from the sources.
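The four inputs can be pictured as a single declarative description. The sketch below is only an illustration in Python: the class `PreparationRequest`, its field names and the priority labels are assumptions made for this example and are not Data Preparer's actual interface.

```python
from dataclasses import dataclass, field


@dataclass
class PreparationRequest:
    """Hypothetical container for the four inputs described above."""
    sources: list              # names or paths of the data sources
    target_columns: list       # the target structure, as column names
    quality_priorities: dict   # e.g. relative weights per quality criterion
    examples: list = field(default_factory=list)  # example rows as dicts

    def validate(self):
        """Return a list of problems; an empty list means the request is usable."""
        problems = []
        if not self.sources:
            problems.append("no data sources given")
        if not self.target_columns:
            problems.append("no target structure given")
        for example in self.examples:
            unknown = set(example) - set(self.target_columns)
            if unknown:
                problems.append(
                    f"example uses columns not in target: {sorted(unknown)}"
                )
        return problems


# Hypothetical usage: file names, columns and weights are made up.
request = PreparationRequest(
    sources=["agency_listings.csv", "postcode_areas.csv"],
    target_columns=["address", "postcode", "price", "area"],
    quality_priorities={"completeness": 0.7, "consistency": 0.3},
    examples=[{"postcode": "M1 1AA", "area": "City Centre"}],
)
problems = request.validate()
```

The description states what is wanted and contains no transformation steps, in contrast with a hand-written script.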

Hand over to Data Preparer

Data Preparer explores how the data sources relate to each other and to the target, and populates the target from the sources. In populating the target, Data Preparer explores different ways in which the sources can be combined, and reformats data to increase consistency.

Refine the result

Provide feedback on the automatically produced result or revise priorities, and Data Preparer will derive a new result. Data Preparer provides provenance information so that it is clear how every result entry has been produced. This allows you to refine how the data is produced without the need for reprogramming.

Find Out More

Find out more

Read more about our approach to data preparation.

Try for yourself

A free download of Data Preparer is available, so that you can try it for yourself on your own data.

Stay in touch

Sign up for our newsletter for additional information on data wrangling and our offerings.
