Self-Service Data Preparation with Data Preparer

Data Preparer explores how data sources relate to each other and to a target table, then populates the target from the sources. In Data Preparer, you describe what you need, not how it should be produced, so there is no need to script or hand-craft workflows. Here, Data Preparer is illustrated by integrating London open government data sets.

You can see Data Preparer in action in the videos on our YouTube channel, watch the video on creating a data wrangling scenario in Data Preparer, or read on for the steps involved in wrangling data.


Creating a Wrangling Scenario in Data Preparer

Defining a target for data wrangling

Step 1: Define the target

Describe the table, and its attributes, that is to be populated with data from several different sources. In this example, the target holds data about schools, with a focus on the social factors that influence attainment.
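As an illustration only, a target of this kind can be thought of as a named table with typed attributes. The sketch below uses plain Python rather than anything from Data Preparer itself, which collects this information through its own interface. The table name, attribute names, and types are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Attribute:
    """One column of the target table (hypothetical structure)."""
    name: str
    dtype: type


@dataclass
class TargetTable:
    """A target table described by its name and its attributes."""
    name: str
    attributes: list[Attribute] = field(default_factory=list)

    def attribute_names(self) -> list[str]:
        return [a.name for a in self.attributes]


# Hypothetical target: these attribute names are assumptions made for
# illustration, not the attributes used in the original scenario.
school_target = TargetTable(
    name="school_attainment",
    attributes=[
        Attribute("school_name", str),
        Attribute("postcode", str),
        Attribute("attainment_score", float),
        Attribute("deprivation_index", float),
    ],
)
```

The point of the sketch is that the target states only what is wanted (a table and its attributes); it says nothing about which sources supply each value.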

Step 2: Identify the sources

Identify source files or database tables that together can contribute to the population of the target. In this example, several (not always relevant) data sets are available.

Define the data context for the data to be wrangled.

Step 3: Define the data context

Identify example data or reference data sets that align with target attributes. In this example, we have access to information on the areas of interest.

State preferences to guide data wrangling

Step 4: State preferences

Specify the data quality properties that you would most like the target to satisfy. In this example, the goal is to maximise the completeness of the result table as a whole and of specific attributes.
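To make the notion of completeness concrete, one common definition is the fraction of values that are not missing, measured either for a single attribute or across the whole table. The sketch below assumes that definition; how Data Preparer itself measures completeness is not stated here and may differ. The function names and the sample rows are hypothetical.

```python
from typing import Optional, Sequence

# A row maps attribute names to values; None marks a missing value.
Row = dict[str, Optional[object]]


def attribute_completeness(rows: Sequence[Row], attribute: str) -> float:
    """Fraction of rows in which the given attribute is not missing."""
    if not rows:
        return 0.0
    filled = sum(1 for row in rows if row.get(attribute) is not None)
    return filled / len(rows)


def table_completeness(rows: Sequence[Row], attributes: Sequence[str]) -> float:
    """Fraction of all (row, attribute) cells that are not missing."""
    total = len(rows) * len(attributes)
    if total == 0:
        return 0.0
    filled = sum(
        1 for row in rows for a in attributes if row.get(a) is not None
    )
    return filled / total


# Hypothetical sample data, for illustration only.
rows = [
    {"school_name": "A", "attainment_score": 0.5},
    {"school_name": "B", "attainment_score": None},
]
```

With these two rows, the completeness of attainment_score alone is 0.5, while the completeness of the table over both attributes is 0.75, which shows why a preference can target the whole table, individual attributes, or both.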

The result of data wrangling

Step 5: Wrangle and view the result

Press Wrangle, and the target is populated with an end product derived from the sources.

Provenance of the wrangling result.

Step 6: Examine and refine

Examine the result and how it was produced. If it is not as required, change the preferences, give feedback, or otherwise steer the wrangling process.

How different was that?