Data Preparer in Practice
Data Preparer explores how data sources relate to each other and to a target table, and populates the target from those sources. In Data Preparer, you describe what you need, not how it should be produced, so there is no scripting or hand-crafting of workflows. Here, Data Preparer is illustrated by integrating London open government data sets, which are released under the Open Government Licence.
Step 1: Define the target
Describe the table, and its attributes, that is to be populated with data from several different sources. In this example, we are interested in data about social factors that influence attainment in schools.
Step 2: Identify the sources
Identify source files or database tables that together can contribute to populating the target. In this example, a collection of (not always relevant) data sets is available.
Step 3: Define the data context
Identify example data or reference data sets that align with target attributes. In this example, we have access to information on the areas of interest.
Step 4: State preferences
Specify the data quality properties that you would most like the target to satisfy. In this example, the goal is to maximise the completeness of the result table as a whole and of specific attributes.
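To make the completeness preference concrete, the sketch below computes completeness for a single attribute and for a whole table. It assumes a common definition (the fraction of values that are not missing), which may differ from the measure Data Preparer uses internally. The function names, attribute names and example rows are hypothetical and are not taken from Data Preparer or from the London data sets.

```python
from typing import Optional, Sequence

# A row maps attribute names to values; None or "" counts as missing.
Row = dict[str, Optional[str]]


def is_filled(value: Optional[str]) -> bool:
    """Return True when a value is present (neither None nor empty)."""
    return value is not None and value != ""


def attribute_completeness(rows: Sequence[Row], attribute: str) -> float:
    """Fraction of rows with a non-missing value for one attribute."""
    if not rows:
        return 0.0
    filled = sum(1 for row in rows if is_filled(row.get(attribute)))
    return filled / len(rows)


def table_completeness(rows: Sequence[Row], attributes: Sequence[str]) -> float:
    """Fraction of all cells (rows x attributes) that are non-missing."""
    total_cells = len(rows) * len(attributes)
    if total_cells == 0:
        return 0.0
    filled = sum(
        1 for row in rows for name in attributes if is_filled(row.get(name))
    )
    return filled / total_cells


# Hypothetical target rows, invented purely for illustration.
EXAMPLE_ROWS: list[Row] = [
    {"school": "School A", "area": "Area 1", "attainment": "0.8"},
    {"school": "School B", "area": None, "attainment": "0.6"},
    {"school": "School C", "area": "Area 2", "attainment": None},
    {"school": "School D", "area": "Area 2", "attainment": ""},
]
EXAMPLE_ATTRIBUTES = ["school", "area", "attainment"]

if __name__ == "__main__":
    for name in EXAMPLE_ATTRIBUTES:
        print(name, attribute_completeness(EXAMPLE_ROWS, name))
    print("table", table_completeness(EXAMPLE_ROWS, EXAMPLE_ATTRIBUTES))
```

Under this definition, preferring completeness for a specific attribute means favouring results in which that attribute's score is higher, even when the score for the whole table is similar.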
Step 5: Wrangle
Press Wrangle, and Data Preparer will populate the target with data from the sources.
Step 6: View result and refine
View the result. If it is not as required, change the preferences, give feedback on the result, or provide guidance on how the result has been produced.