vrijdag 29 oktober 2010

Quipu: A datavault generator

Today a post about Quipu. This is an open source generator tool. It generates a Datavault data warehouse model, including a staging environment, based on a source data model. This source data model is reverse engineered and stored in the quipu repository in a datastore as a source model. From this source model it's possible to generate a staging model in the datastore (DDL). Also the load code is generated. From the staging tables the datavault model is generated using the staging data model. It's possible to improve the staging model by identifying business keys, marking tables as reference tables or defining aditional relations between tables. From this point it's possible to generate the datavault model. Below i'll show you an example based on the AdventureWorksLT2008R2.

Installation
The installation of Quipu is a easy. I've installed the following components:
  • The Quipu back-end.
  • The Quipu front-end.
  • Java 6 (version 1.6) runtime environment.
For testing Quipu i decided to use the AdventureWorksLT2008R2 database. This is a small subset of the AdventureWorks2008R2 database and the complexity is comparable with the Northwind database.

Start start_quipu.bat file in folder (not sure why, but i think it's a webserver).

Login to Quipu
Okay let's go....Not..i had some problems with the loginscreen of Quipu. At first i got the following window when i tried to log into Quipu:


After trying to connect to the localhost i got the following error:


Network error IOEXception? That doesn't look good. I've found out that TCP/IP setting in the configuration manager of SQL Server was not properly enabled. So i enabled that! The following error occured:



I took me a while to understand this error. The server setting 'authentication' of SQL Server was 'Windows only' and Quipu doesn't allow Windows authentication. So i changed this setting to Windows and SQL authentication and i was in....

Create a connection
A connection needs to be created to a server where the database resides that will be used for creating a datavault model.

Create a datastore
The next step is creating a datastore in Quipu. A datastore is a subset of the repository on a specific connection. You can create a datastore for each user, for a certain work project, or separate datastores for source, staging and datavault schemas. The definition of a datastore is up to you, but you should have at least one datastore for each connection you want to use.



Import schemas
The following step of creating a data vault is identifying the structure of your source data to the Quipu repository. Create a connection to the AdventureWorksLT2008R2 source database and Quipu extracts the meta data directly from the source schema.


Create a staging environment
A staging area is a copy of your source with some additional meta-data. The repository will be used for creating DDL and load scripts for a database.


Generate datavault
After loading a staging schema and adjusting all schema properties accordingly there are several extra options you can choose to manipulate the structure of the datavault you want to generate.



There are some options which i'll blog later on.




Generate the code
After setting all the options, choose a target datastore for the data vault and select a name (an existing name overwrites the existing data vault schema) and push generate!


A logging screen appears which shows the decisions that Quipu has made whether a sourcetable is a hub, satellite or a link in the datavault schema.


Finally a SQL script is generated. The only thing i had to adjust was include the User defined type in the script and it was all set and go!


 The result of the script is shown belown:


Conclusion
I've to admit that Quipu is a great tool of generating SQL code that could be implemented in datawarehouse projects. In the short time i spend with the tool it didn't crash or i could find a bug, except for the modelviewer, but that is a beta version, so i assume that they will improve this in a future version.

I'll blog later about the generated datavault schema.

Greetz,
Hennie

3 opmerkingen:

  1. Hi Hennie,

    I really like your blog,
    Pleased to meet you,

    Regards,
    Amin

    BeantwoordenVerwijderen
  2. Hi Hennie,

    I'm happy to see you investigate a lot of time in Quipu. Thanks a lot!

    Probably you would also like to read this blogpost:

    http://johannesvdb.blogspot.com/

    Regards,

    Pieter

    BeantwoordenVerwijderen
  3. Hi Pieter,
    thnx for ur reply. Nice blog! i'll read your pdf!
    Gr,
    Hennie

    BeantwoordenVerwijderen