Tomas Knap

Industry

UnifiedViews in COMSODE pilot projects

The advent of Linked Data accelerates the evolution of the Web into an exponentially growing information space in which the unprecedented volume of data offers information consumers a level of information integration that has not been possible until now. There are plenty of tools for dealing with RDF data: tools to extract, transform, cleanse, and link it. Nevertheless, managing RDF data transformation processes has been difficult, error-prone, and confusing, because such processes consist of various components (tools) that have different configurations, are configured and executed differently, and employ different data flow patterns. To address these problems, we developed UnifiedViews (http://unifiedviews.eu), an Extract-Transform-Load (ETL) framework that allows users to define, execute, monitor, debug, schedule, and share ETL data processing tasks. UnifiedViews differs from other ETL frameworks by natively supporting RDF data and ontologies. UnifiedViews provides a set of plugins for working with RDF data, and new custom plugins can be easily created.

The first version of UnifiedViews was developed two years ago as a student project at Charles University in Prague; since then, it has been used and extended in various projects that transform and publish RDF data. One of them is COMSODE (http://www.comsode.eu), an EU FP7 project finishing this year. The goals of the COMSODE project are to 1) prepare a publication platform (called Open Data Node) for publishing data as (linked) open data, 2) provide a methodology for publishing (linked) open data with Open Data Node, and 3) transform and publish 150 datasets from various EU countries and institutions using the Open Data Node platform. UnifiedViews is a vital part of the Open Data Node platform: it handles the transformation and publishing of datasets as (linked) open data. UnifiedViews was also further improved during the COMSODE project, and in March 2015 UnifiedViews 2.x was introduced.

In my talk, I would like to introduce two pilot projects we started as part of the COMSODE project: one with the Slovak Environment Agency (SEA, http://www.sazp.sk/public/index/index.php?lang=en) and one with the Czech Trade Inspection Authority (CTIA, http://www.coi.cz/en/). The goal of each pilot was to demonstrate the capability of UnifiedViews (and Open Data Node as a whole) to transform and publish selected datasets. For SEA, we covered several of the datasets that SEA publishes to comply with the European Union's INSPIRE directive, including data on protected sites, species distribution, bio-geographical regions, and land cover, plus an additional dataset on contaminated sites registered as environmental burdens. For CTIA, we opened datasets with the results of company inspections, the list of bans and penalties for companies, and the list of confiscations for companies that supply or sell goods, provide services, or operate marketplaces. For each pilot, I will present 1) the situation in the organisation and its motivation to publish selected datasets as (linked) open data, 2) the approach we use to transform and publish its data, 3) the benefits the organisation gets from using UnifiedViews (and Open Data Node), and 4) lessons learned from using UnifiedViews (and Open Data Node) during the pilot projects.

I will prepare an online demo of UnifiedViews, in which I will focus on the new features of UnifiedViews 2.x. I will also explain the role of UnifiedViews in the Open Data Node platform.

CV

Dr. Tomáš Knap is a researcher at the Department of Software Engineering, Charles University in Prague, and a software architect at EEA s.r.o. His research focuses on Linked Data management. He received his Ph.D. in Computer Science in 2013 from Charles University in Prague, Czech Republic. He has published more than 15 papers and is a founding member of the OpenData.cz initiative. He participated in the EU FP7 projects LOD2 and COMSODE.

Linked Data Consultant