As more and more Linked Open Data is published, it becomes harder for users to consume it. If you publish just one dataset, it’s pretty clear what you are offering and what users can consume. With a larger knowledge base, they have to start by learning its structure. In particular, they need to work out which dataset holds the information they are looking for. That’s why we wanted to come up with a tool that would help users explore a knowledge base visually and, more importantly, enable them to visualise its content in a traditional way. LDVMi, a part of the ODN platform, is such a tool.
To discover how LDVMi works in practice, please watch this demo.
As of today, users can search for datasets in a CKAN catalogue instance, which provides full-text search as well as keyword and faceted browsing of the datasets’ textual metadata. To decide whether a given dataset is valuable for his use case, an expert needs to find out whether it contains the expected entities and their properties. Moreover, the entities and properties may be present in the dataset but described using a different vocabulary than expected. Good documentation of datasets is rare, so the expert needs to go through the dataset manually, either by loading it into his own graph database and examining it using SPARQL queries or by looking at the RDF serialization in a text editor.
This problem inspired us to create a visualisation tool, LDVMi, which is also a part of the ODN platform. Its goal is to provide a consumer with a list of visualisations that are meaningful for a given dataset. It’s much easier for a user to point to a dataset and let the application generate a list of suitable visualisations. The user then chooses one of the options to see his data visualised by the selected technique, e.g. on a map or using a hierarchy visualiser. With this tool, one should be able to quickly sample a previously unknown dataset that looks potentially useful based on its textual description, such as the one in a CKAN catalogue. One should also be able to quickly and easily show someone what can be done with his data in RDF.
LDVMi also helps us make the CKAN catalogue even more valuable. When describing a dataset in CKAN, you can add a visualisation preview. A user browsing through your knowledge base can then preview its contents directly.
Of course, not every dataset can be easily visualised. That’s why LDVMi is based on an abstract Linked Data Visualisation Model (LDVM). LDVM is designed to deal with various situations and create as many visualisations as possible. To discover datasets that can be visualised, LDVMi uses a so-called compatibility checking principle. A simplified version of the principle is built on the fact that a visualiser (e.g. a Google Maps visualiser or a DataCube visualiser) expects the dataset to contain certain patterns (given by vocabularies). If the application finds those patterns in your data, it automatically offers you the corresponding visualisations. Compatibility checking can, in fact, help you publish your data correctly, using the most widely used vocabularies: you can update your dataset, check whether LDVMi discovers it, and inspect the resulting visualisation.
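To make the simplified compatibility check more concrete, here is a minimal sketch in Python. It is not LDVMi’s actual implementation: the function name, the visualiser names and the idea of reducing a "pattern" to a set of required predicates are all assumptions made for illustration. Only the vocabulary IRIs (WGS84 geo and the RDF Data Cube vocabulary) are real.

```python
# Illustrative only: a "pattern" is reduced here to a set of predicates that
# must all occur in the dataset. Real patterns may be richer than this.
GEO = "http://www.w3.org/2003/01/geo/wgs84_pos#"
QB = "http://purl.org/linked-data/cube#"

# Hypothetical visualiser descriptions (names and patterns are assumptions).
VISUALISER_PATTERNS = {
    "map": {GEO + "lat", GEO + "long"},
    "datacube": {QB + "dataSet"},
}


def compatible_visualisers(triples, patterns=VISUALISER_PATTERNS):
    """Return the names of visualisers whose required predicates all occur."""
    predicates = {p for (_s, p, _o) in triples}
    return sorted(
        name for name, required in patterns.items() if required <= predicates
    )


# A tiny made-up dataset, given as (subject, predicate, object) tuples.
dataset = [
    ("http://example.org/place/1", GEO + "lat", "50.08"),
    ("http://example.org/place/1", GEO + "long", "14.42"),
    ("http://example.org/place/1",
     "http://www.w3.org/2000/01/rdf-schema#label", "Prague"),
]

print(compatible_visualisers(dataset))  # ['map']
```

In this sketch the map visualiser is offered because both coordinates are present, while the DataCube visualiser is not, since the dataset contains no Data Cube predicates.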
However, this simplified version would force you to publish your data in a way that would be directly compatible with some visualiser. Even though that could be beneficial in some cases (like DataCube for statistical data), you probably shouldn’t do that in the majority of cases. You should, of course, publish your data according to the most suitable vocabulary. That’s why the full-featured compatibility checking is an iterative process that is able to assemble visualisation pipelines. Such a pipeline can combine multiple datasets and repeatedly transform them in order to prepare them for a visualisation.
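The iterative idea can be sketched as a search over transformation steps. The following Python example is a simplified illustration under stated assumptions, not the LDVM algorithm itself: the component names, the feature labels such as "schema:address" and the function find_pipelines are hypothetical, and combining several datasets is modelled simply as taking the union of their features.

```python
from collections import deque

# Hypothetical components. A transformer requires some features (for example
# vocabulary patterns) and produces new ones; a visualiser only requires them.
TRANSFORMERS = {
    "address-to-geo": ({"schema:address"}, {"geo:lat", "geo:long"}),
    "skos-to-tree": ({"skos:broader"}, {"ex:hierarchy"}),
}
VISUALISERS = {
    "map": {"geo:lat", "geo:long"},
    "hierarchy": {"ex:hierarchy"},
}


def find_pipelines(features):
    """Breadth-first search for the shortest pipeline to each visualiser."""
    start = frozenset(features)
    queue = deque([(start, [])])
    seen = {start}
    found = {}
    while queue:
        current, steps = queue.popleft()
        # Which visualisers are directly compatible with the current state?
        for name, required in VISUALISERS.items():
            if required <= current:
                found.setdefault(name, steps + [name])
        # Which transformers can run and lead to a state not seen before?
        for name, (required, produced) in TRANSFORMERS.items():
            if required <= current:
                nxt = current | produced
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append((nxt, steps + [name]))
    return found


print(find_pipelines({"schema:address"}))
# {'map': ['address-to-geo', 'map']}
```

Here a dataset that only contains addresses is not directly compatible with the map visualiser, but the search finds a pipeline that first applies the (hypothetical) geocoding transformer and then the map.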
About the authors:
- Jiri Helmich is a Ph.D. student at DSE Charles University in Prague. He has been working with Linked (Open) Data for over 4 years now. His research focuses on visualising Linked Data, and its results are used in the development of LDVMi as a part of his dissertation.
- Jakub Klímek is a researcher at DSE Charles University in Prague. He received his Ph.D. in computer science in 2013 and has published more than 35 conference and journal papers in the area of conceptual modelling and Linked Open Data.