At Spinque we like not only Open Data but also art. What could be better than combining them? In this post we illustrate how to use open data to enrich access to the art collection of the Rijksmuseum.
The Rijksmuseum in Amsterdam is one of the most famous museums in the world. The amazing collection of the museum is publicly available. You can search the collection at the Rijksmuseum website and you can access the data through their API. The data contains artworks with descriptions (or annotations) provided by the cataloguers of the museum. As at many museums and other cultural heritage institutions, these annotations cover the basic object characteristics such as the creator, date and material. In addition, cataloguers have described what is depicted in the artworks: the subject matter.
We did a little project at Spinque to explore different strategies for searching the Rijksmuseum collection. We started with an RDF representation of the artwork collection and the thesauri that are used to describe the artworks. We first demonstrate how to make a basic search engine on top of this data. Next we integrate additional Open Data sets to enrich the search experience. We improve the ranking, enable multilingual search, and provide recommendations of related artists and artworks.
Searching RDF
Building a basic search engine for the Rijksmuseum collection is straightforward. If you want to know how to do this with Spinque’s search by strategy technology, take a look at the screencast. The basic search strategy is available in the search engine at our demo page. Select the strategy from the drop-down list at the top left; it is named artwork search. Try to find the famous painting by Vermeer by searching with the Dutch title “melkmeisje”.
The ranking is probably not what you expected. Why do we not get the famous masterpiece by Vermeer as the first result? The problem is that the textual features in the collection do not predict whether an artwork is famous or not. Using the facets to filter on the creator Johannes Vermeer solves the problem for this query, but the point is that the public will expect the masterpieces on top without any extra filtering. In this case there is a simple solution. The Rijksmuseum has explicitly annotated the famous artworks. With Spinque we can include this information in the search strategy.
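To make the idea concrete, here is a minimal sketch in Python of how a curated "famous artwork" annotation could be combined with a text score. It is not Spinque's strategy: the `Artwork` class, the `is_top_piece` flag, the scoring function and the sample records (apart from Vermeer's title) are all assumptions made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Artwork:
    title: str
    creator: str
    # Hypothetical flag standing in for the museum's "famous artwork" annotation.
    is_top_piece: bool = False


def text_score(query, artwork):
    """Toy text feature: fraction of query terms that occur in the title."""
    terms = query.lower().split()
    if not terms:
        return 0.0
    title_terms = artwork.title.lower().split()
    return sum(term in title_terms for term in terms) / len(terms)


def ranked_search(query, collection, boost=1.0):
    """Rank matching artworks by text score, adding `boost` for annotated top pieces."""
    scored = []
    for artwork in collection:
        score = text_score(query, artwork)
        if score > 0:
            if artwork.is_top_piece:
                score += boost
            scored.append((score, artwork))
    # The sort is stable, so ties keep their original collection order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [artwork for _, artwork in scored]


# Illustrative records only; the non-Vermeer entries are invented.
collection = [
    Artwork("Het melkmeisje", "Anonymous printmaker"),
    Artwork("Het melkmeisje", "Johannes Vermeer", is_top_piece=True),
    Artwork("Melkmeisje met koe", "Unknown"),
]

text_only = ranked_search("melkmeisje", collection, boost=0.0)
with_boost = ranked_search("melkmeisje", collection)
```

With `boost=0.0` all three records tie on the text score, so the Vermeer is not guaranteed to come first; with the boost it moves to the top. A real strategy would likely weigh the annotation against the text evidence more carefully than a fixed additive boost.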
In the search application at the demo page you can try the strategy with the improved ranking. The strategy is named artwork_search_top1000. Click the change button and select this strategy. Again try the query “melkmeisje”. Better?
Including DBPedia
The terms that are used by the cataloguers to annotate the artworks are maintained by the museum in their internal vocabularies (or thesauri). The vocabularies include names of artists and historical persons, geographical locations, art-specific concepts such as materials and content-specific concepts such as historical events. While some of these vocabularies are quite large, the information about the entities and concepts is rather sparse. For example, the materials are only available in Dutch, and for most artists only the basic biographical information is known.
The Web contains several open data sources that can complement this information. Wikipedia (or DBPedia) contains information about well-known artists such as Vermeer, including relations to other artists. Another relevant source is the Art & Architecture Thesaurus from the Getty Research Institute.
To make this open data available when searching the Rijksmuseum collection we first need to integrate it. One approach is to follow the principles of Linked Data and create links between the objects in the Rijksmuseum collection and DBPedia. If you are interested in this approach have a look at Amalgame and the SILK link discovery framework.
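As a rough illustration of what link creation involves, the sketch below matches artist names by normalized label. It is a deliberately naive stand-in, not how Amalgame or Silk work; the identifiers (`rm:...`, `dbpedia:...`) and the function names are hypothetical.

```python
import unicodedata


def normalize(name):
    """Turn "Vermeer, Johannes" into "johannes vermeer" and strip accents."""
    if "," in name:
        last, first = name.split(",", 1)
        name = f"{first} {last}"
    decomposed = unicodedata.normalize("NFKD", name)
    without_accents = "".join(c for c in decomposed if not unicodedata.combining(c))
    return " ".join(without_accents.lower().split())


def link_by_label(local_labels, external_labels):
    """Return (local_id, external_id) pairs whose normalized labels match exactly."""
    index = {}
    for external_id, label in external_labels.items():
        index.setdefault(normalize(label), []).append(external_id)
    links = []
    for local_id, label in local_labels.items():
        candidates = index.get(normalize(label), [])
        if len(candidates) == 1:  # keep unambiguous matches only
            links.append((local_id, candidates[0]))
    return links


# Hypothetical identifiers and labels, for illustration only.
museum_artists = {
    "rm:vermeer": "Vermeer, Johannes",
    "rm:metsu": "Metsu, Gabriël",
    "rm:anonymous": "Anoniem",
}
dbpedia_artists = {
    "dbpedia:Johannes_Vermeer": "Johannes Vermeer",
    "dbpedia:Gabriel_Metsu": "Gabriel Metsu",
}

links = link_by_label(museum_artists, dbpedia_artists)
```

Exact label matching breaks down quickly on name variants and homonyms, which is why dedicated alignment tools compare additional properties such as birth and death dates.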
Once we have links between the Rijksmuseum data and DBPedia, we can prioritize the artworks that are found there (that is, the artworks that are described on Wikipedia). This is an alternative (or maybe a complementary) approach to get the masterpieces on top.
At the demo page you can try the search strategy artwork_search_dbpedia. Searching for “melkmeisje” should give you the same result as before. But DBPedia also contains the title of the artwork in other languages. Now you can also search in English for “milkmaid”, in Spanish for “La Lechera”, in Polish for “Mleczarka”, and in other languages. Quite handy for a museum with an international audience.
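The mechanism can be sketched as indexing each artwork under the labels of its linked DBPedia resource as well as its own title. The following Python example assumes the link already exists; the identifiers and function names are hypothetical, and the labels are the ones mentioned above.

```python
from collections import defaultdict


def build_index(artworks, links, external_labels):
    """Index each artwork under its own title and the labels of its linked resource."""
    index = defaultdict(set)
    for artwork_id, title in artworks.items():
        labels = [title]
        linked = links.get(artwork_id)
        if linked is not None:
            labels.extend(external_labels.get(linked, {}).values())
        for label in labels:
            for term in label.lower().split():
                index[term].add(artwork_id)
    return index


def multilingual_search(index, query):
    """Return the artworks that match every term of the query."""
    term_sets = [index.get(term, set()) for term in query.lower().split()]
    if not term_sets:
        return set()
    return set.intersection(*term_sets)


# Hypothetical identifiers; the labels are the titles quoted in the text.
artworks = {"rm:melkmeisje": "Het melkmeisje"}
links = {"rm:melkmeisje": "dbpedia:The_Milkmaid"}
external_labels = {
    "dbpedia:The_Milkmaid": {"en": "The Milkmaid", "es": "La lechera", "pl": "Mleczarka"},
}

index = build_index(artworks, links, external_labels)
```

Calling `multilingual_search(index, "la lechera")` then finds the same artwork as the Dutch query, without any translation step at query time.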
Searching multiple result types
In the previous examples we searched for artworks. Sometimes the object of interest is something else, for example the artist (e.g. Vermeer). The search strategy aggregated_search illustrates how different result types can be integrated in a single result list. Search for “vermeer” to see an example of a result list containing artworks as well as artists.
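One simple way to picture such a combined list is to score each result type separately, rescale the scores so they are comparable, and merge. The Python sketch below does exactly that; it is an assumption about how aggregation could work, not a description of the aggregated_search strategy, and the scores are invented.

```python
def normalize_scores(results):
    """Rescale (item, score) pairs so the best score within one type becomes 1.0."""
    if not results:
        return []
    top = max(score for _, score in results)
    if top <= 0:
        return [(item, 0.0) for item, _ in results]
    return [(item, score / top) for item, score in results]


def aggregate(results_by_type):
    """Merge per-type result lists into one list of (type, item, score) tuples."""
    merged = []
    for result_type, results in results_by_type.items():
        for item, score in normalize_scores(results):
            merged.append((result_type, item, score))
    # Stable sort: equal scores keep the order in which the types were given.
    merged.sort(key=lambda row: row[2], reverse=True)
    return merged


# Invented raw scores for the query "vermeer", for illustration only.
results_by_type = {
    "artist": [("Vermeer, Johannes", 4.0)],
    "artwork": [("Het melkmeisje", 12.0), ("Gezicht op huizen in Delft", 6.0)],
}

merged = aggregate(results_by_type)
```

Normalizing per type keeps one result type from crowding out the other merely because its raw scores happen to be on a larger scale.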
Related artists and artworks
We can also use the rich information contained in DBPedia for recommendation. Select the result Vermeer, Johannes by clicking the + icon at the bottom right corner. You see more information about this artist on the right side of the application. The descriptive text about Vermeer comes from the Rijksmuseum thesaurus, and the list of artworks created by Vermeer is also taken from the Rijksmuseum data. The list of related artists, however, is derived from DBPedia and is not available in the Rijksmuseum data itself. Using these links we can browse to other famous Dutch artists, such as Gabriel Metsu, and explore their artworks. To find the artworks created by Gabriel Metsu we again use information from the Rijksmuseum itself. In other words, we used DBPedia only to get from one artist in the Rijksmuseum to a related artist; the artworks themselves still come from the museum's own data.
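The browsing step amounts to a short walk across two data sets: from the museum's artist to the linked DBPedia resource, along a "related" relation, and back to the museum's records. A minimal Python sketch follows, with hypothetical identifiers and a made-up `related` mapping standing in for whatever DBPedia relation is actually used.

```python
def related_artworks(artist_id, links, related, artworks_by_artist):
    """Map each related artist that is in the collection to that artist's artworks."""
    reverse_links = {external: local for local, external in links.items()}
    external_id = links.get(artist_id)
    recommendations = {}
    for other_external in related.get(external_id, []):
        other_local = reverse_links.get(other_external)
        if other_local is not None:  # skip artists the museum does not hold
            recommendations[other_local] = artworks_by_artist.get(other_local, [])
    return recommendations


# Hypothetical identifiers and sample data, for illustration only.
links = {
    "rm:vermeer": "dbpedia:Johannes_Vermeer",
    "rm:metsu": "dbpedia:Gabriel_Metsu",
}
related = {
    "dbpedia:Johannes_Vermeer": [
        "dbpedia:Gabriel_Metsu",
        "dbpedia:Artist_Not_In_Collection",
    ],
}
artworks_by_artist = {"rm:metsu": ["Het zieke kind"]}

recommendations = related_artworks("rm:vermeer", links, related, artworks_by_artist)
```

Related artists without a link back into the collection are dropped, so every recommendation leads to artworks the museum can actually show.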
Dr. Michiel Hildebrand received his PhD from the University of Amsterdam in 2010 for his research on access to Linked Data. In 2014 he joined Spinque to help apply the company's search by strategy approach to Linked Data.