On 21-23 September 2011, I attended the Eleventh International Conference on Dublin Core and Metadata Applications, known as DC-2011 to its friends but #dcmi11 to the true elite. The National Library of the Netherlands (KB) in The Hague made a pleasant setting for the event, although it was perhaps too small. That is to say, the public portion of it did not have sufficient rooms for all the parallel sessions, so some had to be held deep in the secure area of the building. This, as you can imagine, caused headaches for delegates and hosts alike and restricted movement between sessions. In spite of this there was a friendly and lively atmosphere.
On the first day there were tutorial sessions introducing the world of Dublin Core to those less familiar with it. I was not able to attend, and I feel I missed out as people kept telling me about meerkats being behind the name for the original 15 Dublin Core elements. Or something like that.
The conference proper kicked off on the second day with Mikael Nilsson explaining that interoperability (system B understanding what system A produced) is insufficient, and what we really need is harmonization. In other words, metadata that conform to multiple specifications, and systems that can understand and integrate multiple metadata schemes. If you're familiar with RDF and application profiles, you can see where this is going.
In the following plenary session, Jae-Eun Baek used a task-based, 5W1H model to compare different archival and preservation metadata schemes. The 5W1H refers to questions that the metadata are supposed to answer about a task: who does it, why they do it, what they do it to, and so on. The model revealed how different metadata schemes concentrate on different lifecycle stages. This was followed by Kai Eckert, who explained how the Dublin Core Abstract Model needs to be extended in order to provide proper support for recording the provenance of metadata. It involves allowing Description Sets to be the subject of further Descriptions (specifically Annotations); if you know about RDF named graphs, you'll recognise the concept.
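For anyone who hasn't met named graphs, the idea of a description set becoming the subject of further descriptions can be sketched in a few lines of plain Python. This is only an illustration of the general concept, not the extension to the Abstract Model that was proposed: the `Quad` structure, the helper functions and every `ex:` identifier are my own assumptions, made up for the example.

```python
from collections import namedtuple

# A quad is an RDF-style triple plus the name of the graph that contains it.
Quad = namedtuple("Quad", ["subject", "predicate", "obj", "graph"])

# All ex: identifiers below are hypothetical and exist only for illustration.
quads = [
    # A description of a resource, held in the named graph "ex:record1".
    Quad("ex:report", "dcterms:title", "Annual Report", "ex:record1"),
    Quad("ex:report", "dcterms:creator", "ex:alice", "ex:record1"),
    # Annotations whose subject is the graph "ex:record1" itself:
    # this is where the provenance of the metadata is recorded.
    Quad("ex:record1", "dcterms:creator", "ex:cataloguer", "ex:provenance"),
    Quad("ex:record1", "dcterms:created", "2011-09-22", "ex:provenance"),
]


def statements_in(graph_name, data):
    """Return the quads stored in the given named graph."""
    return [q for q in data if q.graph == graph_name]


def provenance_of(graph_name, data):
    """Return the quads that make statements about the named graph itself."""
    return [q for q in data if q.subject == graph_name]


record = statements_in("ex:record1", quads)
provenance = provenance_of("ex:record1", quads)
```

The point to notice is that `ex:record1` appears both as a graph name and as a subject, so statements about who made the metadata and when sit alongside the metadata without being confused with it.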
The next session was all about mapping between different schemes. Gordon Dunsire argued that to get the benefit of working with Semantic Web technologies, we need to avoid translating values into different formats, and instead concentrate on mapping out the relationships between the properties themselves. Ahsan Morshed talked about how concepts in AGROVOC (an agricultural thesaurus) were mapped to other vocabularies; of particular interest was the way multiple languages were used to pin down the concepts in question. Lastly, Nuno Freire reported on efforts to transform subject headings from various schemes into sets of more specific properties (times, places, events), to make them easier for computers to work with.
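To give a flavour of what mapping the properties rather than the values might look like, here is a minimal sketch, assuming the relationship in question is a simple sub-property one (rdfs:subPropertyOf in RDF Schema terms). The `ex:` properties, the mapping table and the `expand` function are hypothetical and are not taken from any of the talks.

```python
# Hypothetical mapping: each local property is declared a sub-property of a
# more general one. The ex: names are invented for this example.
SUB_PROPERTY_OF = {
    "ex:illustrator": "dcterms:contributor",
    "ex:leadAuthor": "dcterms:creator",
}


def expand(triples, mapping):
    """Add, for each triple, the same statement under every broader property."""
    result = set(triples)
    for subject, predicate, obj in triples:
        seen = set()  # guards against a cycle in the mapping
        broader = mapping.get(predicate)
        while broader is not None and broader not in seen:
            seen.add(broader)
            result.add((subject, broader, obj))
            broader = mapping.get(broader)
    return result


triples = {
    ("ex:book", "ex:leadAuthor", "ex:alice"),
    ("ex:book", "ex:illustrator", "ex:bob"),
}
expanded = expand(triples, SUB_PROPERTY_OF)
```

The original statements and their values are left untouched. A system that only understands `dcterms:creator` can still find `ex:alice`, while one that knows the finer-grained property loses nothing.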
The afternoon saw proceedings split into project reports and Dublin Core Community and Task Group workshops. I was involved in the Science and Metadata Community workshop. Jian Qin gave an update on the work she and I are doing with DataCite to produce a Dublin Core Application Profile version of the DataCite Metadata Specification. I gave an overview of current scientific metadata schemes with the aid of some diagrams based on the scoping study I conducted a couple of years ago. The other highlight was a presentation from Michael Lauruhn and Véronique Malaisé of Elsevier on their work with linked data, including the Elsevier Merged Medical Taxonomy (EMMeT) and the Data to Semantics research project.
The talk by Emmanuelle Bermès that kicked off the final day will probably best be remembered for its cookery metaphors, especially the 'stone soup'. If you're not aware of the fable that features stone soup, think of it as a benign slippery slope: some people who weren't willing to help make soup were persuaded instead to incrementally improve boiling water (with stones in) until it became soup. If data are the ingredients, and a functional web of linked data is the soup we're after, what are the 'stones' that will catalyse the transformation from one to the other?
The third plenary session presented the experience of people working with linked data. Antoine Isaac recounted how the Europeana digital library has been making a transition from Europeana Semantic Elements to the (linked-data-friendly) Europeana Data Model, along with the design decisions they had to make and the problems they had normalizing their stock of data. Daniel Vila-Suero justified the style guidelines he and his colleagues have been working on for naming and labelling ontologies in the Multilingual Web. These are being trialled with IFLA's implementation of the FRBR model in RDF. Benjamin Zapilko talked about trying to perform statistical analysis directly through SPARQL. One of his conclusions was that it would probably be better to teach statistical packages SPARQL than to teach SPARQL statistics.
The final plenary collected some more examples of metadata usage in practice. Jörg Brunsmann gave the latest from the SHAMAN Project on handling engineering data, although of most interest to me was how he introduced the notion of Metadata Information Packages to OAIS. Mohammed Ourabah Soualah described the challenges of agreeing a common protocol for cataloguing Arabic manuscripts in Dublin Core, for a cross-search application. Finally, we had a screencast recorded by Oksana Zavalina on the different ways in which digital library collections handled collection-level metadata using the DC Collection Application Profile.
The afternoon was again a mixture of project updates and Community/Task Group meetings. The Registry Community meeting was largely taken up with discussions about the proposed requirements for a new system to manage DCMI's namespaces (and any that its Communities might want to set up). The highlight of the projects session was a paper on encoding the relationships between jazz musicians (e.g. influencedBy, mentorOf) in RDF.
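As a rough idea of what such relationships look like once they are reduced to triples, here is a small sketch in plain Python. Only the property names influencedBy and mentorOf come from the paper as I remember it; the `ex:` prefix, the placeholder musicians and the two lookup functions are assumptions of mine.

```python
# Placeholder data: the musicians and their relationships are invented,
# not taken from the paper.
triples = [
    ("ex:musicianA", "ex:influencedBy", "ex:musicianB"),
    ("ex:musicianB", "ex:mentorOf", "ex:musicianC"),
    ("ex:musicianC", "ex:influencedBy", "ex:musicianB"),
]


def objects(subject, predicate, data):
    """Everything the subject is linked to by the predicate."""
    return sorted(o for s, p, o in data if s == subject and p == predicate)


def subjects(predicate, obj, data):
    """Everything that is linked to the object by the predicate."""
    return sorted(s for s, p, o in data if p == predicate and o == obj)


influenced_by_b = subjects("ex:influencedBy", "ex:musicianB", triples)
mentored_by_b = objects("ex:musicianB", "ex:mentorOf", triples)
```

Because every relationship is a statement in its own right, asking who was influenced by a given musician is just a matter of filtering on the predicate and the object.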
The closing plenary consisted of two videos. The first was from the Free Your Metadata project, who provide guidance on using Google Refine to publish Linked Open Data. The second was an extensive and tuneful tourism advertisement for Malaysia, the host country for next year's conference.
That was my first experience of the Dublin Core conference, but with up to six parallel streams each afternoon, I can't claim to have a representative view of it. There was an entire unconference component I didn't experience at all. If there is a common theme I can pick out, it is that the technology still hasn't caught up with the demands of people working with the thornier issues of metadata. There was palpable impatience for Named Graphs to become an official part of RDF, for instance. I see a lot of potential for great work to come out of the Community meetings that form a major part of the Conference, and although I'm clearly biased, my own Community meeting was the highlight for me.