The Power of Linked Open Data
It was my pleasure to contribute to “Linked Open Data: The Essentials – a quick start guide for decision makers” which has been published recently by Semantic Web Company and REEEP.
Here is an excerpt of my article; a link to download the full publication as a PDF is at the bottom of this entry:
The Power of Linked Open Data – Understanding the World Wide Web Consortium’s (W3C) vision of a new web of data
Imagine that the web is like a giant global database. You want to build a new application that shows the correlation between economic growth, renewable energy consumption, mortality rates and public spending on education. You also want to improve user experience with mechanisms like faceted browsing. You can already do all of this today, but you probably won’t. Today’s methods for integrating information from different sources, otherwise known as mashing up data, are often too time-consuming and too costly.
Two factors cause this unpleasant situation. First of all, databases are still seen as “silos”, and people often do not want others to touch the database for which they are responsible. This way of thinking is based on assumptions from the 1970s: that only a handful of experts are able to deal with databases and that only the IT department’s inner circle is able to understand the schema and the meaning of the data. This is obsolete. In today’s internet age, millions of developers are able to build valuable applications whenever they get interesting data. Secondly, data is still locked up in specific applications. The technical problem with today’s most common information architecture is that metadata and schema information are not separated well from application logic, so data cannot be re-used as easily as it should be. Someone who designs a database often already has in mind the specific application to be built on top of it.
If we stop emphasising which applications will use our data and focus instead on a meaningful description of the data itself, we will gain more momentum in the long run.
At its core, Open Data means that the data is open to any kind of application, and this can be achieved if we use open standards like RDF to describe metadata.
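To make this more tangible, here is a minimal sketch of what describing a dataset independently of any application could look like. It uses only the Python standard library and writes N-Triples by hand. The dataset and publisher URIs under example.org are hypothetical; the Dublin Core terms are a real, widely used vocabulary. The helper names (`uri`, `literal`, `to_ntriples`) are my own for this illustration, not part of any library.

```python
# Minimal sketch: describe a dataset as RDF triples, independent of any application.
# All example.org URIs are hypothetical; the Dublin Core terms are real.

DCT = "http://purl.org/dc/terms/"


def uri(value):
    """Wrap a URI in angle brackets, as N-Triples requires."""
    return f"<{value}>"


def literal(value, lang=None):
    """Build a plain or language-tagged literal, escaping backslashes and quotes."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"@{lang}' if lang else f'"{escaped}"'


def to_ntriples(triples):
    """Serialize (subject, predicate, object) tuples, one statement per line."""
    return "\n".join(f"{s} {p} {o} ." for s, p, o in triples)


dataset = uri("http://example.org/dataset/renewable-energy")
triples = [
    (dataset, uri(DCT + "title"), literal("Renewable energy consumption", "en")),
    (dataset, uri(DCT + "publisher"), uri("http://example.org/org/statistics-office")),
    (dataset, uri(DCT + "license"), uri("http://creativecommons.org/licenses/by/3.0/")),
]

ntriples = to_ntriples(triples)
print(ntriples)
```

Any application that understands RDF can read these statements without knowing anything about the system that produced them, which is the point of separating the description of the data from application logic.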
Linked Data?
Nowadays, the idea of linking web pages by using hyperlinks is obvious, but this was a groundbreaking concept 20 years ago. We are in a similar situation today since many organizations do not understand the idea of publishing data on the web, let alone why data on the web should be linked. The evolution of the web can be seen as follows:
Although the idea of Linked Open Data (LOD) has yet to become mainstream (like the web we all know today), a lot of LOD is already available. The so-called LOD cloud covers an estimated 50 billion facts or more from many different domains like geography, media, biology, chemistry, economy, energy, etc. The data is of varying quality, and most of it can also be re-used for commercial purposes.
Using DBpedia to generate SKOS thesauri
In recent years, we have constantly discussed the application of thesauri and other knowledge models to improve search. Many people understand that thesaurus-based search is in many cases better than search algorithms based purely on statistics. Of course, the main objection has always been that “the costs are too high to establish a good-enough thesaurus or even a high-quality one”.
Imagine you could generate any thesaurus you like, for nearly any knowledge domain you can think of, in fairly good quality!
Sounds impossible? Does it remind you of all the promises made by text mining software which generates “semantic nets” from scratch?
Here at the Semantic Web Company we have been working on SKOSsy for a while. I will explain what this web service can do for you:
SKOSsy generates SKOS-based thesauri in German or in English for a domain you are interested in. SKOSsy extracts data from DBpedia, so it can cover anything which is in DBpedia. Thus, SKOSsy works well whenever a first seed thesaurus should be generated for a certain organisation or project. If you load the automatically generated thesaurus into an editor like PoolParty Thesaurus Manager (PPT), you can start to enrich the knowledge model with additional concepts, relations and links to other LOD sources. Either way, you don’t have to start your thesaurus project from scratch.
With SKOSsy in place, custom-tailored text extractors can be produced with little effort. To sum up,
- SKOSsy makes heavy use of Linked Data sources, especially DBpedia
- SKOSsy can generate SKOS thesauri for virtually any domain within a few minutes
- Such thesauri can be improved, curated and extended to one’s individual needs, but they usually serve as “good-enough” knowledge models for any semantic search application you like
- Semantic search based on SKOSsy thesauri usually outperforms search algorithms based on statistics, since the thesauri contain high-quality information about relations, labels and disambiguation
- SKOSsy works perfectly together with the PoolParty product family
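To give a rough idea of how a seed thesaurus can be derived from category data of the kind DBpedia publishes, here is a small sketch. It is not SKOSsy’s actual implementation: the `BROADER` dictionary is hypothetical sample data standing in for category links that a real service would fetch from DBpedia, the function names are invented for this illustration, and the example.org base URI is a placeholder. The sketch walks down from a seed category to a limited depth and writes the result as SKOS in Turtle.

```python
from collections import deque

# Hypothetical sample standing in for DBpedia category data:
# child category -> parent categories (skos:broader links).
BROADER = {
    "Solar_energy": ["Renewable_energy"],
    "Wind_power": ["Renewable_energy"],
    "Photovoltaics": ["Solar_energy"],
    "Wind_farms": ["Wind_power"],
    "Coal": ["Fossil_fuels"],
}


def collect_concepts(seed, broader, max_depth=2):
    """Breadth-first walk from the seed towards narrower categories.

    Returns a dict mapping each concept to its broader concept
    (None for the seed itself).
    """
    narrower = {}
    for child, parents in broader.items():
        for parent in parents:
            narrower.setdefault(parent, []).append(child)

    found = {seed: None}
    queue = deque([(seed, 0)])
    while queue:
        concept, depth = queue.popleft()
        if depth == max_depth:
            continue  # do not descend below the depth limit
        for child in sorted(narrower.get(concept, [])):
            if child not in found:
                found[child] = concept
                queue.append((child, depth + 1))
    return found


def to_skos_turtle(concepts, base="http://example.org/thesaurus/"):
    """Serialize the collected concepts as a small SKOS thesaurus in Turtle."""
    lines = ["@prefix skos: <http://www.w3.org/2004/02/skos/core#> .", ""]
    for concept, parent in concepts.items():
        label = concept.replace("_", " ")
        lines.append(f"<{base}{concept}> a skos:Concept ;")
        if parent is not None:
            lines.append(f"    skos:broader <{base}{parent}> ;")
        lines.append(f'    skos:prefLabel "{label}"@en .')
        lines.append("")
    return "\n".join(lines)


concepts = collect_concepts("Renewable_energy", BROADER)
turtle = to_skos_turtle(concepts)
print(turtle)
```

The depth limit keeps the seed thesaurus focused on the chosen domain; unrelated categories such as the fossil fuel branch in the sample data are never reached. The resulting Turtle could then be loaded into a thesaurus editor for curation.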
Which domains are you interested in? Give it a try!
I-Semantics 2011: Programme of Industry Track
The iPraxis Track of this year’s I-Semantics / I-Know Conference offers a unique forum for enterprise architects, information professionals and all practitioners: experienced speakers will present a comprehensive overview of real-world applications in the field of semantic technologies and demonstrate the power of semantic systems!
- Horst Baumgarten, Roche Diagnostics GmbH (D): A “Super-Glossary” for Roche
- Andreas Blumauer, Semantic Web Company (A): Thesaurus based Enterprise Search – Two Show Cases
- Christian Fillies, SemTalk (D): Linked Data Interface, Semantics and a T-Box Triple Store for Microsoft SharePoint
- Georg Güntner, Salzburg New Media Lab (A): Linked Media Services: A Linked Data Approach for Enterprise Information Integration
- Peter Haase, fluid Operations (D): Enterprise Linked Data Applications with the Information Workbench
- Alexander Oelling, sones (D): Create and Uncover Semantic Relationships
- Vincenzo Pallotta, InterAnalytics (CH): Interaction Mining: making pragmatic sense of Internet conversations
- Bernhard Schandl, Gnowsis (A): Refinder: A Semantic Collaborative Productivity Tool
- Thomas Schandl, Semantic Web Company (A) / Bernhard Schandl, Gnowsis (A) / Stefan Wunder, Neurovation (A): A LASSO for Linked Data – 3 Use Cases to improve personal knowledge management, document management and creativity techniques by leveraging Linked Data
Open Data for Enterprises? – My short review of OKCon 2011, Berlin
I went to Berlin last week to attend the Open Knowledge Conference 2011 (OKCon). The event was nice – I met great people, and the “Kalkscheune” is a typical location for new media events nowadays: a place where “old industry” once operated, now “conquered” by digital citizens. But OKCon is not a typical event for people living in the digital society – it’s rather avant-garde. At OKCon, visionaries (like Brewster Kahle) meet and discuss how the internet will or should transform society into an open society within the next few years. Of course, not everybody shares the same vision of an open society; some business people, for example, might think that this is too much innovation and close up their firewall, their intranet and finally their ears.
The Open Data movement is a great thing, but it will only be sustainable if enterprises start to build their business on top of it. I gave a talk at OKCon which covered questions like: “How do you talk with business people when questions come up like ‘Open Data – what is the value for us?’” My basic idea is to start by using “Open Data Mechanisms” internally: as a first step, companies could establish “Open Data Services” enterprise-wide but internally. Only in subsequent steps will open data be consumed from the web and finally published on the web. But first of all:
“Use Open Data in Enterprises as an internal mechanism to distribute data within the company”
As a consequence, companies will look for standards-based solutions, and voilà: the W3C offers a whole range of such standards under the names “Semantic Web” and “Linked Data”. Have a look at my slides:
Do Controlled Vocabularies Matter?
The results of our survey about controlled vocabularies are published on Issuu.
The most interesting part of the results for me in a few words:
In August 2009 the W3C announced the new SKOS standard – developed by the SWDWG – for bridging between the world of knowledge organization systems and the linked data community. Now, nearly two years later, it looks like this standard has become well established. 48.7% of respondents stated that standards like SKOS are very important and 29.1% voted for “relevant”.
We also asked for other standards that are important for the participants’ daily work. From 250 nominations we decided to focus on those that had a score of at least 7. The big three are OWL, RDF and Dublin Core. Others were a variety of ISO terminologies, RSS and SPIN. It is remarkable in this context that SPARQL received only 5 mentions.
A couple of additional insights:
- most of our participants have a clear awareness of controlled vocabularies and 85.4% are using them in their organization
- the bigger the organization, the longer controlled vocabularies have been in use
- taxonomies and ontologies seem to be the preferred knowledge models
- semantic search, data integration and structure for content navigation are the main application areas for controlled vocabularies
- application areas like recommender systems, autocomplete suggestions and support for multilingual search are not seen as very relevant
- linked data is valued very positively as a future topic
- thesauri will support search engines in the near future to improve search results
- experience with controlled vocabularies varies considerably across industries
- there is a high awareness of standards like SKOS; the web paradigm has entered the world of controlled vocabularies
- controlled vocabularies are no longer confined to academia; they have also arrived in the enterprise
Enjoy reading the full report!
Open Data Thesaurus
I’m happy to announce the start of the first “Open Data Thesaurus”. The Open Data Thesaurus is a collection of key concepts and entities, their definitions and semantic links. Following the principle of “eat your own dog food”, this thesaurus is available in machine-readable form based on open W3C standards and under a Creative Commons license. The thesaurus can thus be integrated into other applications, for example to create mashups or to index documents. The thesaurus is available in English and German.
The thesaurus is maintained and expanded constantly by the Semantic Web Company in cooperation with Open Knowledge Forum Austria.
We warmly invite you to co-manage this thesaurus. Please contact Andreas (a.blumauer@semantic-web.at) to sign up.
The “Open Data Thesaurus” serves as an entry point into the discussion around the topic of “open data”: it offers a collection of key concepts and organisations, their definitions, semantic links and links to further resources. Following the principle of “eat your own dog food”, the thesaurus is available under a Creative Commons license and in machine-readable form based on open W3C standards.
The thesaurus can thus be integrated into other applications, for example to create mashups or to index documents.
The Open Data Thesaurus is maintained and continuously expanded by the Semantic Web Company in cooperation with OGD Austria. The thesaurus is available in English and German.
“Especially while a topic is still establishing itself, thesauri are a valuable resource, because misunderstandings can be resolved more quickly and clear anchor points can emerge more quickly around concept definitions that are often still fuzzy.”
Linked Data for the Masses
The Leipziger Semantic Web Days 2011 take place today and tomorrow, and I like this year’s motto: “Linked Data for the Masses”. I think it’s time to dispel the myth that the “semantic web will never become reality”.
Thousands of people, including myself, have been working on the development of the semantic web in recent years. To give a few examples of applications and companies that use this mature technology stack in 2011 for various purposes, I have prepared a keynote talk, which I will give tomorrow in Leipzig:
Short screencasts are another way to make the power of linked data comprehensible to newcomers. Take a look at the latest video by the PoolParty team about semantic search:
Some videos, mind maps and podcasts about the Social Semantic Web
In 2010 I gave some talks about hot topics around the Social Semantic Web, and some of them were recorded and streamed over the web (two of them are in German, three in English). All of them were exciting events for me. Again, I learned a lot when preparing the presentations and when listening to the other presenters, whom I met at the following events:
April 2010 – Digitalks #13 Semantic Web
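Auf der Suche nach der besseren Suche: Wie semantische Suche und Semantic Web unsere Arbeitsweisen verändert (In search of better search: how semantic search and the Semantic Web are changing the way we work)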
August 2010 – GLOBArt 2010
Erwartungen an ein ZukunftsWeb (Expectations of a future web)
http://www.ustream.tv/recorded/9039123
September 2010 – ISKO UK Linked Data
PoolParty: SKOS Thesaurus Management utilizing Linked Data
November 2010 – TEDxVienna
Open or closed? Intranets in the Age of the Internet
December 2010 – European Semantic Technology Conference 2010
The role of SKOS in a Web of Data – some business use cases
SKOS Thesaurus Management & Linked Data: Join the upcoming PoolParty Demo Session
The new PoolParty Release 2.8 is available now and offers many new features and improvements:
- Import and export subtrees and concept schemes
- Create sub-properties for relations
- Add notes to concepts (Change/Editorial/History notes)

For an overview of all changes made in Release 2.8, read the Release Notes. The brand-new Quick Start Guide, which covers the main aspects of PoolParty in 30 minutes, is also available.
Try it out with a demo account, or join our next webinar on September 30 to get a deeper insight.
I‐SEMANTICS 2010 Conference Program
This year’s I-Semantics offers a lot of great talks around linked data, semantic search and other hot topics around the big question “how to make the web and applications a little bit smarter once again”. And like every year, it’s the place to go (at least in Central Europe) to meet other people interested and working in the area of the semantic web. Social events at I-Semantics are really cool.

