Linked Data – The Next 5 Years: From Hype to Action

Linked Data and the Semantic Web have been around for quite a while and have been hyped again and again. In the meantime, a large number of enterprises and even whole industries have adopted semantic web technologies for a variety of purposes (see, for example, the Allotrope Foundation). Gartner’s “Hype Cycle 2015 for Advanced Analytics and Data Science” has put Linked Data into the trough of disillusionment, which is another clear indicator that it is ready for takeoff.

[Image: Gartner 2015 Hype Cycle for Advanced Analytics and Data Science]

The pace of semantic web technology adoption may vary from industry to industry, but on average it has increased even faster than expected. As recently as 2012, Gartner predicted that the Semantic Web would not reach the plateau of productivity within the next 10 years; only three years later, it looks as if it will get there in 5 to 10 years.

[Image: Gartner 2012 Hype Cycle for Big Data]

Hype or not, Linked Data has entered the adoption phase. In the next 5 years we will finally see to what degree enterprises use semantic web technologies for data analytics, data integration, and knowledge discovery.

What are the main obstacles frequently mentioned by potential users? Which best practices for implementing linked data on a larger scale have already been developed? What are the ‘low-hanging fruits’, and what could a concrete action plan look like? Will the often-predicted interlinking of an open semantic web and corporate semantic webs take place? Which other technologies (from the hype cycles mentioned above) might play a crucial role as enablers for enterprise linked data? Which other (mega-)trends will influence the pace of linked data adoption, and which related organisational challenges should be expected?

Please visit Andreas Blumauer’s talk ‘Linked Data – The Next 5 Years: From Hype to Action’ at SEMANTiCS 2016 in Leipzig to get some valuable ideas for your Linked Data project!


Linked Data 2014: My expectations for the New Year

2014 is only a couple of days old. I have some expectations and visions for the new year with regard to linked data and its next evolutionary steps.

[Image: Fireworks on Canada Day]

  • Smart data will receive a lot of attention: big data is the wave on which this topic surfs.
  • Trust and provenance of data have been discussed for a while and are frequently mentioned as an important step for linked data to be accepted, especially by enterprises. W3C’s PROV ontology was just a first step in this direction; more specifications and implementations will follow this year (a minimal PROV example follows this list).
  • Automatic quality checks for several types of linked data will become a matter of course (similar to test automation in software testing). One example is qSKOS, which is provided as a web service for everyone interested in controlled vocabularies like taxonomies or thesauri (a sketch of this kind of check also follows below).
  • The LOD cloud as we know it won’t be updated anymore: the periodic updates of the LOD cloud will stop in 2014 – the image would simply be too big. Instead, individual domains will generate their own LOD clouds, each with a couple of central hubs in the middle (see also: The LOD cloud is dead, long live the trusted LOD cloud). Those connected sub-hubs will represent the overall LOD cloud in the future. DBpedia will remain in the centre.
  • Traditional database vendors will embrace RDF and SPARQL: MarkLogic Semantics and IBM’s DB2-RDF are just the beginning. It will be hard for them to match the scalability and performance that ‘traditional’ RDF database providers like OpenLink Software or Ontotext can deliver.
  • Linked Data “killer applications” will be established: automatic linking of structured and unstructured information based on RDF could become a killer application for Linked Data technologies. Take a look at two example applications in the areas of medicine and clean energy which make use of this principle: true semantic search will become possible (the two demos won’t work properly behind a firewall due to some of the software libraries they use).
  • The year of semantic web standards: the Open Government Data movement will finally arrive at the point where standards-based technologies like linked data become the obvious solution to the more or less chaotic collections of open data that have accumulated in recent years.
  • Enterprise Linked Data: More and more integrations of linked data technologies like Semantic SP into enterprise platforms like SharePoint will be available as products on the software market.
  • SEMANTICS 2014 will take place in September in Germany and will be a great event. More to come soon.
  • ISWC 2014 will take place in October at beautiful Lake Garda (Italy) and will be a great event, too.
  • I am looking forward to meeting some of you once again, and also to meeting some new linked data aficionados!
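
To make the provenance point concrete: below is a minimal sketch, in Python with rdflib, of how a single dataset can be annotated with W3C PROV terms. The dataset, source, and agent URIs are invented for illustration.

    from rdflib import Graph, Literal, Namespace
    from rdflib.namespace import RDF, XSD

    # PROV-O namespace as published by the W3C
    PROV = Namespace("http://www.w3.org/ns/prov#")
    EX = Namespace("http://example.org/")  # hypothetical namespace

    g = Graph()
    g.bind("prov", PROV)
    g.bind("ex", EX)

    # A dataset entity whose provenance we want to record
    dataset = EX.salesFigures2013
    g.add((dataset, RDF.type, PROV.Entity))

    # Where it came from, who produced it, and when
    g.add((dataset, PROV.wasDerivedFrom, EX.rawCrmExport))
    g.add((dataset, PROV.wasAttributedTo, EX.dataTeam))
    g.add((dataset, PROV.generatedAtTime,
           Literal("2014-01-07T12:00:00", datatype=XSD.dateTime)))

    print(g.serialize(format="turtle"))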
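And in the spirit of the quality-check point: the following is not qSKOS itself, just a minimal sketch of the kind of automated test such a tool performs – here, flagging concepts in a hypothetical thesaurus that lack a skos:prefLabel.

    from rdflib import Graph, Literal, Namespace
    from rdflib.namespace import RDF, SKOS

    EX = Namespace("http://example.org/thesaurus/")  # hypothetical vocabulary

    g = Graph()
    g.add((EX.linkedData, RDF.type, SKOS.Concept))
    g.add((EX.linkedData, SKOS.prefLabel, Literal("Linked Data", lang="en")))
    g.add((EX.bigData, RDF.type, SKOS.Concept))  # deliberately left unlabelled

    # A SPARQL query in the style of a "missing label" vocabulary check
    query = """
        PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
        SELECT ?concept WHERE {
            ?concept a skos:Concept .
            FILTER NOT EXISTS { ?concept skos:prefLabel ?label }
        }
    """
    for row in g.query(query):
        print("Concept without prefLabel:", row.concept)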

Why the term “Linking Open Data” might be misleading

A lot of activities around Linking Open Data (“LOD”) and the associated data sets, which are nicely visualised as a “cloud”, have been going on for quite a while now. It is exciting to see how the rather academic “Semantic Web” and all the work associated with this disruptive technology can now be transformed into real business use cases.

What I have observed in the last few months, especially in business communities, is the following:

  • “Linked Data” sounds interesting to business people because the phrase creates a lot of associations within a second or two; the database crowd also seems to be attracted by this web-based approach to data integration
  • “Web of Data” is somewhat misleading because many people think it will be a new web that replaces something else. The same goes for the “Semantic Web”
  • “Linking Open Data” sounds dangerous to many companies

For insiders it is clear that the “openness” of data, especially in commercial settings, can be controlled – and in many cases has to be. Linking can be one-way or mutual. In some use cases, data from companies will be put into the cloud and can be opened up for many purposes; in others it will stay inside corporate boundaries. In yet other scenarios, only (open) data from the web will be consumed and linked with corporate data, but no data will be exposed to the world (apart from the fact that data was consumed by an entity).

And of course: on many other occasions, datasets and repositories will be opened up partially, depending on the Creative Commons licences and the underlying privacy regulations one wants to apply.
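
To make the consume-only scenario concrete, here is a minimal sketch in Python that pulls a label from DBpedia’s public SPARQL endpoint (using the SPARQLWrapper library) and links it to an internal record; the corporate namespace and the locatedIn property are invented for illustration.

    from rdflib import Graph, Literal, Namespace, URIRef
    from rdflib.namespace import RDFS
    from SPARQLWrapper import SPARQLWrapper, JSON

    INTERNAL = Namespace("http://corp.example.com/data/")  # hypothetical

    # Consume: fetch an English label for a resource from the open LOD cloud
    endpoint = SPARQLWrapper("https://dbpedia.org/sparql")
    endpoint.setQuery("""
        SELECT ?label WHERE {
            <http://dbpedia.org/resource/Vienna> rdfs:label ?label .
            FILTER (lang(?label) = "en")
        }
    """)
    endpoint.setReturnFormat(JSON)
    label = endpoint.query().convert()["results"]["bindings"][0]["label"]["value"]

    # Link: attach the open resource to a corporate record in a private graph
    g = Graph()
    vienna = URIRef("http://dbpedia.org/resource/Vienna")
    g.add((INTERNAL.office42, INTERNAL.locatedIn, vienna))  # hypothetical property
    g.add((vienna, RDFS.label, Literal(label, lang="en")))
    # Nothing is published back to the web – the graph stays behind the firewall.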

This makes clear that LOD / Linking Open Data is just one detail of a bigger picture. Since companies (and governments) play a crucial role in developing the whole infrastructure, we need to draw a new picture:

[Image: Linked Data World]

I’ll be happy to have a lively discussion about this topic at the first Linked Data Camp in Vienna, too.

Linked Data: Standing on the shoulders of giants

When Mariano Consens explained at this year’s Triplify Challenge @ I-Semantics in Graz how he and Oktie Hassanzadeh built the winning project, the “Linked Movie Database”, one important thing became clear to me: Linked Data isn’t just a playground anymore; it’s a very efficient way to build useful applications by standing on the shoulders of giants. Congratulations to the winners!

[Image: Mariano Consens]

Become a Web Expert!

The Semantic Web has evolved constantly over the last few years. Nevertheless, I have often experienced a huge demand for profound knowledge in this area. Many potential end users of semantic web technologies have quite a few ideas about how to apply them, but many software projects will never happen – because of a lack of knowledge, and because of the fear of getting trapped by overly complex technologies. Obviously it is no longer the technology but the awareness of and personal knowledge about the semantic web that is the actual bottleneck for the semantic web becoming real.

The Semantic Web Company (SWC) is offering a training course of 12 one-day seminars for people who want to become Web experts. Participants advance step by step through methods and technologies for semantic projects. Each seminar is a stand-alone module and can also be booked individually.

The SWC curriculum will take place from 27 May to 4 July 2008. A special focus will be on applications and solutions of semantic technologies that support social processes.
The curriculum will provide profound insights into the topics of the Semantic Web and Social Software. To this end, the seminars are grouped into three comprehensive modules:

* “Next Step”: Social Web & Social Software
27–30 May 2008
* “Advanced Level”: Text Mining & Enterprise Search
10–13 June 2008
* “Expert View”: Semantic Web & Metadata Management
1–4 July 2008

The modules will take place at the Austrian Computer Society in Vienna. The main language in the courses is German. English courses can be provided on demand.

3 Semantic Apps to Watch

As mentioned on Read/WriteWeb, there are at least 10 (rather commercial) semantic web applications around that claim to use semantic web technologies for different purposes: “10 Semantic Apps to Watch”. (Besides these, at least 100 prototypes from various research programmes exist in this field.)

My “short list” of those 10 apps consists of the following three:

  1. twine
  2. Talis
  3. clearforest

In my opinion, these three projects have the highest potential to become “big players” in the next-generation web. Instead of “improving” on what Google does, they try to fulfill a totally new mission:

twine

Twine isn’t organising the “knowledge of the whole world” (as Google would like to do); rather, it focuses on the users themselves: using a semantic graph (including the social graph) for each user, information in a social network will flow more efficiently. Information will come to the users instead of them having to search around. Twine combines many of the well-known Web 2.0 applications like Facebook or del.icio.us, but will use base technologies from the semantic web and will provide a SPARQL API and a REST API.
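
As a rough illustration of what such a per-user semantic graph could look like in RDF – this is my own sketch based on the FOAF vocabulary, not Twine’s actual data model:

    from rdflib import Graph, Literal, Namespace, URIRef
    from rdflib.namespace import FOAF, RDF

    EX = Namespace("http://example.org/people/")  # hypothetical user namespace

    g = Graph()
    g.bind("foaf", FOAF)

    alice = EX.alice
    g.add((alice, RDF.type, FOAF.Person))
    g.add((alice, FOAF.name, Literal("Alice")))
    g.add((alice, FOAF.knows, EX.bob))  # the social graph ...
    g.add((alice, FOAF.interest,
           URIRef("http://dbpedia.org/resource/Semantic_Web")))  # ... plus interests

    print(g.serialize(format="turtle"))

A service like Twine could then answer questions such as “which of Alice’s contacts share her interests?” by querying graphs of this shape.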

Possible Risks:

  • It’s still not clear whether people will see personal semantic graphs as an advantage or as a potential danger to privacy
  • Semantic Web database technologies (triple stores) are still very young. Although some of the existing systems have already proved to be scalable, none of them has so far been used for really big systems.

USP:

Twine is the first company to combine social tagging, social networking, natural language processing, and the semantic web at a professional level. It therefore has the potential to become a very popular service that supports many people in their daily business. Sooner or later the same system might also be offered as a very attractive business solution. Nevertheless, Twine hasn’t opened its portal to the public so far, so it’s still not clear whether all the promises will be kept…

talis

Talis is a “domain-agnostic” technology platform which supports developers in building applications on the principles of “mass collaboration”. It is a new breed of distributed programmatic interface, heavily exploiting the opportunities the Web of Data has to offer: “DNS is used as a robust routing mechanism to connect requests with the closest data or service, both for the native platform services and for third-party data access services.”

Possible Risks:

Talis’s mission sounds great, and its success depends a lot on how well the company will be able to build an ecosystem around its services. My forecast: Talis will be acquired in 2009 by one of the big web companies.

USP:

Talis tries to establish a new way of organising information flows throughout the Web of Data. Since it relies on open standard protocols like RESTful Web Services, a lot of applications will use Talis technologies. As a company, Talis has a well-founded background, since it has provided services for governmental organisations and libraries for the last 30 years. Some of the people working at Talis rank among the best semantic web thinkers.

clearforest

ClearForest was acquired by Reuters (which in turn was bought by Thomson). ClearForest’s technology automatically categorizes documents and structures the entities contained inside text. Without text extraction algorithms that really work, the Semantic Web will never take place – and ClearForest really works. Just try it out!
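
The general pattern behind such technology – extract entities from text, then represent them as structured data – can be sketched in a few lines. The extraction step below is faked with a hard-coded result (a real engine like ClearForest’s uses sophisticated NLP), and the vocabulary is invented for illustration:

    from rdflib import Graph, Literal, Namespace
    from rdflib.namespace import RDF, RDFS

    EX = Namespace("http://example.org/extraction/")  # hypothetical vocabulary

    text = "Reuters acquired ClearForest in 2007."

    # Stand-in for a real extraction engine: (surface form, entity type) pairs
    extracted = [("Reuters", "Company"), ("ClearForest", "Company")]

    g = Graph()
    doc = EX.doc1
    for surface, etype in extracted:
        entity = EX[surface]                  # mint a URI per entity
        g.add((entity, RDF.type, EX[etype]))  # type it
        g.add((entity, RDFS.label, Literal(surface)))
        g.add((doc, EX.mentions, entity))     # link document -> entity

    print(g.serialize(format="turtle"))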

Possible Risks:

ClearForest is embedded in a giant: Thomson is the world’s largest media company. This is, of course, a great opportunity to sell this new kind of semantic solution to many of the global Top 5000. On the other hand, it might be a risk, since “traditional” media companies still tend to forget about the long tail and open APIs.

USP:

Simply put, the USP of ClearForest is that the technology works and can be integrated into existing architectures without one having to be a semantic web expert. It could become one of the cornerstones of an integrated corporate semantic web architecture.

Visiting the cradle of humankind

I spent the last few days in Addis Ababa, Ethiopia. It wasn’t a holiday trip at all, no – it was my first business trip to Africa.

The project I am working for is the “Nile Basin Initiative” (NBI). This partnership was initiated and is led by the riparian states of the Nile River (such as Egypt, Ethiopia, and Sudan, to name just a few). The goal of the NBI is to develop this big, big river in a cooperative manner: the promotion of regional peace and security and the provision of substantial socioeconomic benefits are its most important visions.

For my personal career it has always been important what I am working for, so this project is both a great challenge and a huge opportunity. Our task is to provide a web-based information system which will be the knowledge base for all participating stakeholders over the next few years. On top of our system a Decision Support System (DSS) will be implemented, which will help to better understand the river system’s behaviour and to evaluate alternative development and management schemes.

We spent two days on our initial workshop discussing the system specification, and it quickly turned out that the specialists at NBI are knowledgeable people. We focused not only on “typical” tasks like document management or full-text search but also on a well-structured metadata layer on top of standards like Dublin Core. In the end it also turned out that some semantic web methods will either be implemented in the first stage or are at least desirable for the next development phase. So a moderated search will be there in the system’s first stage, and later on an associative semantic search will be implemented.
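
For readers wondering what such a metadata layer looks like in practice, here is a minimal sketch in Python with rdflib that describes a document using Dublin Core terms; the document URI and values are invented:

    from rdflib import Graph, Literal, Namespace
    from rdflib.namespace import DCTERMS

    DOCS = Namespace("http://nbi.example.org/documents/")  # hypothetical

    g = Graph()
    g.bind("dcterms", DCTERMS)

    report = DOCS.hydrology_report_2007
    g.add((report, DCTERMS.title, Literal("Nile Basin Hydrology Report")))
    g.add((report, DCTERMS.creator, Literal("Nile Basin Initiative")))
    g.add((report, DCTERMS.subject, Literal("hydrology")))
    g.add((report, DCTERMS.spatial, Literal("Nile Basin")))

    print(g.serialize(format="turtle"))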


After the work was done I spent a day looking around Addis Ababa. I visited “Lucy”, one of the earliest known ancestors of humankind. She lived in Ethiopia 3.2 million years ago. When I saw her in the museum I thought: if Ethiopia is the cradle of humankind, why shouldn’t our project be the cradle of an African Semantic Web – one that also helps the countries in that region start to understand each other a bit better?

Yahoo Researcher Declares Semantic Web Dead – and reborn again…

When Mor Naaman from Yahoo said in a special track on Web 3.0 at WWW2007 that the “Semantic Web” is dead, he was obviously trying to attract attention. Nevertheless, in my opinion he is absolutely right – there is no chance of “teaching” people to annotate web content in a more sophisticated way than “social tagging” (and I’m pretty sure that in the future, too, it will always be a small community that tags its content).

But on one point Mor Naaman missed the mark: the “Semantic Web” was always there, more or less undercover – living in a tin with a lousy HTML lid. And inside the tin there has always been enough semantics. There is no need to re-invent the data models, the namespaces, or the ontologies (at least for most of the basic “things”), as Naaman proposes in his talk (slide 13). How easily all the existing semantics can be released and mapped onto the “Semantic Web” (and suddenly it was born again 😉 ) is demonstrated by projects like [1] or [2].
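
To make the “releasing existing semantics” point concrete: most structured sources already carry schema information that maps almost mechanically to RDF. Here is a toy sketch of that lifting step, with an invented table row and vocabulary:

    from rdflib import Graph, Literal, Namespace
    from rdflib.namespace import RDF

    EX = Namespace("http://example.org/cms/")  # hypothetical vocabulary

    # A record as it might come out of an existing database – the column
    # names already carry the semantics, they just live "in a tin"
    row = {"id": "article17", "title": "Web 3.0", "author": "M. Naaman"}

    g = Graph()
    subject = EX[row["id"]]
    g.add((subject, RDF.type, EX.Article))
    for column, value in row.items():
        if column != "id":
            g.add((subject, EX[column], Literal(value)))  # column -> predicate

    print(g.serialize(format="turtle"))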