Introducing PoolParty GraphSearch: Cognitive Search based on Graphs

What is PoolParty GraphSearch?

With PoolParty GraphSearch, companies can search over a variety of content types and business objects and analyze the data at a more granular level. All content and data repositories that are connected to GraphSearch are annotated with semantic metadata, which makes the search, recommendation and analytics operations highly precise. GraphSearch is a front-end application built on top of a semantic infrastructure and an API, providing the following features:

  • Ontology-based data access (OBDA)
  • Faceted search including hierarchies
  • Autocomplete combined with context information
  • Custom views on entity-centric and document-centric data
  • Statistical charts for the unified data repositories
  • Plug-in system for recommendation and similarity algorithms

How does it work?

Business users query knowledge assets in GraphSearch along data models. Because multiple systems can be connected to GraphSearch, their various knowledge models are additionally linked by an ontology layer.

System administrators can define which part of the ontology and corresponding entities in the taxonomy should be used in the GraphSearch front-end application. That way, they define specific views on data. They can also provide multiple search spaces within GraphSearch and enable the user to switch between them. A search space is a customized search configuration over a specific data set. The selected search facets for each search space are derived from the knowledge graph.
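
To make the idea of hierarchical facets derived from a taxonomy more concrete, here is a minimal sketch in Python. It is not the GraphSearch implementation or API: the taxonomy, the documents and all function names are hypothetical and chosen only for illustration. It assumes that each document is annotated with taxonomy concepts and that selecting a facet value should also match documents tagged with narrower concepts.

```python
from collections import Counter

# Hypothetical taxonomy: concept -> broader concept (None marks a top concept).
BROADER = {
    "Vienna": "Austria",
    "Graz": "Austria",
    "Austria": "Europe",
    "Europe": None,
}

# Hypothetical documents annotated with concepts from the taxonomy.
DOCS = {
    "doc1": {"Vienna"},
    "doc2": {"Graz"},
    "doc3": {"Europe"},
}


def ancestors_and_self(concept):
    """Return the concept together with all of its broader concepts."""
    result = []
    while concept is not None:
        result.append(concept)
        concept = BROADER.get(concept)
    return result


def expand(annotations):
    """Expand a document's annotations upwards along the hierarchy."""
    expanded = set()
    for concept in annotations:
        expanded.update(ancestors_and_self(concept))
    return expanded


def search(facet_value):
    """Return ids of documents tagged with the facet value or a narrower concept."""
    return sorted(
        doc_id for doc_id, tags in DOCS.items() if facet_value in expand(tags)
    )


def facet_counts():
    """Count documents per concept, as a facet list or chart might display them."""
    counts = Counter()
    for tags in DOCS.values():
        counts.update(expand(tags))
    return counts
```

In this sketch, `search("Austria")` finds the documents tagged with "Vienna" and "Graz" even though neither is tagged with "Austria" directly, and `facet_counts()` yields the per-concept totals that a statistical chart could plot.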

GraphSearch can be enhanced with recommendation algorithms. Some use cases call for similarity-based recommendations, while for others a matchmaking algorithm is more suitable. The research team of Semantic Web Company has a strong focus on machine learning and is continuously extending the library of machine learning algorithms in GraphSearch.
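
As an illustration of what a similarity-based recommendation can look like, the following Python sketch ranks documents by the overlap of their concept annotations (Jaccard similarity). This is an assumption made for the example, not a description of the algorithms shipped with GraphSearch; the sample annotations and function names are hypothetical.

```python
# Hypothetical documents and the concepts they are annotated with.
ANNOTATIONS = {
    "a": {"SKOS", "Thesaurus", "Search"},
    "b": {"SKOS", "Thesaurus"},
    "c": {"Search", "Charts"},
    "d": {"Recipes"},
}


def jaccard(left, right):
    """Jaccard similarity of two concept sets: |intersection| / |union|."""
    if not left and not right:
        return 0.0
    return len(left & right) / len(left | right)


def recommend(doc_id, annotations, top_n=3):
    """Return up to top_n other documents, most similar first.

    Documents that share no concept with the target are left out.
    Ties are broken alphabetically by document id.
    """
    target = annotations[doc_id]
    scored = [
        (jaccard(target, tags), other)
        for other, tags in annotations.items()
        if other != doc_id
    ]
    scored = [(score, other) for score, other in scored if score > 0]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [other for _, other in scored[:top_n]]
```

Here `recommend("a", ANNOTATIONS)` ranks "b" ahead of "c", because "b" shares two of three concepts with "a" while "c" shares only one of four, and "d" is not recommended at all.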

Data analytics functionality helps business users derive even more granular insights. Search facets can be combined into statistical charts that show which kind of data is actually available for specific topics.

Agile Data Management and Integration

The implementation of PoolParty GraphSearch is the beginning of consolidating data silos without data migration. Various functional roles have to work together in order to deliver a unified data environment. PoolParty takes the heterogeneous technical backgrounds of the involved professionals into consideration.

Specific user-friendly solutions support the whole knowledge management team in their collaborative work processes:

  • Subject matter experts can define a semantic data layer to describe the meaning of metadata in the PoolParty taxonomy management tool.
  • Knowledge engineers can link separate taxonomies and maintain the knowledge graph in the same tool.
  • Information architects and developers can link various content and data repositories with the semantic metadata via the PoolParty API.
  • Data scientists can adapt embedded machine-learning algorithms to fine-tune the search, classification, and recommendation results that are mainly derived through the knowledge graph.
  • This semi-automatic knowledge engineering approach ensures that the query results gradually become more precise and remain applicable to a continuously growing data environment.
  • On top of that, GraphSearch enables business users to search over data repositories and analyze available information.

Want to learn more?

10 Business Solutions based on Linked Data and Semantic Technologies

2013: Linked data technologies have matured, the linked data community has grown, and interest from both private industry and government is considerable.

A key question that I get asked again and again is: which business solutions can be built based on linked data? Here is ‘our’ list of 10 solutions:

The PoolParty approach for efficient knowledge modeling

The PoolParty approach for efficient knowledge modeling is based on methods from

  • text analytics and text mining
  • linked data management
  • SKOS thesaurus modeling
  • ontology engineering and
  • semantic wikis

and combines these techniques into a unique approach for creating complex knowledge models, which can then be used for all of the above-mentioned tasks, for semantic search, and for knowledge discovery in big data sets.

Using DBpedia to generate SKOS thesauri

In recent years, we have constantly discussed the application of thesauri and other knowledge models to improve search. Many people understand that thesaurus-based search is in many cases better than search algorithms based purely on statistics. Of course, the main objection has always been that “the costs are too high to establish a good-enough thesaurus or even a high-quality one”.

Imagine you could generate a thesaurus of fairly good quality for nearly any knowledge domain you can think of! Sounds impossible? Does it remind you of all the promises made by text mining software that generates “semantic nets” from scratch?

Here at the Semantic Web Company we have been working on SKOSsy for a while. I will explain what this web service can do for you:

SKOSsy generates SKOS-based thesauri in German or in English for a domain you are interested in. SKOSsy extracts data from DBpedia, so it can cover anything that is in DBpedia. Thus, SKOSsy works well whenever a first seed thesaurus should be generated for a certain organisation or project. If you load the automatically generated thesaurus into an editor like PoolParty Thesaurus Manager (PPT), you can start to enrich the knowledge model with additional concepts, relations and links to other LOD sources. But you don't have to start your thesaurus project from scratch.

With SKOSsy in place, custom-tailored text extractors can be produced with low effort. To sum up,

  • SKOSsy makes heavy use of Linked Data sources, especially DBpedia
  • SKOSsy can generate SKOS thesauri for virtually any domain within a few minutes
  • Such thesauri can be improved, curated and extended to one's individual needs, but they usually serve as “good-enough” knowledge models for any semantic search application you like
  • SKOSsy-based semantic search usually outperforms search algorithms based on statistics, since the thesauri contain high-quality information about relations, labels and disambiguation
  • SKOSsy works perfectly together with the PoolParty product family
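
The general idea of turning DBpedia data into a SKOS seed thesaurus can be sketched in a few lines of Python. This is not how SKOSsy is implemented, which the text above does not describe; it is a simplified illustration under two assumptions: that candidate concepts are the DBpedia resources filed under a given category via `dct:subject`, and that the query results arrive as (URI, label) pairs. The function names and the example scheme URI are hypothetical, and no network request is made here.

```python
def build_query(category, lang="en", limit=50):
    """Build a SPARQL query for labelled DBpedia resources in one category."""
    category_uri = f"http://dbpedia.org/resource/Category:{category}"
    return "\n".join([
        "PREFIX dct: <http://purl.org/dc/terms/>",
        "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>",
        "SELECT ?resource ?label WHERE {",
        f"  ?resource dct:subject <{category_uri}> ;",
        "            rdfs:label ?label .",
        f'  FILTER (lang(?label) = "{lang}")',
        "}",
        f"LIMIT {limit}",
    ])


def escape(text):
    """Escape backslashes and double quotes for a Turtle string literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"')


def to_skos_turtle(scheme_uri, rows, lang="en"):
    """Serialize (uri, label) rows as SKOS concepts in Turtle syntax."""
    lines = [
        "@prefix skos: <http://www.w3.org/2004/02/skos/core#> .",
        "",
        f"<{scheme_uri}> a skos:ConceptScheme .",
    ]
    for uri, label in rows:
        lines.append("")
        lines.append(f"<{uri}> a skos:Concept ;")
        lines.append(f'    skos:prefLabel "{escape(label)}"@{lang} ;')
        lines.append(f"    skos:inScheme <{scheme_uri}> .")
    return "\n".join(lines) + "\n"


# Hypothetical rows, standing in for the results of the query above.
SAMPLE_ROWS = [
    ("http://dbpedia.org/resource/Riesling", "Riesling"),
    ("http://dbpedia.org/resource/Merlot", "Merlot"),
]
```

Calling `to_skos_turtle("http://example.org/wine", SAMPLE_ROWS)` returns a small Turtle document that a SKOS editor could import as a starting point for manual curation, for example by adding broader and narrower relations.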

Which domains are you interested in? Give it a try!

My first experiences with Twine

Today I finally logged in to Twine for the first time. Yesterday I read about some shortcomings of the system, so I was keen to try it out myself and form my own impression.

It's true that the system isn't as easy to understand as del.icio.us or other bookmarking tools. It takes a while to get used to all the additional ways you can navigate through the system. Remember: “Twine looks at content and parses it automatically for the names of people, places, organizations and other subject tags. Users are then able to navigate between related content, view recommended content and connect with recommended people with related interests.” However, I can't agree with the “shortcoming” mentioned by Marshall Kirkpatrick, that “… it’s hard to keep track of all the levels and types of information available”. This is just a general problem that arises whenever semantic technologies are meant to enhance the user experience: either you stay with “simple” user interfaces like Google or del.icio.us, or you spend five minutes or so learning a new piece of software that will save you time in the future and help you find related information automatically.
On the other hand, I was very surprised that the automatic recommendations Twine makes on how to annotate or describe a new resource are really unsatisfying. Users will only spend time tagging their bookmarks if the machine comes up with some intelligent suggestions. And it's true, as Marshall says, that “most of the web is made up of ugly, non-standard pages.”

So hopefully Twine will add that feature before it opens up to the public (isn't there a plan to integrate OpenCalais or something similar?). Otherwise there will be no “first mainstream semantic web application”, just another prototype of yet another semweb app.

OpenCalais will become an essential part of the Semantic Web

Really large companies are starting to drive the semantic web forward. Reuters has recently launched a semantic web service that is free, even for commercial purposes. It helps to extract significant phrases from any unstructured text (web documents or office documents). This new service is called “OpenCalais” and is based on text-analytics solutions from ClearForest (which was acquired by Reuters in 2007). So finally a dream comes true: web content can be tagged automatically in quite high quality. Technically speaking, any unstructured text can be transformed into an RDF graph on the fly, and important phrases or even statements can be extracted from plain text.
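
What “transforming text into an RDF graph” means can be shown with a deliberately naive Python sketch. It does not use or imitate the OpenCalais API: it assumes a tiny hand-made lookup list of entity names instead of real text analytics, and every URI, the vocabulary and the function names are hypothetical.

```python
import re

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
MENTIONS = "http://example.org/vocab/mentions"  # hypothetical property

# Hypothetical lookup list: entity name -> (entity URI, entity class).
GAZETTEER = {
    "Reuters": ("http://example.org/entity/Reuters", "Organization"),
    "ClearForest": ("http://example.org/entity/ClearForest", "Organization"),
    "Vienna": ("http://example.org/entity/Vienna", "Place"),
}


def extract_triples(doc_uri, text):
    """Return (subject, predicate, object) triples for entities found in the text."""
    triples = []
    for name, (uri, kind) in sorted(GAZETTEER.items()):
        # Match the name as a whole word only.
        if re.search(rf"\b{re.escape(name)}\b", text):
            triples.append((doc_uri, MENTIONS, uri))
            triples.append((uri, RDF_TYPE, f"http://example.org/vocab/{kind}"))
    return triples


def to_ntriples(triples):
    """Serialize URI-only triples in N-Triples syntax, one statement per line."""
    return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in triples)
```

For the sentence "Reuters acquired ClearForest in 2007." this produces four statements: two saying that the document mentions an entity and two giving each entity a type. A real extraction service would also handle disambiguation, relations between entities and far larger vocabularies.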

OpenCalais is the core service for many new web applications. Most of them will offer better search functionality or help to identify similarities between different types of content. For instance, for any document published on a web site, related blogs or videos (or whatever) can be retrieved and presented as relevant context information.

Whenever an application uses OpenCalais, content is delivered to Reuters. Thus, submitting a URL will mean something different in the future than it did in all the years before: it's no longer only about “promoting” a website, but rather about finding ways to get connected with the semantic web, and about teaching Reuters' global knowledge base 😉

Try it out!