Introducing PoolParty GraphSearch: Cognitive Search based on Graphs

What is PoolParty GraphSearch?

With PoolParty GraphSearch, companies can search across a variety of content types and business objects and analyze the data at a more granular level. All content and data repositories connected to GraphSearch are annotated with semantic metadata, which makes search, recommendation and analytics operations highly precise. GraphSearch is a front-end application built on top of a semantic infrastructure and an API providing the following features:

  • Ontology-based data access (OBDA)
  • Faceted search including hierarchies
  • Autocomplete combined with context information
  • Custom views on entity-centric and document-centric data
  • Statistical charts for the unified data repositories
  • Plug-in system for recommendation and similarity algorithms

How does it work?

Business users query knowledge assets in GraphSearch along data models. Since multiple systems can be connected to GraphSearch, the various knowledge models are additionally linked by an ontology layer.

System administrators can define which part of the ontology and corresponding entities in the taxonomy should be used in the GraphSearch front-end application. That way, they define specific views on data. They can also provide multiple search spaces within GraphSearch and enable the user to switch between them. A search space is a customized search configuration over a specific data set. The selected search facets for each search space are derived from the knowledge graph.

GraphSearch can be enhanced with recommendation algorithms. These can be based on similarity measures; for some use cases, a matchmaking algorithm is more suitable. The research team of the Semantic Web Company has a strong focus on machine learning and continuously extends the library of machine learning algorithms in GraphSearch.
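
To illustrate the general idea behind similarity-based recommendations over semantically annotated content, here is a minimal sketch in Python. It is not PoolParty's actual algorithm: the documents and concept URIs are invented for demonstration, and documents are simply compared by the overlap of the knowledge-graph concepts they are annotated with (Jaccard similarity).

```python
# Minimal sketch of a similarity-based recommendation over semantic annotations.
# The documents and concept identifiers below are purely illustrative.

def jaccard(a: set, b: set) -> float:
    """Jaccard similarity of two concept sets."""
    return len(a & b) / len(a | b) if a or b else 0.0

# Each document is annotated with concepts from a (hypothetical) knowledge graph.
annotations = {
    "doc1": {"ex:MachineLearning", "ex:Recommendation", "ex:Graphs"},
    "doc2": {"ex:Graphs", "ex:SPARQL"},
    "doc3": {"ex:MachineLearning", "ex:Recommendation"},
}

def recommend(doc_id: str, top_n: int = 2):
    """Rank all other documents by concept overlap with doc_id."""
    query = annotations[doc_id]
    scored = [
        (other, jaccard(query, concepts))
        for other, concepts in annotations.items()
        if other != doc_id
    ]
    return sorted(scored, key=lambda x: x[1], reverse=True)[:top_n]

print(recommend("doc1"))  # e.g. [('doc3', 0.666...), ('doc2', 0.25)]
```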

Data analytics functionality helps business users derive even more granular insights. Search facets can be combined into statistical charts that show what kind of data is actually available for specific topics.

Agile Data Management and Integration

Implementing PoolParty GraphSearch is the first step towards consolidating data silos without data migration. Various functional roles have to work together to deliver a unified data environment, and PoolParty takes the heterogeneous technical backgrounds of the professionals involved into consideration.

Specific user-friendly solutions support the whole knowledge management team in their collaborative work processes:

  • Subject matter experts can define a semantic data layer to describe the meaning of metadata in the PoolParty taxonomy management tool.
  • Knowledge engineers can link separate taxonomies and maintain the knowledge graph in the same tool.
  • Information architects and developers can link various content and data repositories with the semantic metadata via the PoolParty API.
  • Data scientists can adapt embedded machine-learning algorithms to finetune the search, classification, and recommendation results that are mainly derived through the knowledge graph.

This semi-automatic knowledge engineering approach ensures that query results gradually become more precise and remain applicable to a continuously growing data environment. On top of that, GraphSearch enables business users to search over the connected data repositories and analyze the available information.

Want to learn more?

SPARQL is the new King of all Data Scientist’s tools

Inspired by the development of semantic technologies in recent years, the traditional methodology of designing, publishing and consuming statistical datasets is evolving towards so-called "Linked Statistical Data": semantics are associated with dimensions, attributes and observation values, following Linked Data design principles.

The representation of datasets is no longer a combination of magic words and numbers. Everything becomes meaningful when dereferenceable URIs take their place, which in turn establishes relations between resources implicitly and automatically. Datasets are no longer isolated; they all share a globally, uniquely and uniformly defined structure.

With the RDF Data Cube Vocabulary (http://www.w3.org/TR/vocab-data-cube/), a W3C Recommendation for linked statistical data is already available. It is time to start building data-oriented applications and services with traditional statistical computing languages such as R, while benefiting from the semantic power of the SPARQL query language.

Most statistical analysis functions are set operations performed on subsets of a dataset (a slice, a facet, etc.). The calculation itself is dull machine work; how a subset is grouped and created is where the analytical innovation happens. Thanks to SPARQL, such subsets can now be created from a semantic perspective rather than a purely mathematical or statistical one.
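
As a rough illustration of creating such a subset semantically, the following Python sketch queries a tiny RDF Data Cube with rdflib. The qb: terms are from the W3C Data Cube Vocabulary; the ex: dataset, dimensions and measure are invented here purely for demonstration.

```python
# A minimal sketch of selecting a statistical subset ("slice") with SPARQL and
# rdflib. The qb: terms come from the W3C RDF Data Cube Vocabulary; everything
# in the ex: namespace (dataset, dimensions, measure) is invented for illustration.
from rdflib import Graph

data = """
@prefix qb: <http://purl.org/linked-data/cube#> .
@prefix ex: <http://example.org/stats#> .

ex:o1 a qb:Observation ; qb:dataSet ex:sales ; ex:refPeriod "2014" ; ex:region ex:Austria ; ex:amount 120.5 .
ex:o2 a qb:Observation ; qb:dataSet ex:sales ; ex:refPeriod "2014" ; ex:region ex:Austria ; ex:amount  98.0 .
ex:o3 a qb:Observation ; qb:dataSet ex:sales ; ex:refPeriod "2013" ; ex:region ex:Austria ; ex:amount  75.0 .
"""
g = Graph()
g.parse(data=data, format="turtle")

# The subset is defined semantically: all observations of the ex:sales dataset
# for 2014 in Austria. The resulting numbers can go into any statistics library.
subset_query = """
PREFIX qb: <http://purl.org/linked-data/cube#>
PREFIX ex: <http://example.org/stats#>

SELECT ?value WHERE {
  ?obs a qb:Observation ;
       qb:dataSet   ex:sales ;
       ex:refPeriod "2014" ;
       ex:region    ex:Austria ;
       ex:amount    ?value .
}
"""
values = [float(row.value) for row in g.query(subset_query)]
print(sum(values) / len(values))  # mean of the 2014 Austrian observations
```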

Compared to traditional filtering with SQL queries, SPARQL queries eliminate the boundaries between datasets and between databases.

An example query could be "list the bestsellers of a supermarket in the category science fiction movie for the year 2014". Someone may point out that this is also feasible with SQL if the database schema contains all relevant fields. That is absolutely correct. But what if there are further conditions such as "sold during weekends, directed by an American director, cast with European actors"? Is it reasonable for a supermarket to maintain such data sets? Assume there is a supermarket whose owner is a movie fan, he maintains exactly this data, and SQL works perfectly so far. Can we reuse this query, and with it the web application, for another supermarket? This is why we use SPARQL: any accessible resource can be used to construct the query results.
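
A hedged sketch of how such a query could look in SPARQL 1.1 is shown below. The ex: sales schema and the shop's endpoint URL are assumptions made for illustration; the dbo:/dbr: terms are taken from the DBpedia ontology, although the exact modelling of nationality and genre in DBpedia varies, so the sketch is meant to show the pattern (local sales data combined with external movie metadata via SERVICE) rather than to be run verbatim.

```python
# Sketch: bestselling movies of 2014, directed by an American director.
# The shop endpoint and the ex: schema are hypothetical; the DBpedia part is
# reused at query time via a federated SERVICE clause. Genre, weekend and
# European-cast conditions would follow the same pattern.
from SPARQLWrapper import SPARQLWrapper, JSON

query = """
PREFIX ex:  <http://example.org/shop#>
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX dbr: <http://dbpedia.org/resource/>

SELECT ?movie (SUM(?units) AS ?sold)
WHERE {
  # Local sales data maintained by the supermarket (hypothetical schema).
  ?sale ex:product ?movie ;
        ex:units   ?units ;
        ex:date    ?date .
  FILTER (YEAR(?date) = 2014)

  # Movie metadata is not maintained by the shop at all; it is reused from DBpedia.
  SERVICE <https://dbpedia.org/sparql> {
    ?movie a dbo:Film ;
           dbo:director ?director .
    ?director dbo:birthPlace dbr:United_States .
  }
}
GROUP BY ?movie
ORDER BY DESC(?sold)
LIMIT 10
"""

shop = SPARQLWrapper("https://shop.example.org/sparql")  # hypothetical endpoint
shop.setQuery(query)
shop.setReturnFormat(JSON)
# bestsellers = shop.query().convert()  # would return the ranked movie list
```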

SPARQL is the new King of all Data Scientist’s tools because …

  1. SPARQL is close to how human beings actually think about the world.
  2. With SPARQL you can query knowledge graphs.
  3. SPARQL is based on knowledge models that can combine mindsets of subject-matter experts, data engineers and information architects.
  4. SPARQL is to the Semantic Web and the Web in general what SQL is to relational databases.
  5. SPARQL is a W3C recommendation and is supported by many different database vendors, so it doesn't cause the lock-in effects we have become used to with the various SQL dialects (which differ considerably from vendor to vendor).
  6. With SPARQL you benefit from the potential to make a collection of data sources look and query like one big database.
  7. SPARQL provides pattern based search functionality. With such search capabilities you can find out unknown linkages or non-obvious patterns that give you new insights into your data.
  8. Not only is SPARQL a standardized query language, access via web interfaces is standardized as well (this is called a SPARQL endpoint). This makes the integration of different data sources a lot easier.
  9. SPARQL is also a standardized update and graph traversal language.
  10. SPARQL is a standardized protocol producing standardized results, which makes it a complete API and relieves developers of the need to reinvent an API for every single application.
  11. With SPARQL you can query over structured and unstructured information as a whole.
  12. SPARQL allows you to explore data. In contrast to traditional ways of querying databases, where knowledge about the database schema and content is required, SPARQL lets you ask "tell me what is there" (see the sketch after this list).
  13. SPARQL property paths offer completely new ways to explore a data set, e.g. by detecting 'hidden links' between business objects (also shown in the sketch below).
  14. With SPARQL you can define inference rules to gain new information from existing facts.
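
Two of these points, schema-free exploration (12) and property paths (13), can be sketched with a few lines of Python and rdflib. The tiny ex: graph below is invented for illustration.

```python
# Exploration and property paths with rdflib on a tiny in-memory graph
# (the ex: data is invented for illustration).
from rdflib import Graph

data = """
@prefix ex: <http://example.org/> .
ex:alice  ex:worksFor   ex:acme .
ex:acme   ex:partnerOf  ex:globex .
ex:globex ex:basedIn    ex:vienna .
"""
g = Graph()
g.parse(data=data, format="turtle")

# "Tell me what is there": explore the data without knowing its schema.
for row in g.query("SELECT DISTINCT ?p WHERE { ?s ?p ?o }"):
    print(row.p)

# Property paths: find everything reachable from ex:alice over any chain of
# these properties, revealing 'hidden links' such as alice -> vienna.
path_query = """
PREFIX ex: <http://example.org/>
SELECT ?reached WHERE {
  ex:alice (ex:worksFor|ex:partnerOf|ex:basedIn)+ ?reached .
}
"""
for row in g.query(path_query):
    print(row.reached)
```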

SKOS as a Key Element in Enterprise Linked Data Strategies

The challenges in implementing linked data technologies in enterprises are not limited to technical issues. Projects like these also have to deal with organisational hurdles, for instance the development of employee skills in knowledge modelling and the implementation of a linked data strategy that foresees a cost-effective and sustainable infrastructure of high-quality, linked knowledge graphs. SKOS can play a key role in enterprise linked data strategies thanks to its relative simplicity combined with its ability to be mapped to and extended by other controlled vocabularies, ontologies, entity extraction services and linked open data.
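
To make the role of SKOS more concrete, here is a minimal sketch in Python using rdflib: a single concept with a preferred label, a place in the in-house hierarchy, and an exactMatch mapping into linked open data. The example.org URIs are hypothetical; the SKOS properties and the DBpedia resource are real.

```python
# A minimal SKOS concept built with rdflib: a preferred label, a position in
# the in-house hierarchy, and an exactMatch link into Linked Open Data.
# The vocabulary.example.org URIs are hypothetical; the DBpedia resource is real.
from rdflib import Graph, URIRef, Literal, Namespace
from rdflib.namespace import RDF, SKOS

EX = Namespace("http://vocabulary.example.org/")
g = Graph()
g.bind("skos", SKOS)
g.bind("ex", EX)

concept = EX.SemanticWeb
g.add((concept, RDF.type, SKOS.Concept))
g.add((concept, SKOS.prefLabel, Literal("Semantic Web", lang="en")))
g.add((concept, SKOS.broader, EX.WebTechnology))
# Mapping to linked open data is what keeps the in-house vocabulary extensible.
g.add((concept, SKOS.exactMatch, URIRef("http://dbpedia.org/resource/Semantic_Web")))

print(g.serialize(format="turtle"))
```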


Sparqling 5 years of semantic web evolution

The Semantic Web School has monitored press releases about semantic technologies and related topics over the past 5 years. The result is a collection of about 1,200 links and article summaries. These articles were tagged, and those tags are mostly linked to Wikipedia – so this might be an interesting data collection for some web applications…

Inspired by the work of LinkingOpenData (W3C SWEO) and DBpedia, we're happy to announce that there is now a SPARQL endpoint for that data which is open to the public. Ideas for some mashups are listed below (a sketch of the first one follows after the list):

  • Showing on a timeline which topics came and went
  • Publishing a list of press releases linked with wikipedia articles
  • Extracting names of companies which are mentioned in the articles and calculating a tag cloud for them
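
As a rough sketch of the first idea, the query below counts how often each tag occurs per year. The endpoint URL and all property names are assumptions made for illustration, since the actual schema of the collection is not described here.

```python
# Rough sketch of the timeline mashup: tag frequency per year.
# The endpoint URL and the ex: property names are hypothetical placeholders;
# the real endpoint's schema may differ.
from SPARQLWrapper import SPARQLWrapper, JSON

endpoint = SPARQLWrapper("https://example.org/sparql")  # placeholder endpoint
endpoint.setReturnFormat(JSON)
endpoint.setQuery("""
PREFIX ex: <http://example.org/press#>

SELECT ?year ?tag (COUNT(?article) AS ?mentions)
WHERE {
  ?article ex:published  ?date ;
           ex:taggedWith ?tag .
  BIND (YEAR(?date) AS ?year)
}
GROUP BY ?year ?tag
ORDER BY ?year DESC(?mentions)
""")
# results = endpoint.query().convert()  # timeline data: tag frequency per year
```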

Any other ideas?