SPARQL in the Helmholtz KG

Once you understand the basics above, you can start writing queries against the Helmholtz-KG using either the QLever or Virtuoso SPARQL endpoints.

Build your first query

Step 1: Define Your "Shorthand" (PREFIX)

URLs in the Semantic Web are very long. Instead of typing https://schema.org/name every time, we create a shortcut called a Prefix. Once the prefix below is declared, https://schema.org/name can be written as schema:name.

PREFIX schema: <https://schema.org/>

The core metadata prefixes for the Helmholtz-KG are listed below. The Helmholtz-KG primarily uses Schema.org for high-level research objects, supplemented by standard W3C vocabularies for cataloging and provenance.

Prefix | Namespace | Use Case
schema | https://schema.org/ | Datasets, Software, Persons, Organizations, Events
rdf | http://www.w3.org/1999/02/22-rdf-syntax-ns# | Basic RDF types and properties
rdfs | http://www.w3.org/2000/01/rdf-schema# | Labels and human-readable descriptions
prov | http://www.w3.org/ns/prov# | Provenance, tracking how data was created or modified
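To make the shorthand concrete, here is a minimal Python sketch of what a prefix does: it maps a short name to a namespace, and a prefixed name such as schema:name expands to the full URL. The helper names `expand` and `prefix_declarations` are hypothetical and exist only for this illustration; the namespaces are the ones from the table.

```python
# Prefix table from this section: prefix -> namespace.
PREFIXES = {
    "schema": "https://schema.org/",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "prov": "http://www.w3.org/ns/prov#",
}


def expand(prefixed_name: str) -> str:
    """Expand 'prefix:local' into a full URL (hypothetical helper)."""
    prefix, sep, local = prefixed_name.partition(":")
    if not sep or prefix not in PREFIXES:
        raise ValueError(f"unknown or missing prefix in {prefixed_name!r}")
    return PREFIXES[prefix] + local


def prefix_declarations(*prefixes: str) -> str:
    """Build the PREFIX lines that go at the top of a query (hypothetical helper)."""
    return "\n".join(f"PREFIX {p}: <{PREFIXES[p]}>" for p in prefixes)


print(expand("schema:name"))
print(prefix_declarations("schema", "prov"))
```

A SPARQL endpoint performs this expansion itself; the sketch only shows why every prefix used in a query has to be declared first.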

Step 2: Choose Your Variables (SELECT)

Decide what information you want the query to return. Variables are always preceded by a question mark (?).

SELECT ?dataset ?title

Step 3: Map the Pattern (WHERE)

This is the heart of the query. You describe the "Triple" (Subject → Predicate → Object) that you are looking for. End each triple with a period (.). To list several properties of the same subject, end the line with a semicolon (;) instead and leave out the subject on the next line.

WHERE {
  ?dataset a schema:Dataset ;          # Find things that are Datasets
           schema:name ?title ;        # Get their names
           schema:creator ?person .    # Find the creator entity
  ?person schema:name ?authorName .    # Get the name of that creator
}

Step 4: Refine the Results (LIMIT)

Knowledge graphs can be massive. Always start with a LIMIT to avoid crashing your browser with thousands of results. The LIMIT keyword caps the number of results the query returns.

LIMIT 10

Putting it all Together: A Simple SELECT Query

PREFIX schema: <https://schema.org/>

SELECT ?title ?authorName
WHERE {
  ?dataset a schema:Dataset ;
           schema:name ?title ;
           schema:creator ?person .
  ?person schema:name ?authorName .
}
LIMIT 5

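This query returns at most 5 results, each pairing a dataset title with the name of one of its creators. A dataset with several creators appears once per creator. The WHERE block is a basic graph pattern: every triple pattern in it must match, so only datasets that have both a name and a creator with a name are returned.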
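The four steps can also be assembled programmatically. The following Python sketch joins the parts into one query string and encodes it as a GET request URL, using the `query` parameter defined by the SPARQL 1.1 Protocol. The endpoint URL is a placeholder, not the real Helmholtz-KG address, and `build_query` and `request_url` are hypothetical helper names.

```python
from urllib.parse import urlencode

# Placeholder only: replace with the actual QLever or Virtuoso endpoint URL.
ENDPOINT = "https://example.org/sparql"

QUERY_PARTS = [
    "PREFIX schema: <https://schema.org/>",  # Step 1: prefix
    "SELECT ?title ?authorName",             # Step 2: variables
    "WHERE {",                               # Step 3: pattern
    "  ?dataset a schema:Dataset ;",
    "           schema:name ?title ;",
    "           schema:creator ?person .",
    "  ?person schema:name ?authorName .",
    "}",
]


def build_query(limit: int = 5) -> str:
    """Join the query parts and append the LIMIT clause (Step 4)."""
    if limit < 1:
        raise ValueError("limit must be a positive integer")
    return "\n".join(QUERY_PARTS + [f"LIMIT {limit}"])


def request_url(query: str, endpoint: str = ENDPOINT) -> str:
    """Encode the query as a GET URL with a percent-encoded 'query' parameter."""
    return f"{endpoint}?{urlencode({'query': query})}"


print(build_query(5))
print(request_url(build_query(5)))
```

The sketch stops at building the URL. Sending the request, and the result format an endpoint returns, depend on the endpoint and are not covered here.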

Filters and Constraints

Use filters and constraints to narrow down your results. They let you search for specific text, compare dates, or exclude certain data.

Filtering by Text (FILTER)

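If you want to find items where a title contains a specific word (e.g., "climate"), use the FILTER keyword with CONTAINS. Wrapping the title in LCASE makes the match case-insensitive, provided the search word itself is written in lowercase.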

PREFIX schema: <https://schema.org/>

SELECT ?title WHERE {
  ?dataset a schema:Dataset ;
           schema:name ?title .
  # Only return titles containing the word "climate" (case-insensitive)
  FILTER(CONTAINS(LCASE(?title), "climate"))
}
LIMIT 10
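When the search word comes from user input, two details are easy to miss: the word must be lowercased to match the LCASE title, and quotes or backslashes must be escaped inside the string literal. The Python sketch below builds the FILTER line under those assumptions. `contains_filter` is a hypothetical helper, it assumes a single-line search term, and Python's `lower()` may differ from SPARQL's LCASE for unusual Unicode characters.

```python
def contains_filter(variable: str, term: str) -> str:
    """Build a case-insensitive CONTAINS filter for ?variable (hypothetical helper).

    Assumes `term` is a single line of text.
    """
    # Lowercase the term so it can match the LCASE(...) side of the comparison.
    lowered = term.lower()
    # Escape backslashes first, then double quotes, for a SPARQL string literal.
    escaped = lowered.replace("\\", "\\\\").replace('"', '\\"')
    return f'FILTER(CONTAINS(LCASE(?{variable}), "{escaped}"))'


print(contains_filter("title", "Climate"))
```

Without the lowercasing step, a search for "Climate" would never match, because the left-hand side of the comparison contains only lowercase text.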

Numerical and Date Constraints

You can use standard comparison operators like >, <, and = to filter results based on dates or quantities. The following query filters the results to show only datasets published after January 1st, 2023. Because the date literal is typed with xsd:date, the xsd prefix must be declared as well.

PREFIX schema: <https://schema.org/>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>

SELECT ?name ?date WHERE {
  ?dataset a schema:Dataset ;
           schema:name ?name ;
           schema:datePublished ?date .
  # Only show datasets published after January 1st, 2023
  FILTER(?date > "2023-01-01"^^xsd:date)
}
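The cutoff date can be generated rather than typed by hand. This Python sketch formats a `datetime.date` as a typed xsd:date literal and builds the FILTER line. `published_after` is a hypothetical helper, and any query that uses its output also needs the xsd prefix declaration shown in `XSD_PREFIX`.

```python
from datetime import date

# Declaration required by any query that uses the xsd:date datatype.
XSD_PREFIX = "PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>"


def published_after(day: date, variable: str = "date") -> str:
    """Build a filter keeping values of ?variable later than `day` (hypothetical helper)."""
    # date.isoformat() yields YYYY-MM-DD, the lexical form of xsd:date.
    return f'FILTER(?{variable} > "{day.isoformat()}"^^xsd:date)'


print(XSD_PREFIX)
print(published_after(date(2023, 1, 1)))
```

The comparison assumes the stored schema:datePublished values are typed as dates too. Values stored as plain strings may not compare as expected.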

Filtering by Language (LANG)

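Filters can also check the language tag of a string. LANG returns the tag attached to a text value (for example "en"), or an empty string if the value has no tag.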

PREFIX schema: <https://schema.org/>

SELECT ?name ?date WHERE {
  ?dataset a schema:Dataset ;
           schema:name ?name ;
           schema:datePublished ?date .
  # Only show datasets with English titles
  FILTER(LANG(?name) = "en")
}