SPARQL in the Helmholtz KG
Once you understand the basics above, you can start writing queries against the Helmholtz-KG using either the QLever or Virtuoso SPARQL endpoints.
Build your first query
Step 1: Define Your "Shorthand" (PREFIX)
URLs in the Semantic Web are very long. Instead of typing https://schema.org/name every time, we create a shortcut called a prefix.
PREFIX schema: <https://schema.org/>
The core metadata prefixes for the Helmholtz-KG are listed below. The Helmholtz-KG primarily uses Schema.org for high-level research objects, supplemented by standard W3C vocabularies for cataloging and provenance.
| Prefix | Namespace | Use Case |
|---|---|---|
| schema | https://schema.org/ | Datasets, Software, Persons, Organizations, Events |
| rdf | http://www.w3.org/1999/02/22-rdf-syntax-ns# | Basic RDF types and properties |
| rdfs | http://www.w3.org/2000/01/rdf-schema# | Labels and human-readable descriptions |
| prov | http://www.w3.org/ns/prov# | Provenance, tracking how data was created or modified |
Step 2: Choose Your Variables (SELECT)
Decide what information you want the query to return. Variables are always preceded by a question mark (?).
SELECT ?dataset ?title
Step 3: Map the Pattern (WHERE)
This is the heart of the query. You describe the triple pattern (Subject → Predicate → Object) that you are looking for. End each triple pattern with a period (.). To list several properties of the same subject, separate them with a semicolon (;) and put the period after the last one.
WHERE {
?dataset a schema:Dataset ; # Find things that are Datasets
schema:name ?title ; # Get their names
schema:creator ?person . # Find the creator entity
?person schema:name ?authorName . # Get the name of that creator
}
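To make the punctuation rule concrete, here is a sketch of the same pattern written as one full triple per line, each ending in a period. It also spells out the keyword a as rdf:type, which is what a abbreviates in SPARQL. The variable names are the ones used above. That every dataset in the graph has a schema:creator is an assumption.

```sparql
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX schema: <https://schema.org/>

SELECT ?dataset ?title
WHERE {
  ?dataset rdf:type schema:Dataset .    # "a" is shorthand for rdf:type
  ?dataset schema:name ?title .         # subject repeated instead of using ";"
  ?dataset schema:creator ?person .
  ?person schema:name ?authorName .
}
LIMIT 10                                # keep the result set small
```

Both spellings match exactly the same triples. The semicolon form is only a shorter way of writing them.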
Step 4: Refine the Results (LIMIT)
Knowledge graphs can be massive. Always start with a LIMIT to avoid crashing your browser with thousands of results. The `LIMIT` keyword caps the number of results returned.
LIMIT 10
Putting it all Together: A Simple SELECT Query
PREFIX schema: <https://schema.org/>
SELECT ?title ?authorName
WHERE {
?dataset a schema:Dataset ;
schema:name ?title ;
schema:creator ?person .
?person schema:name ?authorName .
}
LIMIT 5
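This query returns up to 5 rows, each pairing a dataset title with the name of one of its creators. It uses a basic graph pattern to match datasets that have both a name and a creator with a name, so a dataset with several creators can appear in more than one row.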
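Without an explicit ordering, which rows come back first is up to the endpoint. The following is a minimal sketch, assuming you want a reproducible result: ORDER BY (a standard SPARQL clause not covered in the steps above) sorts the rows before LIMIT cuts them off. Sorting a very large result set may be slow on some endpoints.

```sparql
PREFIX schema: <https://schema.org/>

SELECT ?title ?authorName
WHERE {
  ?dataset a schema:Dataset ;
           schema:name ?title ;
           schema:creator ?person .
  ?person schema:name ?authorName .
}
ORDER BY ?title ?authorName   # sort by title, then by author name
LIMIT 5                       # applied after sorting
```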
Filters and Constraints
Filters and constraints narrow down your results. They allow you to search for specific text, compare dates, or exclude certain data.
Filtering by Text (FILTER)
If you want to find items where a title contains a specific word (e.g., "climate"), use the FILTER keyword with CONTAINS:
PREFIX schema: <https://schema.org/>
SELECT ?title WHERE {
?dataset a schema:Dataset ;
schema:name ?title .
# Only return titles containing the word "climate" (case-insensitive)
FILTER(CONTAINS(LCASE(?title), "climate"))
}
LIMIT 10
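Conditions inside a FILTER can be combined with the logical operators && (and), || (or), and ! (not), which is one way to exclude data. The sketch below keeps titles that mention "climate" but drops those that also mention "model". The second search word is a hypothetical example, not a term taken from the Helmholtz-KG.

```sparql
PREFIX schema: <https://schema.org/>

SELECT ?title
WHERE {
  ?dataset a schema:Dataset ;
           schema:name ?title .
  # Keep titles containing "climate", exclude those that also contain "model"
  FILTER(CONTAINS(LCASE(?title), "climate") && !CONTAINS(LCASE(?title), "model"))
}
LIMIT 10
```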
Numerical and Date Constraints
You can use standard comparison operators like >, <, and = to filter results based on dates or quantities. The following query filters the results to show only datasets published after January 1st, 2023:
PREFIX schema: <https://schema.org/>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
SELECT ?name ?date WHERE {
?dataset a schema:Dataset ;
schema:name ?name ;
schema:datePublished ?date .
# Only show datasets published after January 1st, 2023
FILTER(?date > "2023-01-01"^^xsd:date)
}
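Two comparisons can be combined to select a date range. The sketch below looks for datasets published during 2023. It assumes that schema:datePublished values are stored as typed xsd:date literals. If the graph stores them as xsd:dateTime or as plain strings, the comparison may match nothing and the filter would need to be adapted.

```sparql
PREFIX schema: <https://schema.org/>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>

SELECT ?name ?date
WHERE {
  ?dataset a schema:Dataset ;
           schema:name ?name ;
           schema:datePublished ?date .
  # On or after 1 January 2023 and strictly before 1 January 2024
  FILTER(?date >= "2023-01-01"^^xsd:date && ?date < "2024-01-01"^^xsd:date)
}
LIMIT 10
```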
Filtering by Language
Filters can also check the language tag of a string.
PREFIX schema: <https://schema.org/>
SELECT ?name ?date WHERE {
?dataset a schema:Dataset ;
schema:name ?name ;
schema:datePublished ?date .
# Only show datasets with English titles
FILTER(LANG(?name) = "en")
}
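The comparison LANG(?name) = "en" matches only the exact tag "en". As a hedged alternative, the standard SPARQL function langMatches also accepts regional variants. Whether names in the Helmholtz-KG carry language tags at all is an assumption. Literals without a tag have an empty language tag and are excluded by both forms.

```sparql
PREFIX schema: <https://schema.org/>

SELECT ?name
WHERE {
  ?dataset a schema:Dataset ;
           schema:name ?name .
  # Matches "en" as well as regional tags such as "en-GB" or "en-US"
  FILTER(langMatches(LANG(?name), "en"))
}
LIMIT 10
```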
