Viewsari · Lives of the artists

Viewsari

A provenance-aware knowledge graph built from Giorgio Vasari's Le Vite de' più eccellenti pittori, scultori, e architettori (1568), developed at KIT / FIZ Karlsruhe (ISE group) as part of a doctoral dissertation.


Vasari's Lives (full corpus)

Biographies: 165

Paragraphs: 3,479

Volumes

Knowledge graph

Facts (triples): 2.6M

Artworks found: 15,804

Wikidata links: 3,070

Knowledge-graph figures reflect the current populated graph.

Knowledge extracted from interpretive texts is not discovered but constructed; therefore, knowledge graphs must model the construction process itself.

Cultural heritage materials, once digitized, remain largely unstructured: human-readable, but not systematically queryable, linkable, or analyzable. Large language models have improved access to such content, yet they cannot guarantee epistemic traceability. Viewsari addresses this gap by treating knowledge extraction from historical text as an interpretive activity, recording agents, prompts, software versions, and source contexts for each extracted statement.

The approach is neuro-symbolic: the Viewsari ontology (symbolic component) provides a formal schema that structures and constrains extraction tasks assigned to a large language model (neural component). Vasari's Lives (1568) serves as the testbed: a founding document of art history characterized by implicit entity references, attributions presented as facts, and long-tail entities absent from standard knowledge bases.
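As a rough illustration of how a symbolic schema can constrain a neural extractor, the sketch below builds a prompt from a set of allowed classes and rejects any model output that falls outside that set. The class names, the `Mention` record, and both functions are hypothetical; they are not taken from the Viewsari ontology or the ObliquER code.

```python
from dataclasses import dataclass

# Assumption: an illustrative slice of a schema, not the actual Viewsari classes.
ALLOWED_CLASSES = frozenset({"Person", "Artwork", "Place"})


@dataclass(frozen=True)
class Mention:
    text: str
    entity_class: str
    paragraph_id: str


def build_prompt(paragraph: str, allowed=ALLOWED_CLASSES) -> str:
    """Embed the schema's class list in the task description."""
    classes = ", ".join(sorted(allowed))
    return (
        "Extract entity mentions from the paragraph below. "
        f"Label each with exactly one of: {classes}.\n\n{paragraph}"
    )


def validate(raw_items: list[dict], paragraph_id: str, allowed=ALLOWED_CLASSES):
    """Split model output into schema-conformant mentions and rejects."""
    accepted, rejected = [], []
    for item in raw_items:
        if item.get("class") in allowed and item.get("text"):
            accepted.append(Mention(item["text"], item["class"], paragraph_id))
        else:
            rejected.append(item)
    return accepted, rejected
```

In this sketch the schema acts twice: once when the prompt is built, and again as a filter on whatever the model returns, so an out-of-schema label never reaches the graph.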

Research Questions

The three questions this dissertation set out to answer, each with a short answer and pointers to where to explore it on this site.

RQ1 What representational commitments must a knowledge graph make to remain accountable to the interpretive act that produced it?

The Viewsari ontology formalizes a three-layer model covering bibliographic structure, document components, and extracted entities with full PROV-O provenance. Every mention, entity, and extraction run is traceable back to its source paragraph and the agent that produced it.
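The traceability described above can be pictured as a short walk over provenance links. The following sketch uses a tiny in-memory set of triples and the standard PROV-O properties `prov:wasDerivedFrom`, `prov:wasGeneratedBy`, and `prov:wasAssociatedWith`. The `http://example.org/` identifiers are placeholders, and how the actual graph connects mentions, runs, and agents is an assumption here.

```python
PROV = "http://www.w3.org/ns/prov#"
EX = "http://example.org/"  # placeholder namespace, not the project's IRIs

# Assumption: a mention is derived from a paragraph and generated by a run,
# and the run is associated with the agent that carried it out.
TRIPLES = {
    (EX + "mention/1", PROV + "wasDerivedFrom", EX + "paragraph/42"),
    (EX + "mention/1", PROV + "wasGeneratedBy", EX + "run/7"),
    (EX + "run/7", PROV + "used", EX + "paragraph/42"),
    (EX + "run/7", PROV + "wasAssociatedWith", EX + "agent/llm"),
}


def objects(graph, subject, predicate):
    """Return all objects of (subject, predicate, ?) in sorted order."""
    return sorted(o for s, p, o in graph if s == subject and p == predicate)


def trace(graph, mention):
    """Follow provenance links from a mention to paragraph, run, and agent."""
    runs = objects(graph, mention, PROV + "wasGeneratedBy")
    agents = sorted(
        {a for run in runs for a in objects(graph, run, PROV + "wasAssociatedWith")}
    )
    return {
        "paragraphs": objects(graph, mention, PROV + "wasDerivedFrom"),
        "runs": runs,
        "agents": agents,
    }
```

Calling `trace(TRIPLES, EX + "mention/1")` returns the source paragraph, the extraction run, and the responsible agent for that mention. Against a real graph, the same walk would more likely be written as a SPARQL query.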

Ontology overview · WIDOCO documentation · Knowledge base · SPARQL endpoint

RQ2 Under what conditions can large language models recognize and link the implicit and out-of-knowledge-base (OOKB) mentions that surface-form baselines miss?

The ObliquER pipeline uses LLM-based named entity recognition with few-shot and ontology-guided prompting strategies. Extraction runs are recorded as per-paragraph PROV activities with full prompt text, timestamps, and model provenance. Entity linking from the ground truth maps mentions to Wikidata or OOKB identifiers.
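A per-paragraph run record of the kind described above might look like the following sketch. It assembles a few-shot prompt, calls a model, and stores the full prompt text, timestamps, and model details alongside the output. The `ExtractionRun` fields, the prompt layout, and the `record_run` helper are assumptions for illustration, not the ObliquER implementation; `model` is any callable that maps a prompt string to an output string.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Callable


@dataclass(frozen=True)
class ExtractionRun:
    """One extraction activity over one paragraph (hypothetical fields)."""
    paragraph_id: str
    prompt: str
    model_name: str
    model_version: str
    started_at: datetime
    ended_at: datetime
    output: str


def record_run(
    paragraph_id: str,
    paragraph: str,
    few_shot: list[tuple[str, str]],
    model: Callable[[str], str],
    model_name: str,
    model_version: str,
) -> ExtractionRun:
    # Few-shot prompting: worked examples precede the paragraph to annotate.
    shots = "\n".join(f"Text: {text}\nMentions: {answer}" for text, answer in few_shot)
    prompt = f"{shots}\nText: {paragraph}\nMentions:"
    started = datetime.now(timezone.utc)
    output = model(prompt)
    ended = datetime.now(timezone.utc)
    # The full prompt is stored so the run can be inspected later.
    return ExtractionRun(
        paragraph_id, prompt, model_name, model_version, started, ended, output
    )
```

Storing the prompt verbatim, rather than a reference to a template, keeps each run inspectable even if the prompting strategy changes later.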

ObliquER pipeline · Artwork explorer · Interactive demo · Annotation corpus

RQ3 What reusable ontology design pattern emerges from modeling extraction as a documented, provenance-aware interpretive process, and how does that foundation transfer to structurally distinct interpretive corpora?

A comparative analysis across digital-humanities knowledge graphs isolates a reusable pattern specific to generative extraction: Computational Provenance (COMP-PROV). A transferability pilot then reconfigures the same ontology and ObliquER pipeline for a 19th-century German documentary corpus (on the Illuminati order, linked to the FactGrid Wikibase) by changing only the prompt content; the templates, post-processing, and ontology core stay fixed.
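The idea of swapping prompt content while holding the template and post-processing fixed can be sketched as below. The template wording, the `PromptContent` fields, and the entity types listed for each corpus are hypothetical. Only the corpus descriptions and the link targets (Wikidata, FactGrid) come from the text above.

```python
from dataclasses import dataclass
from string import Template

# Fixed template shared by every corpus (wording is illustrative).
PROMPT_TEMPLATE = Template(
    "Corpus: $corpus\n"
    "Entity types: $entity_types\n"
    "Link mentions to: $link_target\n\n"
    "Paragraph:\n$paragraph"
)


@dataclass(frozen=True)
class PromptContent:
    """The only part that changes between corpora in this sketch."""
    corpus: str
    entity_types: tuple[str, ...]  # assumed types, for illustration only
    link_target: str


VASARI = PromptContent("Vasari's Lives (1568)", ("Person", "Artwork"), "Wikidata")
ILLUMINATI = PromptContent(
    "19th-century German documentary corpus on the Illuminati order",
    ("Person", "Organization"),
    "FactGrid",
)


def render(content: PromptContent, paragraph: str) -> str:
    """Fill the fixed template with corpus-specific content."""
    return PROMPT_TEMPLATE.substitute(
        corpus=content.corpus,
        entity_types=", ".join(content.entity_types),
        link_target=content.link_target,
        paragraph=paragraph,
    )


def postprocess(raw: str) -> list[str]:
    """Fixed post-processing: one mention per non-empty line."""
    return [line.strip() for line in raw.splitlines() if line.strip()]
```

Here `render(VASARI, text)` and `render(ILLUMINATI, text)` go through the same template and the same `postprocess` step; only the `PromptContent` value differs.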

Knowledge base (OOKB entities) · Artwork explorer (from the original text) · Publications

Knowledge Graph Preview