ObliquER — Viewsari

ObliquER

An LLM-based entity recognition and linking pipeline designed for oblique, implicit, and long-tail artwork references in Vasari's Lives of the Most Eminent Painters, Sculptors, and Architects.

What it does

Vasari rarely names artworks directly. He describes them, alludes to them, and refers back to them with pronouns and definite noun phrases. Traditional named entity recognition fails on this kind of text because the entities are oblique: they are expressed through description, implication, and coreference rather than explicit naming.

ObliquER tackles this by combining LLM-based extraction with a formal annotation schema that distinguishes four types of artwork mention, each requiring a different recognition strategy. The pipeline produces structured, character-level annotations linked to Wikidata and the Viewsari knowledge graph.

Mention types

Explicit

The artwork is named directly by its title, traditional name, or unambiguous identifier.
"the Maestà", "S. Maria del Fiore"

Implicit

The artwork is described without being named — its identity must be inferred from context, location, subject matter, or attribution.
"a large panel for the monks of Vallombrosa"

Coreferent

An anaphoric or cataphoric reference that points to another mention of the same artwork, earlier in the text (anaphora) or later (cataphora): pronouns, demonstratives, definite descriptions.
"that work", "it", "which"

Generic

A non-specific reference to artworks in general, not identifiable as a particular entity.
"works that were passing good"

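As a rough illustration of what a typed, character-level annotation could look like, here is a minimal sketch. The class and field names (`MentionType`, `Mention`, `mention_type`, `entity_iri`) are assumptions made for this example, not the Viewsari schema, and the sample sentence is invented by combining two of the example phrases above.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class MentionType(Enum):
    EXPLICIT = "explicit"
    IMPLICIT = "implicit"
    COREFERENT = "coreferent"
    GENERIC = "generic"


@dataclass(frozen=True)
class Mention:
    start: int                        # offset of the first character (inclusive)
    end: int                          # offset one past the last character (exclusive)
    text: str                         # surface form, equal to paragraph[start:end]
    mention_type: MentionType
    entity_iri: Optional[str] = None  # hypothetical field, filled in by a later linking step


def mention_from_span(paragraph: str, surface: str, mention_type: MentionType) -> Mention:
    """Locate the first occurrence of `surface` and build a character-level mention."""
    start = paragraph.find(surface)
    if start == -1:
        raise ValueError(f"surface form not found: {surface!r}")
    return Mention(start, start + len(surface), surface, mention_type)


# Invented sentence, used only to show the offsets.
paragraph = "He painted a large panel for the monks of Vallombrosa, and that work was much praised."
implicit = mention_from_span(paragraph, "a large panel for the monks of Vallombrosa", MentionType.IMPLICIT)
coref = mention_from_span(paragraph, "that work", MentionType.COREFERENT)
```

Storing offsets next to the surface form makes each annotation checkable against its source paragraph with a single slice.
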
Pipeline stages

Stage 1

Dynamic chunking

Paragraphs are segmented into context-aware chunks that preserve coreference chains and discourse structure.
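
The page does not say how chunks are built, so the following is only one plausible sketch: sentences are packed greedily under a character budget, and each new chunk repeats the last sentence of the previous one so that a pronoun at a chunk boundary still has its antecedent in view. The function names, the budget, and the overlap size are all assumptions.

```python
import re
from typing import List


def split_sentences(paragraph: str) -> List[str]:
    # Naive splitter: break after ., ! or ? followed by whitespace.
    # It would wrongly split abbreviations such as "S. Maria"; a real
    # pipeline would need a more careful sentence segmenter.
    return [s for s in re.split(r"(?<=[.!?])\s+", paragraph.strip()) if s]


def chunk_paragraph(paragraph: str, max_chars: int = 200, overlap: int = 1) -> List[List[str]]:
    """Pack sentences greedily into chunks; each new chunk starts with the
    last `overlap` sentences of the previous chunk as context."""
    chunks: List[List[str]] = []
    current: List[str] = []
    for sentence in split_sentences(paragraph):
        candidate_len = sum(len(s) for s in current) + len(sentence)
        if current and candidate_len > max_chars:
            chunks.append(current)
            # Carry context forward (guard: current[-0:] would copy everything).
            current = current[-overlap:] if overlap > 0 else []
        current.append(sentence)
    if current:
        chunks.append(current)
    return chunks


# Invented text, used only to show the overlap.
text = "Giotto made a panel. It was placed in the church. The friars praised it. He then left for Rome."
chunks = chunk_paragraph(text, max_chars=60, overlap=1)
```

With these settings every chunk after the first opens with the sentence that closed the previous one, at the cost of some duplicated text sent to the model.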

Stage 2

Mention extraction

Typed mentions are extracted by zero-shot and few-shot LLM prompting, with task definitions grounded in the Viewsari ontology.
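
A minimal sketch of how such a prompt might be assembled and how the model's reply might be validated is shown below. The task wording, the JSON reply format, and the function names are hypothetical, and no real LLM client is called: `raw` is a hand-written stand-in for a model response.

```python
import json
from typing import Dict, List

MENTION_TYPES = ("explicit", "implicit", "coreferent", "generic")

# Hypothetical task definition; the actual Viewsari wording is not shown here.
TASK_DEFINITION = (
    "Find every artwork mention in the passage. Return a JSON list of objects "
    'with keys "text" and "type", where "type" is one of: ' + ", ".join(MENTION_TYPES) + "."
)


def build_prompt(passage: str, examples: List[Dict[str, str]]) -> str:
    """Zero-shot when `examples` is empty, few-shot otherwise."""
    parts = [TASK_DEFINITION]
    for ex in examples:
        parts.append(f"Passage: {ex['passage']}\nMentions: {ex['mentions']}")
    parts.append(f"Passage: {passage}\nMentions:")
    return "\n\n".join(parts)


def parse_mentions(passage: str, raw_response: str) -> List[Dict[str, object]]:
    """Keep only mentions with a known type whose text occurs verbatim in the
    passage, and attach character offsets of the first occurrence."""
    mentions = []
    for item in json.loads(raw_response):
        text, mtype = item.get("text", ""), item.get("type")
        start = passage.find(text) if text else -1
        if mtype in MENTION_TYPES and start != -1:
            mentions.append({"text": text, "type": mtype, "start": start, "end": start + len(text)})
    return mentions


# Invented passage and hand-written reply; the third item is not in the passage.
passage = "He made a large panel for the monks of Vallombrosa, and that work was praised."
raw = (
    '[{"text": "a large panel for the monks of Vallombrosa", "type": "implicit"},'
    ' {"text": "that work", "type": "coreferent"},'
    ' {"text": "the Maestà", "type": "explicit"}]'
)
mentions = parse_mentions(passage, raw)
```

Checking every returned string against the passage discards mentions the model invented. Because only the first occurrence is located, repeated surface forms such as "it" would need a more careful alignment step.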

Stage 3

Entity linking

Extracted mentions are linked to Wikidata entities or, when no matching entity exists, flagged as out-of-knowledge-base (OOKB).
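
The link-or-flag decision can be sketched as a thresholded best match. This example swaps in plain string similarity from Python's `difflib` as the scoring function, which is an assumption made for brevity and not the pipeline's actual linking method; the candidate identifiers are placeholders, not real Wikidata QIDs, and the threshold is arbitrary.

```python
from difflib import SequenceMatcher
from typing import Dict, Optional, Tuple

# Hypothetical candidate labels keyed by placeholder identifiers.
CANDIDATES: Dict[str, str] = {
    "wd:PLACEHOLDER-1": "Maestà",
    "wd:PLACEHOLDER-2": "Santa Maria del Fiore",
}


def link_mention(
    surface: str, candidates: Dict[str, str], threshold: float = 0.8
) -> Tuple[Optional[str], float]:
    """Return (best candidate id, score), or (None, score) when no candidate
    reaches the threshold and the mention should be flagged as OOKB."""
    best_id: Optional[str] = None
    best_score = 0.0
    for cand_id, label in candidates.items():
        score = SequenceMatcher(None, surface.lower(), label.lower()).ratio()
        if score > best_score:
            best_id, best_score = cand_id, score
    if best_score < threshold:
        return None, best_score
    return best_id, best_score


linked_id, linked_score = link_mention("Maestà", CANDIDATES)
ookb_id, ookb_score = link_mention("a large panel for the monks of Vallombrosa", CANDIDATES)
```

Surface similarity alone cannot resolve implicit mentions, which is why the second call falls through to OOKB here; a realistic linker would also use context such as location, subject, and attribution.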

Stage 4

Coreference clustering

Coreferent mentions are resolved into chains and linked to their antecedent entity mentions.
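
Assuming an earlier step yields pairwise (mention, antecedent) links, grouping them into chains is a connected-components problem. The sketch below uses a small union-find; the function name and the integer mention indices are illustrative assumptions.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def cluster_coreference(n_mentions: int, links: List[Tuple[int, int]]) -> List[List[int]]:
    """Group mentions (indexed in text order) into chains from pairwise links."""
    parent = list(range(n_mentions))

    def find(i: int) -> int:
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for mention, antecedent in links:
        root_m, root_a = find(mention), find(antecedent)
        if root_m != root_a:
            # Attach to the smaller index so each chain's root is its earliest mention.
            parent[max(root_m, root_a)] = min(root_m, root_a)

    chains: Dict[int, List[int]] = defaultdict(list)
    for i in range(n_mentions):
        chains[find(i)].append(i)
    # Singletons are not chains; keep only groups with at least two mentions.
    return [chain for chain in chains.values() if len(chain) > 1]


# 0: "a large panel ...", 1: "that work", 2: "it", 3: "the Maestà", 4: "which"
chains = cluster_coreference(5, [(1, 0), (2, 1), (4, 3)])
```

Because the root of each chain is its earliest mention, the first element of every chain is a natural place to look for the entity mention that the pronouns and descriptions refer to.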

Technical details

Grounding: Prompts are built from formal task definitions in the Viewsari ontology, which keeps annotation criteria consistent across the corpus.

OOKB handling: Artworks not found in Wikidata receive Viewsari-internal IRIs and are flagged for future knowledge base expansion.
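
One way to mint internal IRIs for such artworks is to derive them deterministically from the source and a normalised description, so that re-running the pipeline yields the same identifier. This is a sketch under that assumption: the namespace, the `ookb-` prefix, and the hashing scheme are hypothetical and not the project's actual IRI policy.

```python
import hashlib

# Placeholder namespace, not the project's real one.
BASE_IRI = "https://example.org/viewsari/artwork/"


def mint_ookb_iri(source_id: str, description: str) -> str:
    """Build a stable IRI from a source identifier and a description,
    ignoring case and whitespace differences in the description."""
    normalised = " ".join(description.lower().split())
    digest = hashlib.sha1(f"{source_id}|{normalised}".encode("utf-8")).hexdigest()[:12]
    return f"{BASE_IRI}ookb-{digest}"


iri_a = mint_ookb_iri("life-1", "a large panel for the monks of Vallombrosa")
iri_b = mint_ookb_iri("life-1", "A large  panel for the monks of Vallombrosa")
iri_c = mint_ookb_iri("life-2", "a large panel for the monks of Vallombrosa")
```

Here `iri_a` and `iri_b` coincide while `iri_c` differs, since the same description in another biography may denote another work.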

Provenance: Every extraction and linking step is recorded as a PROV-O activity in the knowledge graph, enabling full traceability from annotation to source paragraph to bibliographic context.
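
To make the provenance pattern concrete, the sketch below emits N-Triples for a single extraction step using standard PROV-O terms (`prov:Activity`, `prov:Entity`, `prov:used`, `prov:wasGeneratedBy`, `prov:wasDerivedFrom`). The `example.org` IRIs and the helper names are placeholders, and the exact modelling used in the Viewsari graph may differ.

```python
from typing import List

PROV = "http://www.w3.org/ns/prov#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def triple(s: str, p: str, o: str) -> str:
    """Serialise one IRI-only statement as an N-Triples line."""
    return f"<{s}> <{p}> <{o}> ."


def record_step(activity: str, source: str, output: str) -> List[str]:
    """Describe one pipeline step: the activity used `source` and generated `output`."""
    return [
        triple(activity, RDF_TYPE, PROV + "Activity"),
        triple(activity, PROV + "used", source),
        triple(output, RDF_TYPE, PROV + "Entity"),
        triple(output, PROV + "wasGeneratedBy", activity),
        triple(output, PROV + "wasDerivedFrom", source),
    ]


# Placeholder IRIs for illustration only.
triples = record_step(
    activity="https://example.org/activity/extraction-1",
    source="https://example.org/paragraph/42",
    output="https://example.org/annotation/7",
)
```

Following `prov:wasDerivedFrom` from an annotation leads to its source paragraph, which is the traceability path described above.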

Evaluation: The pipeline is evaluated against a manually annotated gold-standard corpus covering 16 biographies across 6 volumes.
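
The page does not state which metrics are reported. A common choice for span annotation is precision, recall, and F1 over exact matches of (start, end, type), sketched here under that assumption with invented toy spans rather than real results.

```python
from typing import Set, Tuple

Span = Tuple[int, int, str]  # (start offset, end offset, mention type)


def prf(gold: Set[Span], predicted: Set[Span]) -> Tuple[float, float, float]:
    """Exact-match precision, recall and F1; a span counts only if offsets and type agree."""
    tp = len(gold & predicted)
    precision = tp / len(predicted) if predicted else 0.0
    recall = tp / len(gold) if gold else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Toy data: one correct span, one span with the wrong type, one missed span.
gold = {(11, 53, "implicit"), (59, 68, "coreferent"), (0, 2, "coreferent")}
predicted = {(11, 53, "implicit"), (59, 68, "explicit")}
precision, recall, f1 = prf(gold, predicted)
```

Exact matching is strict: the second predicted span has the right offsets but the wrong type, so it counts as both a false positive and a false negative.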
