ObliquER
An LLM-based entity recognition and linking pipeline designed for oblique, implicit,
and long-tail artwork references in Vasari's Lives of the Most Eminent Painters,
Sculptors, and Architects.
What it does
Vasari rarely names artworks directly. He describes them, alludes to them, and refers
back to them with pronouns and definite noun phrases. Traditional named entity
recognition fails on this kind of text because the entities are oblique: they are
expressed through description, implication, and coreference rather than explicit naming.
ObliquER tackles this by combining LLM-based extraction with a formal annotation schema
that distinguishes four types of artwork mention, each requiring a different recognition
strategy. The pipeline produces structured, character-level annotations linked to
Wikidata and the Viewsari knowledge graph.
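A character-level annotation of this kind might be represented as in the sketch below. The class and field names (MentionType, Mention, entity_iri) are assumptions made for illustration, not the pipeline's actual schema, and the sample sentence is invented from the examples given under Mention types.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class MentionType(Enum):
    EXPLICIT = "explicit"
    IMPLICIT = "implicit"
    COREFERENT = "coreferent"
    GENERIC = "generic"


@dataclass(frozen=True)
class Mention:
    start: int                        # character offset, inclusive
    end: int                          # character offset, exclusive
    text: str                         # surface form, equal to paragraph[start:end]
    mention_type: MentionType
    entity_iri: Optional[str] = None  # filled in later by entity linking


def make_mention(paragraph: str, surface: str, mention_type: MentionType) -> Mention:
    """Locate the first occurrence of `surface` and wrap it as a Mention."""
    start = paragraph.index(surface)  # raises ValueError if the span is absent
    return Mention(start, start + len(surface), surface, mention_type)


# Invented sample sentence, for illustration only.
paragraph = ("He painted a large panel for the monks of Vallombrosa, "
             "and that work was much praised.")
implicit = make_mention(paragraph, "a large panel for the monks of Vallombrosa",
                        MentionType.IMPLICIT)
coref = make_mention(paragraph, "that work", MentionType.COREFERENT)
```

Storing offsets alongside the surface text makes it cheap to verify that an annotation still matches its source paragraph.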
Mention types
Explicit
The artwork is named directly by its title, traditional name, or unambiguous identifier.
"the Maestà", "S. Maria del Fiore"
Implicit
The artwork is described without being named — its identity must be inferred
from context, location, subject matter, or attribution.
"a large panel for the monks of Vallombrosa"
Coreferent
An anaphoric or cataphoric reference to another mention of the same artwork, earlier
or later in the text: pronouns, demonstratives, definite descriptions.
"that work", "it", "which"
Generic
A non-specific reference to artworks in general, not identifiable as a particular
entity.
"works that were passing good"
Pipeline stages
Stage 1
Dynamic chunking
Paragraphs are segmented into context-aware chunks that preserve
coreference chains and discourse structure.
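The text does not specify how chunks are built, so the following is only one plausible sketch: sentences are packed greedily up to a soft character limit, and the last sentence or two of each chunk is repeated at the start of the next so that a pronoun and its antecedent are less likely to be separated. The function name, the regex-based sentence splitter, and both parameters are assumptions.

```python
import re
from typing import List


def chunk_paragraph(paragraph: str, max_chars: int = 200, overlap: int = 1) -> List[str]:
    """Pack sentences into chunks, closing a chunk once the next sentence
    would push it past `max_chars` (a soft limit: a single long sentence or
    the carried-over overlap can still exceed it). The last `overlap`
    sentences of each closed chunk are repeated at the start of the next."""
    # Naive sentence split on ., ! or ? followed by whitespace (an assumption).
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", paragraph.strip()) if s]
    chunks: List[str] = []
    current: List[str] = []
    for sentence in sentences:
        candidate = " ".join(current + [sentence])
        if current and len(candidate) > max_chars:
            chunks.append(" ".join(current))
            # Carry context forward so coreference chains stay visible.
            current = current[-overlap:] if overlap > 0 else []
        current.append(sentence)
    if current:
        chunks.append(" ".join(current))
    return chunks


chunks = chunk_paragraph(
    "A painted a panel. It was praised. B saw it later.", max_chars=40, overlap=1
)
```

With these inputs the middle sentence appears in both chunks, which is the point of the overlap.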
Stage 2
Mention extraction
Zero-shot and few-shot LLM prompting with task definitions grounded
in the Viewsari ontology to extract typed mentions.
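As an illustration of the zero-shot versus few-shot distinction, the sketch below assembles a prompt from a task definition, optional worked examples, and the chunk to annotate. The definition wording is paraphrased from the mention types above rather than taken from the Viewsari ontology, and the output format and function name are assumptions. No model call is shown.

```python
from typing import Sequence, Tuple

# Paraphrased task definition (an assumption, not the ontology's own text).
TASK_DEFINITION = """Extract every artwork mention from the text.
Label each mention with one type:
- explicit: named by title, traditional name, or unambiguous identifier
- implicit: described without being named
- coreferent: pronoun, demonstrative, or definite description referring to another mention
- generic: non-specific reference to artworks in general
Return one mention per line as: type<TAB>exact text span"""


def build_prompt(chunk: str, examples: Sequence[Tuple[str, str]] = ()) -> str:
    """Build a zero-shot prompt (no examples) or a few-shot prompt
    (one block per (text, annotated output) pair)."""
    parts = [TASK_DEFINITION]
    for i, (text, annotated) in enumerate(examples, start=1):
        parts.append(f"Example {i}\nText: {text}\nMentions:\n{annotated}")
    parts.append(f"Text: {chunk}\nMentions:")
    return "\n\n".join(parts)


zero_shot = build_prompt("He made it for the monks.")
few_shot = build_prompt(
    "He made it for the monks.",
    examples=[("He painted the Maestà.", "explicit\tthe Maestà")],
)
```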
Stage 3
Entity linking
Extracted mentions are linked to Wikidata entities or, when no matching
entity exists, flagged as out-of-knowledge-base (OOKB).
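The link-or-flag decision can be sketched as a lookup with a fallback. Everything concrete here is hypothetical: the namespace URL, the placeholder Wikidata identifier, the exact-match lookup, and the hash-based identifier scheme are stand-ins, and the actual pipeline may select candidates quite differently.

```python
import hashlib
from typing import Dict, Tuple

# Hypothetical namespace for internally minted IRIs.
VIEWSARI_NS = "https://example.org/viewsari/artwork/"


def normalise(surface: str) -> str:
    """Lowercase and collapse whitespace so trivially different forms match."""
    return " ".join(surface.lower().split())


def link_mention(surface: str, kb_index: Dict[str, str]) -> Tuple[str, bool]:
    """Return (iri, is_ookb). A mention found in the index gets its
    knowledge-base IRI; otherwise a deterministic internal IRI is minted
    and the mention is flagged as out-of-knowledge-base."""
    key = normalise(surface)
    if key in kb_index:
        return kb_index[key], False
    digest = hashlib.sha1(key.encode("utf-8")).hexdigest()[:12]
    return VIEWSARI_NS + digest, True


# Toy index; the identifier is a placeholder, not a real Wikidata QID.
kb_index = {"the maestà": "http://www.wikidata.org/entity/Q_PLACEHOLDER"}

known_iri, known_ookb = link_mention("The  Maestà", kb_index)
new_iri, new_ookb = link_mention("a large panel for the monks of Vallombrosa", kb_index)
```

Deriving the internal IRI from the normalised surface form keeps repeated runs stable, though a real system would need a better identity criterion than string equality.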
Stage 4
Coreference clustering
Coreferent mentions are resolved into chains and linked to their
antecedent entity mentions.
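A simple baseline for this step attaches each coreferent mention to the nearest preceding explicit or implicit mention. This heuristic is shown only to make the input and output concrete; it is not claimed to be the method the pipeline uses, and the dictionary keys are assumptions.

```python
from typing import Dict, List, Optional


def resolve_coreference(mentions: List[dict]) -> Dict[int, Optional[int]]:
    """Map the list index of each coreferent mention to the list index of
    its antecedent: the closest earlier explicit or implicit mention.
    Generic mentions are ignored. A coreferent mention with no earlier
    candidate (for example a cataphor) maps to None."""
    links: Dict[int, Optional[int]] = {}
    last_entity: Optional[int] = None
    # Walk mentions in text order, remembering the most recent entity mention.
    for index, mention in sorted(enumerate(mentions), key=lambda pair: pair[1]["start"]):
        if mention["type"] in ("explicit", "implicit"):
            last_entity = index
        elif mention["type"] == "coreferent":
            links[index] = last_entity
    return links


mentions = [
    {"start": 11, "type": "implicit"},    # "a large panel for the monks ..."
    {"start": 59, "type": "coreferent"},  # "that work"
    {"start": 90, "type": "generic"},     # "works that were passing good"
    {"start": 130, "type": "coreferent"}, # "it"
]
links = resolve_coreference(mentions)
```

Here both coreferent mentions resolve to the implicit mention at index 0, forming a single chain.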
Technical details
Grounding: Prompt engineering uses formal task definitions from the
Viewsari ontology, ensuring consistent annotation across the corpus.
OOKB handling: Artworks not found in Wikidata receive Viewsari-internal
IRIs and are flagged for future knowledge base expansion.
Provenance: Every extraction and linking step is recorded as a
PROV-O activity in the knowledge graph, enabling full traceability from annotation
to source paragraph to bibliographic context.
Evaluation: The pipeline is evaluated against a manually annotated
gold-standard corpus covering 16 biographies across 6 volumes.
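The text does not state which metrics are reported. A common choice for span annotation is exact-match precision, recall, and F1, sketched below on the assumption that a predicted mention counts as correct only when its offsets and type both match a gold mention.

```python
from typing import Set, Tuple

Span = Tuple[int, int, str]  # (start offset, end offset, mention type)


def span_prf(gold: Set[Span], predicted: Set[Span]) -> Tuple[float, float, float]:
    """Exact-match precision, recall, and F1 over typed spans."""
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


gold = {(0, 5, "explicit"), (10, 20, "implicit")}
predicted = {(0, 5, "explicit"), (30, 35, "generic")}
precision, recall, f1 = span_prf(gold, predicted)
```

Exact matching is strict for long descriptive spans, so a relaxed overlap-based variant may be worth reporting alongside it.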
Try it
Open interactive demo →