TripleModel project plan

This document is the strategic plan for TripleModel (PyPI package triplemodel). ROADMAP.md tracks releases and pyoxigraph coverage (including SM-* SparqlModel integration milestones); ECOSYSTEM.md defines boundaries with SparqlModel. SparqlModel maintainers should copy ECOSYSTEM_SPARQLMODEL.md into that repo.


Current status (0.12.0)

Release-ready (beta) on main: PyPI target triplemodel==0.12.0 — additive SparqlModel 0.13 mapping parity (MultiLangString, TypedLiteral, OntologyRegistry, BackPopulates); prior breaking releases documented in docs/MIGRATION_0.10.md and docs/MIGRATION_0.11.md (see TripleModel roadmap).

Next focus: 1.0.0 — governance, security hardening for parse_url, production semver (see TripleModel roadmap).


Mission

TripleModel is the shared typed Pydantic ↔ RDF mapping library for the ecosystem: correct triples from typed models and pyoxigraph-backed graph I/O (file interchange from 0.4) — without application session or query machinery.

Not the mission: ORM-style persistence, Python-to-SPARQL compilers, HTTP stores, or web frameworks. That is SparqlModel.


Stack and dependency rule

sparqlmodel  →  triplemodel  →  pyoxigraph, pydantic
                  ↑
            (never imports sparqlmodel)

| Layer | Package | Stateful? |
| --- | --- | --- |
| Application ORM | sparqlmodel | Yes (SPARQLSession) |
| Mapping / I/O | triplemodel | No (explicit Store in/out) |
| RDF engine | pyoxigraph | Varies |
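
The "never imports sparqlmodel" rule can be enforced mechanically. The following is a minimal sketch, assuming a CI check that scans module source with the standard-library ast module; forbidden_imports is a hypothetical helper name and is not part of triplemodel.

```python
import ast

# Upstream package that the mapping layer must never import (per the stack rule).
FORBIDDEN = {"sparqlmodel"}


def forbidden_imports(source: str, forbidden: set[str] = FORBIDDEN) -> list[str]:
    """Return the forbidden top-level packages imported by a module's source."""
    found = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
            # level == 0 means an absolute import; relative imports stay in-package.
            names = [node.module]
        else:
            continue
        for name in names:
            root = name.split(".")[0]
            if root in forbidden:
                found.append(root)
    return found


# Illustrative module sources (strings only; nothing is actually imported).
ok_module = "import pyoxigraph\nfrom pydantic import BaseModel\n"
bad_module = "from sparqlmodel.session import SPARQLSession\n"

print(forbidden_imports(ok_module))   # []
print(forbidden_imports(bad_module))  # ['sparqlmodel']
```

In a real repository the same function could be run over every file under the package directory from a test, so a violation fails CI rather than surfacing as a circular import downstream.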


Design principles

  1. Library-first — usable from ETL, tests, and SparqlModel without a global session.

  2. Orchestrate pyoxigraph — do not reimplement parsers, stores, or SPARQL engines.

  3. One mapping implementation — term conversion and subject-IRI rules live here once; downstream packages must not fork them.

  4. Explicit over magic: to_graph / from_graph behavior is documented; merge and null semantics are testable.

  5. Optional heaviness — JSON-LD context extras are install extras, not core deps (SHACL removed in 0.11).

  6. Stable mapping before ORM sugar — prioritize releases that unblock SparqlModel’s triplemodel dependency over duplicating SparqlModel features in TripleModel.
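
Principles 1 and 4 imply that a sync can be expressed as a pure function over triple sets, with no session involved. The sketch below illustrates that idea under stated assumptions: triples are plain (subject, predicate, object) string tuples, plan_sync and apply_sync are hypothetical names rather than triplemodel API, and "a field set to None drops its old triple" is one possible null policy chosen only for illustration.

```python
Triple = tuple[str, str, str]


def plan_sync(current: set[Triple], desired: set[Triple]) -> tuple[set[Triple], set[Triple]]:
    """Return (to_add, to_remove) that turn the current triples into the desired ones."""
    return desired - current, current - desired


def apply_sync(store: set[Triple], to_add: set[Triple], to_remove: set[Triple]) -> set[Triple]:
    """Return a new triple set; the input store is left untouched."""
    return (store - to_remove) | to_add


S = "http://example.org/person/alice"
NAME = "http://example.org/vocab/name"
EMAIL = "http://example.org/vocab/email"

# Triples currently stored for the subject.
current = {(S, NAME, '"Alice"'), (S, EMAIL, '"alice@example.org"')}
# Triples produced by the updated model: name changed, email set to None (omitted).
desired = {(S, NAME, '"Alice B."')}

to_add, to_remove = plan_sync(current, desired)
result = apply_sync(current, to_add, to_remove)
```

Because both functions are pure, the merge and null behavior can be pinned down with ordinary set assertions in a unit test.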


Core dependencies

| Package | Role |
| --- | --- |
| pydantic | Model validation and field metadata |
| pyoxigraph | Store, parse/serialize, SPARQL query/update |
| typing-extensions | Self and typing on Python 3.10 |

Runtime core stays pydantic + pyoxigraph + typing-extensions (no rdflib).


What TripleModel builds (in scope)

  • Field ↔ predicate mapping (rdf_field, Predicate, future CURIE/Rdf.prefixes)

  • Subject identity (namespace + id, percent-encoding, safe import)

  • Term conversion (XSD, lang tags, custom datatypes; 0.4.1: gYear / partial dates)

  • Stateless graph I/O and sync (add / remove / merge policies)

  • Document formats via pyoxigraph (parse / serialize)

  • Linked-data ergonomics (0.4.1): one-parse multi-class load, Rdf.instance_of for non-rdf:type vocabularies, predicate-URI validation, URI FK → nested model hydration

  • Named graphs (Dataset) where models need contexts

  • Thin SPARQL passthrough (graph.query, optional helpers) — not a Python query DSL

  • Vocabulary helpers (triplemodel.vocab)

  • Stable mapping API for SparqlModel to prototype against from 0.2 (SM-1); semver pin at 0.9–1.0 (SM-5)
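
To make the subject-identity item concrete, here is a minimal sketch of minting a subject IRI from a namespace and an id with percent-encoding, and of recovering the id on import. It assumes Python's urllib.parse for the encoding; subject_iri and local_id_from_iri are hypothetical names and do not describe how triplemodel implements this.

```python
from urllib.parse import quote, unquote


def subject_iri(namespace: str, local_id: str) -> str:
    """Mint a subject IRI by percent-encoding the local id under a namespace."""
    # safe="" also encodes "/", so an id cannot add path segments of its own.
    return namespace + quote(local_id, safe="")


def local_id_from_iri(namespace: str, iri: str) -> str:
    """Recover the local id on import, rejecting IRIs outside the namespace."""
    if not iri.startswith(namespace):
        raise ValueError(f"{iri!r} is not in namespace {namespace!r}")
    return unquote(iri[len(namespace):])


NS = "http://example.org/person/"
iri = subject_iri(NS, "Ada Lovelace/1815")
print(iri)  # http://example.org/person/Ada%20Lovelace%2F1815
print(local_id_from_iri(NS, iri))  # Ada Lovelace/1815
```

The round trip (encode on export, decode on import) is the symmetry that the correctness priority later in this plan asks for.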


What TripleModel does not build (out of scope)

See also ROADMAP.md § Explicitly out of scope.

| Area | Owner |
| --- | --- |
| SPARQLSession, put/delete cascade, orphan cleanup | SparqlModel |
| Python Model.field == x query DSL | SparqlModel |
| SPARQL compiler (WHERE generation from expressions) | SparqlModel |
| Hydration depth and relationship loading policy | SparqlModel |
| HttpStore, FastAPI, identity map | SparqlModel |
| Full OWL reasoning, path algebra, HTML scraping | Other tools |

TripleModel may add select_models-style helpers in 0.6 for users who want SPARQL without SparqlModel; SparqlModel remains the home for ergonomic app queries.


Real-world integration lessons (0.4 evaluation)

Exercises in examples/realworld/ showed where TripleModel is already Pythonic (typed fields, parse_file, set keywords, models_to_graph, prefixes) and where setup cost dominates (Wikidata wdt:P31 vs rdf:type, repeated parse_file per class, flat URI foreign keys, XSD gYear, predicate URI mistakes).

| Lesson | User pain today | Planned response | Release |
| --- | --- | --- | --- |
| One file, many classes | Nobel/DCAT need three parse_file calls on the same TTL | load_models(graph, *classes) | 0.4.1 (done) |
| Wikidata typing | Manual QID lists + from_graph per subject | Rdf.instance_of + discovery | 0.4.1 (done) |
| Cross-resource links | country: str + hand-built dict | ref_field hydration | 0.4.1 (done) |
| Partial dates | foundingDate forced to str | XSD gYear in literal registry | 0.4.1 (done) |
| Mapping footguns | RDFS_LABEL = namespace base breaks import | Class-definition validation on predicate IRIs | 0.4.1 (done) |
| Object graphs in Python | Laureate and Prize are disconnected models | Nested embed + cookbook (inverse optional) | 0.4.1 docs; richer 0.7 CBD |
| Dogfooding | Examples still teach workarounds after APIs ship | Refactor examples/realworld/ (+ snippets) per feature; extend CI tests | 0.4.1 (done) |
| Live endpoint slices | CONSTRUCT refresh script is ad hoc | construct_models + documented refresh recipe | 0.6 |
| App-level joins | Country labels need manual joins | hydrate_refs / batch load from graph | 0.7 |

Documentation (no semver bump): Promote real-world patterns into the cookbook; keep examples/realworld/DATA_SOURCES.md as the provenance index.
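
Two of the lessons above ("One file, many classes" and "Wikidata typing") come down to the same operation: grouping subjects by a typing predicate in a single pass over an already-parsed graph. The sketch below illustrates that operation under stated assumptions: triples are plain string tuples, the example.org IRIs are made up, and subjects_by_class is a hypothetical helper, not the load_models or Rdf.instance_of implementation.

```python
from collections import defaultdict

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
WDT_P31 = "http://www.wikidata.org/prop/direct/P31"  # Wikidata "instance of"


def subjects_by_class(triples, typing_predicates=(RDF_TYPE,)):
    """Group subjects by class IRI in one pass over (s, p, o) triples."""
    grouped = defaultdict(set)
    for subject, predicate, obj in triples:
        if predicate in typing_predicates:
            grouped[obj].add(subject)
    return dict(grouped)


EX = "http://example.org/"
WD = "http://www.wikidata.org/entity/"
triples = [
    (EX + "laureate/1", RDF_TYPE, EX + "Laureate"),
    (EX + "prize/1", RDF_TYPE, EX + "Prize"),
    (EX + "laureate/2", RDF_TYPE, EX + "Laureate"),
    (EX + "laureate/1", EX + "won", EX + "prize/1"),
    # Wikidata types entities with wdt:P31 rather than rdf:type.
    (WD + "Q90", WDT_P31, WD + "Q515"),
]

# Default: only rdf:type counts, so the Wikidata entity is not discovered.
by_type = subjects_by_class(triples)
# Opting in to wdt:P31 discovers it without a hard-coded QID list.
by_type_or_p31 = subjects_by_class(triples, typing_predicates=(RDF_TYPE, WDT_P31))
```

A multi-class loader could then hydrate each requested class from its group of subjects, so the file is parsed once regardless of how many classes are requested.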


SparqlModel integration strategy

SparqlModel already pins triplemodel>=0.9,<2 on PyPI. SparqlModel 0.3 wired session I/O through an interim _triple.py dynamic adapter. SparqlModel 0.4 (Option A) adopts SPARQLModel(TripleModel): one class, direct sync_to_graph / from_graph, and the adapter is deleted.
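
As a quick illustration of what the >=0.9,<2 pin admits, the following sketch compares plain numeric release versions as tuples. It is a simplification under stated assumptions: satisfies_pin is a hypothetical helper, and pre-release or post-release suffixes are deliberately not handled (a real check would use a full version-specifier parser).

```python
def parse_version(text: str) -> tuple[int, ...]:
    """Parse a purely numeric dotted version such as '0.12.0'."""
    return tuple(int(part) for part in text.split("."))


def satisfies_pin(version: str, lower: str = "0.9", upper: str = "2") -> bool:
    """True if lower <= version < upper, comparing numeric release segments."""
    parsed = parse_version(version)
    return parse_version(lower) <= parsed < parse_version(upper)


for candidate in ("0.8.5", "0.9", "0.12.0", "1.0.0", "2.0.0"):
    print(candidate, satisfies_pin(candidate))
```

Under this pin the current 0.12.0 release and a future 1.0.0 are both accepted, while any 2.x release is excluded until SparqlModel widens the range.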

Integration gates (historical + next)

| Milestone | triplemodel / SparqlModel | Outcome |
| --- | --- | --- |
| SM-1–SM-5 | TripleModel 0.2–0.9 shipped | Mapping APIs available; SparqlModel pins >=0.9,<2 |
| SM-6 | SparqlModel 0.4 (Option A) | SPARQLModel subclasses TripleModel; remove _triple.py |
| SparqlModel 0.5+ | Async, file I/O, query production | ORM-only; see SparqlModel ROADMAP |

API convergence (canonical: Option A)

| SparqlModel (public) | TripleModel (implementation) |
| --- | --- |
| SPARQLModel(TripleModel) | Same instances call sync_to_graph, from_graph, to_graph |
| Field("schema:name") | rdf_field / Predicate at class creation |
| __prefixes__ / rdf_type | nested class Rdf (prefixes, type_uri, embed, IriId) |
| id: IRI | IriId / explicit IRI id field |
| session.put | sync_to_graph + SparqlModel cascade (orchestration in graph.py) |

Integrator requirements (SM-6 / SparqlModel 0.4)

TripleModel must support subclassing without breaking:

  • register_rdf_resource on SPARQLModel subclasses

  • Nested embed='iri' for composition (SparqlModel cascade policy wraps this)

  • IriId for explicit id: IRI fields

  • Stable sync_to_graph / from_graph on the subclass instance

Contract tests (future)

  • Cross-repo or published-wheel tests: SparqlModel put triple set equals TripleModel sync + cascade rules.

  • TripleModel owns literal/subject bugs; SparqlModel owns compiler/session bugs.


Release philosophy

| Phase | Versions | Goal |
| --- | --- | --- |
| Foundation | 0.1.x | Flat round-trip, CI, typing, docs |
| Model-complete | 0.2–0.3 | Fields, sync, namespaces, literals, blanks, lists (SparqlModel gate) |
| Document I/O | 0.4 | Files (SHACL removed in 0.11) |
| Real-world ergonomics | 0.4.1 | Multi-class load, Wikidata typing, XSD dates, mapping validation |
| Graph contexts | 0.5 | Dataset / TriG |
| Query passthrough | 0.6 | SPARQL helpers on Store (not ORM) |
| Algorithms | 0.7 | CBD, isomorphism, RDFS import helpers |
| Scale | 0.8 | Store extras, batch import |
| Freeze | 0.9 | Matrix audit, API stable for downstream |
| Production | 1.0 | Governance, security docs, no new surface |

Patch releases: bugfixes only. Minors: features. Majors: breaking API after 1.0.


Priority order (when trade-offs arise)

  1. Correctness — subject IRIs, literals, import/export symmetry.

  2. SparqlModel gate items — sync/remove (0.2), namespaces (0.2), nested models (0.2).

  3. Real-world ergonomics (0.4.1) — multi-class load, property-based typing, mapping validation, partial XSD dates (unblocks LOD examples without Dataset).

  4. pyoxigraph matrix — per ROADMAP.md (0.5+ Dataset, 0.6 SPARQL passthrough).

  5. Ergonomic extras — codegen, hydrate_refs, advanced SPARQL helpers.

  6. Never — session/query compiler in TripleModel core.


Documentation map

| Document | Audience |
| --- | --- |
| README on GitHub | Library users |
| ROADMAP.md | Releases, pyoxigraph matrix |
| PLAN.md | Strategy (this file) |
| ECOSYSTEM.md | triplemodel ↔ SparqlModel boundaries |
| ECOSYSTEM_SPARQLMODEL.md | Copy into SparqlModel repo |


Success metrics

  • 0.2: SparqlModel can prototype triplemodel for model_to_graph / load without losing put semantics.

  • 0.4: Load/save Turtle/JSON-LD without SparqlModel-only parsers.

  • 0.4.1: Nobel + DCAT examples use a single graph load; Wikidata capitals avoid hard-coded QID loops; Schema.org gYear imports without str workarounds; invalid rdf_predicate fails at class definition; in-repo examples updated to match each shipped API (examples/realworld/, relevant snippets, test_realworld_examples.py).

  • 0.9: SparqlModel pins released triplemodel (shipped).

  • SM-6 / SparqlModel 0.4: SPARQLModel(TripleModel); interim adapter removed.

  • 1.0: Downstream apps choose triplemodel for pipelines and sparqlmodel for apps — clear docs, no overlap confusion.