RDF Canonicalization
Deterministic serialization and semantic equality testing.
Sign, cache, and compare RDF graphs reliably.
The Problem
RDF graphs with blank nodes can be serialized in countless different ways while representing the exact same information.
Document 1
_:alice <foaf:name> "Alice" .
_:alice <foaf:knows> _:bob .
_:bob <foaf:name> "Bob" .
Document 2
_:person1 <foaf:name> "Alice" .
_:person1 <foaf:knows> _:person2 .
_:person2 <foaf:name> "Bob" .
Same information, different blank node labels. String comparison fails. Object equality fails. How do you:
- Digitally sign RDF data?
- Use RDF graphs as cache keys?
- Detect if two graphs are semantically identical?
- Synchronize RDF data reliably?
The Solution
RDF Canonicalization transforms any RDF graph into a single, deterministic representation.
Both Documents Produce Identical Canonical Form
_:c14n0 <foaf:name> "Alice" .
_:c14n0 <foaf:knows> _:c14n1 .
_:c14n1 <foaf:name> "Bob" .
- Deterministic blank node labels
- Consistent ordering
- Reliable for signatures and comparison
Getting Started
Install the package
dart pub add locorda_rdf_canonicalization
dart pub add locorda_rdf_core # For creating RDF graphs

import 'package:locorda_rdf_canonicalization/canonicalization.dart';
import 'package:locorda_rdf_core/core.dart';

void main() {
  // Two Turtle/N-Triples documents with identical semantic content
  // but different blank node labels
  final turtle1 = '''
_:alice <http://xmlns.com/foaf/0.1/name> "Alice" .
_:alice <http://xmlns.com/foaf/0.1/knows> _:bob .
_:bob <http://xmlns.com/foaf/0.1/name> "Bob" .
''';
  final turtle2 = '''
_:person1 <http://xmlns.com/foaf/0.1/name> "Alice" .
_:person1 <http://xmlns.com/foaf/0.1/knows> _:person2 .
_:person2 <http://xmlns.com/foaf/0.1/name> "Bob" .
''';

  // Parse both documents
  final graph1 = turtle.decode(turtle1);
  final graph2 = turtle.decode(turtle2);

  // They are different as strings and objects
  print('Strings identical: ${turtle1 == turtle2}'); // false
  print('Objects equal: ${graph1 == graph2}'); // false

  // But they are semantically equivalent (isomorphic)
  print('Isomorphic: ${isIsomorphicGraphs(graph1, graph2)}'); // true

  // Canonicalization produces identical output
  final canonical1 = canonicalizeGraph(graph1);
  final canonical2 = canonicalizeGraph(graph2);
  print('Canonical identical: ${canonical1 == canonical2}'); // true
}

import 'package:locorda_rdf_canonicalization/canonicalization.dart';
import 'package:locorda_rdf_core/core.dart';

void main() {
  // RDF Canonicalization is formally defined for RDF datasets, so this
  // is the equivalent example using N-Quads and datasets
  final nquads1 = '''
_:alice <http://xmlns.com/foaf/0.1/name> "Alice" .
_:alice <http://xmlns.com/foaf/0.1/knows> _:bob .
_:bob <http://xmlns.com/foaf/0.1/name> "Bob" .
_:alice <http://xmlns.com/foaf/0.1/age> "30" <http://example.org/graph1> .
''';
  final nquads2 = '''
_:person1 <http://xmlns.com/foaf/0.1/name> "Alice" .
_:person1 <http://xmlns.com/foaf/0.1/knows> _:person2 .
_:person2 <http://xmlns.com/foaf/0.1/name> "Bob" .
_:person1 <http://xmlns.com/foaf/0.1/age> "30" <http://example.org/graph1> .
''';

  // Parse both documents
  final dataset1 = nquads.decode(nquads1);
  final dataset2 = nquads.decode(nquads2);

  // They are different as strings and objects
  print('Strings identical: ${nquads1 == nquads2}'); // false
  print('Objects equal: ${dataset1 == dataset2}'); // false

  // But they are semantically equivalent (isomorphic)
  print('Isomorphic: ${isIsomorphic(dataset1, dataset2)}'); // true

  // Canonicalization produces identical output
  final canonical1 = canonicalize(dataset1);
  final canonical2 = canonicalize(dataset2);
  print('Canonical identical: ${canonical1 == canonical2}'); // true
}

import 'package:locorda_rdf_canonicalization/canonicalization.dart';
import 'package:locorda_rdf_core/core.dart';

void main() {
  // Canonicalization involves cryptographic operations, so it's
  // important to pre-compute canonical forms for efficient comparison
  // and avoid repeated work.
  final graphs = <RdfGraph>[];

  // Create multiple graphs for comparison
  for (int i = 0; i < 100; i++) {
    final graph = RdfGraph(triples: [
      Triple(BlankNodeTerm(), const IriTerm('http://example.org/id'),
          LiteralTerm.string('$i')),
    ]);
    graphs.add(graph);
  }

  // Wrap into CanonicalRdfGraph. A CanonicalRdfGraph will lazily compute the
  // canonical form on first access and cache it. Subsequent accesses to
  // the canonical form will be O(1).
  final canonicalGraphs = graphs.map((g) => CanonicalRdfGraph(g)).toList();

  // Now comparisons are O(1) string comparisons
  // instead of expensive graph isomorphism
  for (int i = 0; i < canonicalGraphs.length; i++) {
    for (int j = i + 1; j < canonicalGraphs.length; j++) {
      if (canonicalGraphs[i] == canonicalGraphs[j]) {
        print('Graphs $i and $j are isomorphic');
      }
    }
  }
}

Use Cases
Digital Signatures
Sign RDF data reliably. Canonical forms ensure the same graph always produces the same signature, regardless of serialization order or blank node labels.
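As a rough illustration of the signing workflow, the sketch below hashes the canonical N-Quads string so that the digest, rather than an arbitrary serialization, becomes the input to whatever signature scheme you use. It assumes package:crypto has been added as an extra dependency, and canonicalDigest is a hypothetical helper name, not part of the package. It computes a digest only; the actual signing step is left out.

```dart
import 'dart:convert';

import 'package:crypto/crypto.dart';
import 'package:locorda_rdf_canonicalization/canonicalization.dart';
import 'package:locorda_rdf_core/core.dart';

/// Hypothetical helper (not part of the package): returns the hex-encoded
/// SHA-256 digest of the graph's canonical N-Quads form.
String canonicalDigest(RdfGraph graph) {
  final canonical = canonicalizeGraph(graph);
  return sha256.convert(utf8.encode(canonical)).toString();
}

void main() {
  final a = turtle.decode('''
_:alice <http://xmlns.com/foaf/0.1/name> "Alice" .
''');
  final b = turtle.decode('''
_:person1 <http://xmlns.com/foaf/0.1/name> "Alice" .
''');

  // Both digests are computed over the canonical form, so the blank node
  // labels used in the input documents should not influence the result.
  print('Digests match: ${canonicalDigest(a) == canonicalDigest(b)}');
}
```

Hashing the canonical string keeps the signature independent of how the graph happened to be serialized when it was signed or verified.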
Caching & Deduplication
Use canonical forms as consistent cache keys. Detect duplicate graphs even when they're serialized differently.
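One way this could look in practice is a small map keyed by the canonical string, so that isomorphic graphs share a single entry. GraphCache below is a hypothetical class written for this example, not something the package provides; it relies only on canonicalizeGraph returning an N-Quads string as described in this document.

```dart
import 'package:locorda_rdf_canonicalization/canonicalization.dart';
import 'package:locorda_rdf_core/core.dart';

/// Hypothetical cache (not part of the package) that treats isomorphic
/// graphs as the same key by using their canonical N-Quads string.
class GraphCache<V> {
  final _entries = <String, V>{};

  /// Returns the cached value for [graph], calling [compute] on a miss.
  V getOrCompute(RdfGraph graph, V Function(RdfGraph) compute) {
    final key = canonicalizeGraph(graph);
    return _entries.putIfAbsent(key, () => compute(graph));
  }

  /// Number of distinct (non-isomorphic) graphs seen so far.
  int get length => _entries.length;
}

void main() {
  final cache = GraphCache<int>();
  var computations = 0;

  final a = turtle.decode('_:a <http://example.org/p> "x" .');
  final b = turtle.decode('_:b <http://example.org/p> "x" .');

  // The second lookup uses a differently labeled but isomorphic graph,
  // so it is expected to reuse the entry created by the first lookup.
  cache.getOrCompute(a, (_) => ++computations);
  cache.getOrCompute(b, (_) => ++computations);

  print('Entries: ${cache.length}, computations: $computations');
}
```

The same keying approach can serve for deduplication: collect canonical strings in a Set and skip any graph whose canonical form is already present.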
Data Synchronization
Detect changes in RDF datasets reliably. Compare canonical forms to know if data has actually changed, not just been re-serialized.
Graph Comparison
Test semantic equality between different RDF representations. Isomorphism testing made simple and efficient.
Compliance
Meet requirements for deterministic RDF serialization. W3C standards-compliant implementation.
Testing
Write reliable RDF tests. Compare expected and actual graphs semantically, not syntactically.
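A unit test along these lines might look like the following sketch. It assumes package:test is available as a dev dependency, and the "actual" graph is a stand-in for whatever your code under test produces.

```dart
import 'package:locorda_rdf_canonicalization/canonicalization.dart';
import 'package:locorda_rdf_core/core.dart';
import 'package:test/test.dart';

void main() {
  test('output graph matches the expected graph', () {
    final expected = turtle.decode('''
_:n <http://xmlns.com/foaf/0.1/name> "Alice" .
''');

    // Stand-in for the graph produced by the code under test; note the
    // different blank node label.
    final actual = turtle.decode('''
_:generated42 <http://xmlns.com/foaf/0.1/name> "Alice" .
''');

    // Compare semantically instead of comparing serialized strings.
    expect(isIsomorphicGraphs(actual, expected), isTrue);
  });
}
```

Comparing with isIsomorphicGraphs keeps the test stable when the code under test changes how it names blank nodes or orders triples.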
Key Features
Graph & Dataset Support
Canonicalize both simple RDF graphs and complex datasets with named graphs. Full support for quads and multiple graph contexts.
Deterministic Blank Nodes
Hash-based blank node labeling algorithm ensures the same graph structure always gets the same blank node identifiers.
Optimized Performance
Efficient algorithms with O(1) equality comparison when using cached canonical forms. Perfect for large-scale graph comparison.
Configurable Hashing
Choose between SHA-256 (faster) and SHA-384 (more secure) based on your needs. Custom blank node prefixes supported.
Standards Compliant
Implements the W3C RDF Dataset Canonicalization specification. Interoperable with other compliant implementations.
Seamless Integration
Works perfectly with locorda_rdf_core. Parse any RDF format, canonicalize, and serialize back to any format.
Core API
canonicalize()
Canonicalize an RdfDataset to an N-Quads string
final canonical = canonicalize(dataset);
canonicalizeGraph()
Canonicalize an RdfGraph to an N-Quads string
final canonical = canonicalizeGraph(graph);
isIsomorphic()
Test if two RdfDatasets are semantically equivalent
if (isIsomorphic(dataset1, dataset2)) {
  print('Semantically identical!');
}
CanonicalRdfGraph
Cached canonical representation for efficient comparison
final canonical1 = CanonicalRdfGraph(graph1);
final canonical2 = CanonicalRdfGraph(graph2);
// O(1) equality comparison
if (canonical1 == canonical2) { ... }
Ready to Canonicalize?
Start using deterministic RDF serialization in your Dart projects.