Add automatic substructure for dense nodes (fixes #614) #4532

Open
ADunfield wants to merge 1 commit into neo4j-contrib:5.26 from ADunfield:feature/614-dense-node-substructure

@ADunfield

Summary

Implements #614: automatic multi-level B-tree substructure for dense (supernode) nodes with millions of relationships. Distributes relationships across intermediate __DenseBucket nodes to reduce both read fan-out and write lock contention.

This has been an open request since 2017. The implementation went through three rounds of peer review, with systematic fixes between rounds. All Round 3 feedback is addressed: indexed delete (O(buckets_for_source)), streaming analyze, schema transaction safety, and 27 tests.

cc @jexp — would appreciate your eyes on this given the scope.

Architecture

When a node has millions of relationships of a given type, this module distributes them across a configurable N-ary tree of bucket nodes:

[Dense Node]
     |
{_DENSE_META_LIKES}     ← one per type+direction, carries metadata
     |
[Root Bucket]           ← __DenseBucket label
   / | \
{_DENSE_BRANCH_LIKES}
 /   |   \
[Bucket] [Bucket] ...  ← leaf buckets, capacity configurable (default 1000)
 /|\       /|\
{_DENSE_LIKES}          ← prefixed to avoid Cypher collision
/ | \
[target nodes]
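
For a rough sense of how deep this tree gets, the arithmetic can be sketched as below. This is an illustration rather than code from the PR: `DenseTreeMath` and its methods are hypothetical names, and it assumes that "levels" counts bucket layers from root to leaf inclusive, which the description does not define.

```java
/** Hypothetical helper, not part of the PR: estimates the bucket-tree shape. */
final class DenseTreeMath {

    /** Leaf buckets needed for relCount relationships; at least one (the root). */
    static long leafBuckets(long relCount, int bucketCapacity) {
        if (relCount < 0 || bucketCapacity <= 0) {
            throw new IllegalArgumentException("relCount must be >= 0 and bucketCapacity > 0");
        }
        long full = relCount / bucketCapacity;
        long partial = (relCount % bucketCapacity == 0) ? 0 : 1;
        return Math.max(1, full + partial);
    }

    /** Bucket layers (root to leaf inclusive) needed so every leaf is reachable. */
    static int levels(long relCount, int bucketCapacity, int branchFactor) {
        if (branchFactor < 2) {
            throw new IllegalArgumentException("branchFactor must be >= 2");
        }
        long leaves = leafBuckets(relCount, bucketCapacity);
        int levels = 1;
        long reachable = 1; // leaves addressable by a tree with this many layers
        while (reachable < leaves) {
            // Saturate instead of overflowing when another layer exceeds a long.
            reachable = (reachable > Long.MAX_VALUE / branchFactor)
                    ? Long.MAX_VALUE
                    : reachable * branchFactor;
            levels++;
        }
        return levels;
    }

    public static void main(String[] args) {
        // Defaults from the config table: bucketCapacity 1000, branchFactor 100.
        System.out.println(levels(1_000_000L, 1000, 100)); // prints 3
        System.out.println(levels(100_000L, 1000, 100));   // prints 2
    }
}
```

Under these assumptions, 1,000,000 relationships with the default settings need 1,000 leaf buckets and three layers, while a root with a single layer of 100 leaf buckets covers up to 100,000.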

Key Design Decisions

Prefixed relationship types (_DENSE_META_<TYPE>, _DENSE_BRANCH_<TYPE>, _DENSE_<TYPE>): Prevents silent Cypher query breakage. After migration, MATCH (n)-[:LIKES]->(t) returns no rows for a migrated node rather than a silently partial result set, so users must go through the apoc.dense.* API. This was identified as the highest-risk item in Round 1 review.
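
The naming scheme can be sketched as follows. `DenseTypeNames` is a hypothetical helper written for this illustration; the PR keeps its prefixes in DenseConstants.java, whose actual shape is not shown here.

```java
/** Hypothetical helper, not from the PR: builds the prefixed relationship type names. */
final class DenseTypeNames {
    static final String META_PREFIX = "_DENSE_META_";
    static final String BRANCH_PREFIX = "_DENSE_BRANCH_";
    static final String LEAF_PREFIX = "_DENSE_";

    /** Type of the single relationship from the dense node to its root bucket. */
    static String meta(String userType) {
        return META_PREFIX + requireType(userType);
    }

    /** Type of the relationships between bucket layers. */
    static String branch(String userType) {
        return BRANCH_PREFIX + requireType(userType);
    }

    /** Type of the relationships from leaf buckets to the target nodes. */
    static String leaf(String userType) {
        return LEAF_PREFIX + requireType(userType);
    }

    /** True for any relationship type that belongs to the substructure. */
    static boolean isInternal(String relType) {
        // All three prefixes share the "_DENSE_" stem, so one check suffices.
        return relType != null && relType.startsWith(LEAF_PREFIX);
    }

    private static String requireType(String userType) {
        if (userType == null || userType.isEmpty()) {
            throw new IllegalArgumentException("relationship type must be non-empty");
        }
        return userType;
    }

    public static void main(String[] args) {
        System.out.println(meta("LIKES"));   // prints _DENSE_META_LIKES
        System.out.println(branch("LIKES")); // prints _DENSE_BRANCH_LIKES
        System.out.println(leaf("LIKES"));   // prints _DENSE_LIKES
    }
}
```

Because no relationship of the plain type LIKES remains on a migrated node, a direct pattern match on LIKES cannot accidentally pick up part of the substructure.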

Deadlock-safe locking: Uses optimistic double-checked locking on the fast path (lock bucket only, re-check count). On the split path when a bucket is full, acquires locks in strict hierarchical order: meta-relationship → root → branch → leaf. This prevents the A→B / B→A deadlock pattern that would arise from concurrent writers.
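
A minimal in-memory sketch of this locking pattern is shown below. It is not the PR's code: it uses `ReentrantLock` where the real implementation locks Neo4j entities, it collapses the hierarchy to meta → root → leaf (no branch layer), and every class and method name is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantLock;

/** In-memory illustration only; the PR locks Neo4j entities, not ReentrantLocks. */
final class DenseLockingSketch {

    static final class Bucket {
        final ReentrantLock lock = new ReentrantLock();
        volatile int count; // written only under lock, or before the bucket is published
    }

    private final ReentrantLock metaLock = new ReentrantLock(); // stands in for the meta-relationship
    private final ReentrantLock rootLock = new ReentrantLock(); // stands in for the root bucket
    private final List<Bucket> leaves = new ArrayList<>();      // guarded by rootLock
    private final int capacity;
    private volatile Bucket current;

    DenseLockingSketch(int capacity) {
        this.capacity = capacity;
        this.current = new Bucket();
        leaves.add(current);
    }

    void add() {
        // Fast path: lock only the leaf, then re-check the count under the lock.
        Bucket b = current;
        if (b.count < capacity) {
            b.lock.lock();
            try {
                if (b.count < capacity) { b.count++; return; }
            } finally {
                b.lock.unlock();
            }
        }
        // Split path: always meta -> root -> leaf, so writers never lock in opposite order.
        metaLock.lock();
        try {
            rootLock.lock();
            try {
                Bucket target = current; // another writer may already have split
                target.lock.lock();
                try {
                    if (target.count < capacity) {
                        target.count++;
                    } else {
                        Bucket fresh = new Bucket();
                        fresh.count = 1;  // safe: not yet visible to other threads
                        leaves.add(fresh);
                        current = fresh;  // volatile write publishes the new leaf
                    }
                } finally {
                    target.lock.unlock();
                }
            } finally {
                rootLock.unlock();
            }
        } finally {
            metaLock.unlock();
        }
    }

    long total() {
        rootLock.lock();
        try {
            long sum = 0;
            for (Bucket b : leaves) sum += b.count;
            return sum;
        } finally {
            rootLock.unlock();
        }
    }

    int bucketCount() {
        rootLock.lock();
        try { return leaves.size(); } finally { rootLock.unlock(); }
    }

    /** Runs `threads` writers that each add `perThread` entries, then returns the tree. */
    static DenseLockingSketch runConcurrently(int threads, int perThread, int capacity) {
        DenseLockingSketch tree = new DenseLockingSketch(capacity);
        List<Thread> workers = new ArrayList<>();
        for (int t = 0; t < threads; t++) {
            Thread w = new Thread(() -> { for (int i = 0; i < perThread; i++) tree.add(); });
            workers.add(w);
            w.start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return tree;
    }
}
```

A fast-path writer holds only a leaf lock and never waits for another lock while holding it, so it cannot take part in a lock cycle; split-path writers all take their locks in the same order. Calling `DenseLockingSketch.runConcurrently(4, 1000, 100)` and checking `total()` gives a toy version of a concurrent-write integrity check.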

Indexed delete via __dense_target: Each leaf relationship stores the target node's element ID as a property. Combined with the __DenseBucket.__dense_source node index, delete finds the right bucket in O(buckets_for_source) in the worst case, instead of a linear O(total_relationships) scan. The worst case occurs when the same source→target pair has duplicate relationships spread across buckets; for typical unique-target workloads the lookup is effectively O(1) per bucket.
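
The lookup scope can be illustrated with a small in-memory model. This is a sketch under assumptions, not the PR's implementation: a `HashMap` stands in for the `__dense_source` node index, strings stand in for element IDs, and the scan inside each bucket is a plain list search because the description does not spell out how a bucket's relationships are scanned.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** In-memory model, not the PR's code: buckets are found via an index keyed by source. */
final class IndexedDeleteSketch {

    /** A leaf bucket; each entry models one leaf relationship's __dense_target value. */
    static final class Bucket {
        final List<String> denseTargets = new ArrayList<>();
    }

    // Models the __DenseBucket.__dense_source node index.
    private final Map<String, List<Bucket>> bucketsBySource = new HashMap<>();
    private int bucketsVisited; // instrumentation for the example only

    Bucket addBucket(String sourceId) {
        Bucket b = new Bucket();
        bucketsBySource.computeIfAbsent(sourceId, k -> new ArrayList<>()).add(b);
        return b;
    }

    /** Removes one source->target relationship; returns true if one was found. */
    boolean delete(String sourceId, String targetId) {
        // Only buckets belonging to this source are visited, never the whole store.
        for (Bucket b : bucketsBySource.getOrDefault(sourceId, List.of())) {
            bucketsVisited++;
            if (b.denseTargets.remove(targetId)) { // removes the first match only
                return true;
            }
        }
        return false;
    }

    int bucketsVisited() {
        return bucketsVisited;
    }

    public static void main(String[] args) {
        IndexedDeleteSketch index = new IndexedDeleteSketch();
        index.addBucket("source-a").denseTargets.add("target-1");
        index.addBucket("source-b").denseTargets.add("target-1");
        System.out.println(index.delete("source-a", "target-1")); // prints true
        System.out.println(index.bucketsVisited());                // prints 1
    }
}
```

However many buckets other sources own, a delete for one source touches at most that source's buckets, which is the bound the design decision describes.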

Schema transaction safety: Index creation uses a read-only check in a separate transaction first, then DDL in its own committed schema transaction. Catches EquivalentSchemaRuleAlreadyExistsException from concurrent creation gracefully.

Direction.BOTH semantics: Rejected on writes with IllegalArgumentException and clear guidance to call twice. Supported on reads as union (relationships) or sum (degree).
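
These rules can be sketched as follows. The enum below is a local stand-in that mirrors the three values of org.neo4j.graphdb.Direction so the example stays self-contained; `DirectionRulesSketch`, its method names, and the exception message are hypothetical.

```java
/** Sketch only: the nested enum mirrors org.neo4j.graphdb.Direction's three values. */
final class DirectionRulesSketch {

    enum Direction { OUTGOING, INCOMING, BOTH }

    /** Write procedures accept a single direction; BOTH is rejected with guidance. */
    static Direction requireWritable(Direction dir) {
        if (dir == null) {
            throw new IllegalArgumentException("direction must not be null");
        }
        if (dir == Direction.BOTH) {
            throw new IllegalArgumentException(
                    "Direction.BOTH is not supported for writes; "
                            + "call once with OUTGOING and once with INCOMING");
        }
        return dir;
    }

    /** Degree reads treat BOTH as the sum of the two per-direction counts. */
    static long degree(Direction dir, long outgoing, long incoming) {
        switch (dir) {
            case OUTGOING:
                return outgoing;
            case INCOMING:
                return incoming;
            default:
                return Math.addExact(outgoing, incoming);
        }
    }

    public static void main(String[] args) {
        System.out.println(degree(Direction.BOTH, 7, 5)); // prints 12
    }
}
```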

Element ID portability caveat: __dense_target uses Neo4j element IDs, which are stable within a database lifecycle but NOT portable across dump/restore. After restore, the recovery path is flatten() followed by migrate() to rebuild with fresh IDs. Documented in class javadoc.

autoMigrate warning behavior: When autoMigrate: true, the procedure emits a log.warn() to neo4j.log when a node exceeds the dense threshold. It does not silently migrate or throw — operators decide when to run apoc.dense.migrate().

Alternatives Considered

Five storage strategies were evaluated:

| Approach | Write | Delete | Read | Verdict |
| --- | --- | --- | --- | --- |
| B-tree (selected) | O(1) fast path | O(buckets) indexed | O(log n + results) | Best balanced |
| Hash-routed flat buckets | O(1) direct | O(1) hash | O(N buckets) all-scan | Write-heavy only |
| Probabilistic skip-list | O(log n) expected | O(log n) expected | O(log n) expected | Disk-unfriendly |
| Append-only LSM | O(1) append | Tombstone + GC | Degrades pre-compaction | Write-only use case |
| Adaptive branching | Same as B-tree | Same as B-tree | Same as B-tree | Tuning layer, not different structure |

B-tree was selected for balanced read/write performance, clean mapping to Neo4j's node/relationship model, and deterministic worst-case guarantees.

API

Write Procedures (OUTGOING or INCOMING only)

| Procedure | YIELD |
| --- | --- |
| apoc.dense.create.relationship(src, type, tgt, props?, config?) | rel, bucket |
| apoc.dense.create.relationship.incoming(src, type, tgt, props?, config?) | rel, bucket |
| apoc.dense.delete.relationship(src, type, tgt, matchProps?) | removed, remainingCount |
| apoc.dense.migrate(src, type, dir?, config?) | migratedCount, bucketsCreated, levels, migrationComplete |
| apoc.dense.flatten(src, type, dir?, config?) | flattenedCount, bucketsRemoved |

Read Procedures (BOTH supported)

| Procedure/Function | YIELD |
| --- | --- |
| apoc.dense.relationships(src, type, dir?, config?) | rel, node, cursor |
| apoc.dense.degree(src, type, dir?) | Long (O(1) metadata) |
| apoc.dense.analyze(config?) | node, type, direction, degree, alreadyManaged |
| apoc.dense.status(src, type, dir?) | type, direction, totalCount, levels, bucketCount, ... |
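
The `cursor` value yielded by apoc.dense.relationships is a resume token. Its actual encoding is not described here, so the sketch below makes one up purely for illustration: a `bucketIndex:offset` string over an in-memory list of buckets. `CursorPagingSketch` and `Page` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustration only: the "bucketIndex:offset" cursor format is an assumption. */
final class CursorPagingSketch {

    static final class Page {
        final List<String> items;
        final String cursor; // null when there is nothing left to read

        Page(List<String> items, String cursor) {
            this.items = items;
            this.cursor = cursor;
        }
    }

    /** Reads up to `limit` items (0 means unlimited) starting at `cursor` (null means start). */
    static Page page(List<List<String>> buckets, String cursor, int limit) {
        if (limit < 0) {
            throw new IllegalArgumentException("limit must be >= 0");
        }
        int b = 0;
        int o = 0;
        if (cursor != null) {
            String[] parts = cursor.split(":");
            if (parts.length != 2) {
                throw new IllegalArgumentException("malformed cursor: " + cursor);
            }
            b = Integer.parseInt(parts[0]);
            o = Integer.parseInt(parts[1]);
            if (b < 0 || o < 0) {
                throw new IllegalArgumentException("malformed cursor: " + cursor);
            }
        }
        List<String> items = new ArrayList<>();
        while (b < buckets.size() && (limit == 0 || items.size() < limit)) {
            List<String> bucket = buckets.get(b);
            if (o < bucket.size()) {
                items.add(bucket.get(o++));
            } else {
                b++; // bucket exhausted, continue with the next one
                o = 0;
            }
        }
        // Skip exhausted buckets so that the end of the data yields a null cursor.
        while (b < buckets.size() && o >= buckets.get(b).size()) {
            b++;
            o = 0;
        }
        return new Page(items, b < buckets.size() ? b + ":" + o : null);
    }
}
```

A caller passes the returned cursor back in until it comes back null. A positional token like this one is not stable if buckets change between calls, so a real implementation would need to define what a cursor means under concurrent writes.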

Config Parameters

| Key | Default | Description |
| --- | --- | --- |
| bucketCapacity | 1000 | Max leaf relationships per bucket |
| branchFactor | 100 | Max child buckets per branch node |
| denseThreshold | 10000 | Degree threshold for analyze/auto-detect |
| batchSize | 5000 | Relationships per batch in migrate/flatten |
| autoMigrate | false | Warn via log if direct rels exceed threshold |
| limit | 0 (unlimited) | Max results for query |
| cursor | null | Resume token for pagination |
| sampleRate | 1.0 | Probabilistic sampling for analyze (0.0–1.0) |
| analyzeLimit | 500 | Hard cap on analyze results |
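
To show how such a config map might be parsed with these defaults, here is a minimal sketch. It is not the PR's DenseConfig: `DenseConfigSketch`, its field layout, and its validation rules are assumptions; only the keys and default values come from the table.

```java
import java.util.Map;

/** Hypothetical sketch of a config holder; the PR's DenseConfig may differ. */
final class DenseConfigSketch {
    final int bucketCapacity;
    final int branchFactor;
    final long denseThreshold;
    final int batchSize;
    final boolean autoMigrate;
    final long limit;
    final String cursor;
    final double sampleRate;
    final int analyzeLimit;

    DenseConfigSketch(Map<String, Object> raw) {
        Map<String, Object> cfg = (raw == null) ? Map.of() : raw;
        bucketCapacity = positiveInt(cfg, "bucketCapacity", 1000);
        branchFactor = positiveInt(cfg, "branchFactor", 100);
        denseThreshold = number(cfg, "denseThreshold", 10000).longValue();
        batchSize = positiveInt(cfg, "batchSize", 5000);
        autoMigrate = bool(cfg, "autoMigrate", false);
        limit = number(cfg, "limit", 0).longValue(); // 0 means unlimited
        cursor = (String) cfg.get("cursor");         // null means start from the beginning
        sampleRate = number(cfg, "sampleRate", 1.0).doubleValue();
        if (sampleRate < 0.0 || sampleRate > 1.0) {
            throw new IllegalArgumentException("sampleRate must be within 0.0 and 1.0");
        }
        analyzeLimit = positiveInt(cfg, "analyzeLimit", 500);
    }

    // Cypher integers arrive as Long, so accept any Number rather than casting to Integer.
    private static Number number(Map<String, Object> cfg, String key, Number dflt) {
        Object v = cfg.get(key);
        if (v == null) return dflt;
        if (v instanceof Number) return (Number) v;
        throw new IllegalArgumentException(key + " must be a number");
    }

    private static int positiveInt(Map<String, Object> cfg, String key, int dflt) {
        long v = number(cfg, key, dflt).longValue();
        if (v <= 0 || v > Integer.MAX_VALUE) {
            throw new IllegalArgumentException(key + " must be a positive 32-bit integer");
        }
        return (int) v;
    }

    private static boolean bool(Map<String, Object> cfg, String key, boolean dflt) {
        Object v = cfg.get(key);
        if (v == null) return dflt;
        if (v instanceof Boolean) return (Boolean) v;
        throw new IllegalArgumentException(key + " must be a boolean");
    }
}
```

Passing a null or empty map yields the defaults from the table; a caller overriding only `bucketCapacity` would pass a map with that single key.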

Files

Source (5 files, ~1,800 lines):
  Dense.java            — 8 procedures + 1 function, @Extended annotated
  DenseConfig.java      — Config POJO with type-safe converters
  DenseConstants.java   — Labels, rel type prefixes, property keys
  DenseNodeManager.java — Core B-tree logic, locking, indexed delete
  DenseResult.java      — 7 result types

Tests (1 file, ~870 lines):
  DenseTest.java        — 27 integration tests

Test Coverage

  • Create: single, with properties, bucket fill, bucket split, multi-level tree
  • Degree: normal, zero for non-dense, BOTH direction sum
  • Query: full traversal, limit, cursor-based pagination
  • Delete: basic, property matching, non-existent, empty bucket compaction, indexed lookup across multiple buckets
  • Analyze: detection, label filter, limit cap, bucket node filtering
  • Migrate: full, batched, BOTH direction rejection
  • Flatten: reverse migration, internal property stripping
  • Status: normal, empty for non-dense
  • Concurrency: 4-thread concurrent writes with metadata integrity verification

v2 Enhancements (deferred)

  • Bucket rebalancing: Merging partially-empty sibling buckets after heavy delete. Deferred because it violates the strict top-down lock hierarchy.
  • Native relationship property index: Neo4j 5 supports relationship property indexes. An index such as CREATE INDEX FOR ()-[r:_DENSE_LIKES]-() ON (r.__dense_target) would enable O(1) delete. Deferred because it requires per-type lazy index creation.

Relationship ID Contract

WARNING: Relationship element IDs are NOT preserved across migrate() or flatten(). These procedures delete and recreate relationships. This is consistent with apoc.refactor.* behavior.

Implements multi-level B-tree bucket procedures for managing nodes with
millions of relationships. Distributes relationships across intermediate
__DenseBucket nodes to reduce read fan-out and write lock contention.

Procedures: apoc.dense.create.relationship, create.relationship.incoming,
delete.relationship, relationships, analyze, migrate, flatten, status.
Function: apoc.dense.degree (O(1) metadata count).

Design decisions:
- Prefixed relationship types (_DENSE_META_, _DENSE_BRANCH_, _DENSE_)
  to prevent silent Cypher query breakage after migration
- Optimistic fast-path locking with strict hierarchical ordering on
  split path (meta-rel -> root -> branch -> leaf) for deadlock safety
- __dense_target property on leaf rels for O(buckets_for_source) delete
  instead of O(total_relationships) linear scan
- Schema DDL in separate committed transaction with concurrent-creation
  exception handling
- Direction.BOTH rejected on writes, returns union on reads
- Streaming analyze() with sampleRate + analyzeLimit to prevent OOM
- Cursor-based pagination for relationship queries
- Element ID portability caveat documented (dump/restore requires
  flatten + re-migrate)

27 integration tests covering create, delete, query, migrate, flatten,
status, analyze, concurrent writes, direction handling, and indexed
delete correctness.