@dimitri-yatsenko
Member

Summary

Clarifies the technical distinction between hash-addressed and schema-addressed storage based on their capability to handle different object types.

Changes

Key distinction added:

  • Hash-addressed storage (<blob@>, <attach@>, <hash@>): handles individual, atomic objects only, i.e. single files or serialized blobs
  • Schema-addressed storage (<npy@>, <object@>): can handle complex multi-part objects such as Zarr arrays, HDF5 datasets, and directory structures

Files Modified

src/how-to/use-object-storage.md

  • Updated OAS addressing schemes table with new "Object Type" column
  • Added explicit key distinction callout
  • Fixed terminology: inline → in-table storage

src/reference/specs/type-system.md

  • Enhanced <object@> description: "complex, multi-part objects (files, folders, Zarr arrays, HDF5)"
  • Enhanced <hash@> description: "individual, atomic objects only"
  • Clarified that hash-addressed storage cannot handle Zarr/HDF5

src/llms-full.txt

  • Regenerated to include updated documentation

Technical Background

This distinction is important because:

  1. Zarr arrays are directory structures containing multiple chunk and metadata files
  2. Hash-addressed storage stores single blobs identified by content hash
  3. Schema-addressed storage uses hierarchical paths that can contain multiple files
  4. Users choosing between <blob@> and <object@> need to understand this limitation
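
To illustrate the points above, here is a minimal sketch of the two addressing styles. It is not DataJoint's implementation: the function names, the `_hash/` prefix, and the `schema/table/key/attribute` path layout are all hypothetical and chosen only to show why a content hash maps to exactly one blob while a hierarchical path can hold many files.

```python
import hashlib
from pathlib import PurePosixPath


def hash_address(blob: bytes) -> str:
    """Return a content-derived key for ONE blob (hypothetical layout)."""
    digest = hashlib.sha256(blob).hexdigest()
    return f"_hash/{digest[:2]}/{digest}"


def schema_address(schema: str, table: str, primary_key: dict, attribute: str) -> PurePosixPath:
    """Return a path prefix derived from schema, table, and key (hypothetical layout)."""
    key_part = "/".join(f"{k}={v}" for k, v in sorted(primary_key.items()))
    return PurePosixPath(schema) / table / key_part / attribute


# A Zarr-like array is several files under one directory, not a single blob.
zarr_like = {
    ".zarray": b'{"shape": [4, 4]}',
    "0.0": b"\x00" * 16,
    "0.1": b"\x01" * 16,
}

# Schema-addressed: one prefix, many files stored beneath it.
prefix = schema_address("lab", "recording", {"session_id": 7}, "raw")
multi_part_paths = [str(prefix / name) for name in zarr_like]

# Hash-addressed: the key is computed from the bytes of a single blob,
# so there is no natural place to put the other chunk files.
single_blob_key = hash_address(zarr_like["0.0"])
```

Under these assumptions, every file of the multi-part object shares the same prefix, whereas each blob's hash key depends only on its own content.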

Related

Addresses feedback about clarifying storage capabilities for complex data structures.

🤖 Generated with Claude Code

- Fixed llms.txt manual reference from migrate-from-0x to migrate-to-v20
- Regenerated llms-full.txt to pick up all corrected migration guide links from PR #107
- Verified no remaining broken internal links in LLM documentation files
- Hash-addressed storage handles individual/atomic objects only (single files/blobs)
- Schema-addressed storage can handle complex multi-part objects (Zarr, HDF5, directories)
- Updated use-object-storage.md OAS table with Object Type column
- Updated type-system.md spec with clearer descriptions
- Fixed terminology: inline → in-table storage
@dimitri-yatsenko
Member Author

Consolidated into #119 - Documentation Cohesion Review: Comprehensive Improvements for DataJoint 2.0

@dimitri-yatsenko deleted the docs/clarify-storage-addressing-distinction branch on January 14, 2026 23:14