Skip to content

Conversation

@CraigMacomber
Copy link
Contributor

@CraigMacomber CraigMacomber commented Dec 2, 2025

Description

Since simple schema can now contain view schema specific data, but might also be derived from stored schema, and might also be storing only a subset of its view specific data that it can hold (like when snapshotting for compatibility tests) using it is rather error prone.

This refactoring attempts to better clarify what the semantics of a given simple schema are by allowing them to be typed as either view or stored, and providing more specific names and docs to the compatibility snapshotting APIs.

This also refactors how we generate stored schema: now we transform a simple schema for a view to a simple schema for stored schema, then convert that to the persisted format. This redesign separates the concerns for persisted format conversion and the semantics of things like staged schema which get processes when converting from view to stored.

This results in some deduplication of logic, and allows all schema transformation logic to be applied directly to simple-schema.
This change could be followed up with some further changes to better remove stored schema from the alpha APIs, and replace those APIs with use of stored-simple-schema and thus shrinking the package API surface area and making its types more interoperable.

getUnhydratedContext has been improved to give better asserts when uses re-entrantly.

Breaking Changes

Several stored and simple schema alpha APIs have been impacted, but all stable APIs should behave as is.

Reviewer Guidance

The review process is outlined on this wiki page.

@CraigMacomber CraigMacomber changed the title To stored refactor Simple-Schema and to stored refactor Dec 3, 2025
@CraigMacomber CraigMacomber marked this pull request as ready for review December 3, 2025 18:36
@CraigMacomber CraigMacomber requested review from a team as code owners December 3, 2025 18:36

import {
unreachableCase,
fail,
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left "toStoredSchema.ts" named as is to make this diff more likely to work, but in the future, it should probably be unified with "toStored" and renamed.

};

export const permissiveStoredSchemaGenerationOptions: StoredSchemaGenerationOptions = {
export const permissiveStoredSchemaGenerationOptions: StoredFromViewSchemaGenerationOptions = {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this ever used outside of tests or verification? I think this needs a doc comment explaining the use case. I see where configuration.ts uses it, which seems to boil down to the following:

  1. Walk all allowed types excluding staged upgrades and check for duplicate identifiers
  2. Walk all allowed types including staged upgrades and check to see if there are duplicates

This makes me think permissiveStoredSchemaGenerationOptions is just used for diagnostics.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is used down on line 104 below for toUnhydratedSchema, which is very much used in production, and has documentation. The fact that toUnhydratedSchema uses this is really a subjective policy choice made when implementing unhydrated trees, so I don't think documenting that usage of this here really makes sense (Creating cyclic deps in our documentation everywhere we use things adds a lot of coupling to maintain)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I missed the toUnhydratedSchema case. So there are cases where we include staged allowed types in stored schemas? I agree it doesn't make sense to document every use case on the members, but it would be nice to include some rationale about when to use permissive vs. restrictive.

* @param allowedType - The allowed type to filter.
* @param options - The options to use for filtering.
* @returns Whether the allowed type passes the filter.
* Marker type indicating that the input schema is already a stored schema.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure about the name Unchanged. Currently this just tells transformSimpleSchema to preserve staged type information. How about PreserveViewSpecificData? Are there other kinds of behavior you want to turn on/off in Unchanged mode?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there might have been an old version of this PR where PreserveViewSpecificData was a thing, but no such thing currently exists. We do have preservesViewData, but it only exists to ensure that view specific data is discarded when generating a stored schema (which is required by the typing, so there is not a subjective choice going on during the transformation). I have added some docs to preservesViewData, maybe that helps?

* @alpha
*/
export function encodeSimpleSchema(simpleSchema: SimpleTreeSchema): JsonCompatibleReadOnly {
export function encodeSchemaCompatibilitySnapshot(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had a conversation with @noencke about the names a while back. The thinking was that there is no canonical encode/decode for SimpleSchema, so this might as well be it (with documentation explaining restrictions around metadata/persistedMetadata). Is the rationale behind the change that the names would imply the presence of these fields, and that is confusing? Or do you think these methods will only be used for compatibility snapshots?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we should have a lossy cannonical encoding that drops data which other encoding which we already have preserve.

If I store a simple schema by converting it to our stored schema persisted format, that preserves persisted metadata, while this does not. THis also does not perserve field and nod (non persisted) metadata. I do not think such a lossy encoding should be the cannonical encoding of something with no details in its name to indicate that its encoding a subset of the information for a specific purpose.

readonly metadata: SimpleNodeSchemaBase<TNodeKind, TCustomMetadata>["metadata"] &
(Type extends SchemaType.View
? unknown
: { readonly custom?: undefined; readonly description?: undefined });
Copy link
Contributor

@TommyBrosman TommyBrosman Dec 3, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not use NodeSchemaMetadata here instead of declaring a new type?

Copy link
Contributor Author

@CraigMacomber CraigMacomber Dec 4, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because the entire point of this logic is to force stored simple schema to have no metadata (undefined for these fields).

WHat we have here is conditional type logic where one branch of the condition is NodeSchemaMetadata (inherited from the base type) and the other branch is narrowing it to force undefined content: using NodeSchemaMetadata on both branches would not acomplish the differentiation between view and stored schema that this is doing.

import type { SchemaType, SimpleNodeSchema, SimpleTreeSchema } from "./simpleSchema.js";
import { walkFieldSchema } from "./walkFieldSchema.js";

export function createTreeSchema(rootSchema: ImplicitFieldSchema): TreeSchema {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the copying now all done by transformSimpleSchema?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This does not copy anything, so not sure why you are asking that here, but transformSimpleSchema does copy schema, so I think the answer is yes. transformSimpleSchema is not the only thing that copies schema, writing stored schema to a document copies schema, forking a branch copies stored schema, generateSchgemaFromSimplSchema copies schema, generating stored from simple schema copies schema. I think transformSimpleSchema is the only thing that deep copies simple schema into simple schema that remains though.

readonly metadata: FieldSchemaMetadata &
(Type extends SchemaType.View
? unknown
: { readonly custom?: undefined; readonly description?: undefined });
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same question as the one on SimpleNodeSchemaBaseAlpha.metadata: why is there an inline type? Shouldn't FieldSchemaMetadata have these same fields?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same answer as the other case. FieldSchemaMetadata uncondiotnally permits non-undefined values which is not what we want here, and thus we need to override than with this to narrow the type.

* @remarks
* See also {@link generateSchemaFromSimpleSchema} for a different schema transformation.
* Note that if we want to make `generateSchemaFromSimpleSchema` consume view simple-schema, and use these transformation APIs to generate that view simple-schema from a stored simple-schema,
* we will need to add a "ToView" option here.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These union members could use doc comments, or maybe just a doc comment on SimpleSchemaTransformationOptions that explains that we need these options in order to avoid copying view-specific fields in certain cases (and why). I don't think there's an existing doc comment that breaks down the behavior of the allowed types filtering logic/etc. and it is not obvious from the (internal) API surface.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These options simply describe what conversions can be done. Implementation details of how transformSimpleNodeSchema outputs a valid SimpleTreeSchema<SchemaType.Stored> that does not assign non undefined values where the types require values to be undefined is not something this type should be concerned about.

Each of the types in this union does have a doc comment already.

Also StoredFromViewSchemaGenerationOptions.includeStaged documents that it filters which staged allowed types are included, and thats the only allowed type filtering that exists. I'm not sure what more could be provided.

Copilot AI review requested due to automatic review settings December 5, 2025 02:20
Copilot finished reviewing on behalf of CraigMacomber December 5, 2025 02:21
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR refactors the simple-schema system to better distinguish between view and stored schema types, improving type safety and API clarity. The changes introduce a SchemaType enum to explicitly type schemas as either View or Stored, rename compatibility snapshot APIs for better clarity, and refactor schema transformation logic to separate concerns between format conversion and semantic transformations.

Key changes:

  • Introduces SchemaType enum with Stored and View variants to parameterize schema interfaces
  • Renames encodeSimpleSchema/decodeSimpleSchema to encodeSchemaCompatibilitySnapshot/decodeSchemaCompatibilitySnapshot
  • Refactors schema transformation to convert view→stored simple-schema before converting to persisted format
  • Adds reentrancy protection to getUnhydratedContext for better debugging
  • Removes definitionsInternal from TreeViewConfiguration public API surface

Reviewed changes

Copilot reviewed 55 out of 55 changed files in this pull request and generated no comments.

Show a summary per file
File Description
packages/framework/fluid-framework/api-report/*.api.md Removes definitionsInternal from TreeViewConfiguration in all API reports
packages/dds/tree/src/simple-tree/simpleSchema.ts Adds SchemaType enum and parameterizes all simple schema interfaces with it
packages/dds/tree/src/simple-tree/toStoredSchema.ts Major refactoring to transform view→stored simple-schema, then to persisted format
packages/dds/tree/src/simple-tree/treeSchema.ts New file defining TreeSchema interface and createTreeSchema function
packages/dds/tree/src/simple-tree/createContext.ts Adds reentrancy guard to getUnhydratedContext
packages/dds/tree/src/simple-tree/api/*.ts Updates to use renamed snapshot APIs and new schema transformation functions
packages/dds/tree/src/simple-tree/core/*.ts Updates to support new SchemaType parameterization and transformation options
packages/dds/tree/src/simple-tree/node-kinds/**/*.ts Updates all node type interfaces to specify SchemaType.View
packages/dds/tree/src/test/**/*.spec.ts Test updates to use new APIs and validate schema transformations
packages/dds/tree/src/shared-tree/*.ts Updates to properly type stored schema exports as SchemaType.Stored

@github-actions
Copy link
Contributor

github-actions bot commented Dec 5, 2025

🔗 No broken links found! ✅

Your attention to detail is admirable.

linkcheck output


> [email protected] ci:check-links /home/runner/work/FluidFramework/FluidFramework/docs
> start-server-and-test "npm run serve -- --no-open" 3000 check-links

1: starting server using command "npm run serve -- --no-open"
and when url "[ 'http://127.0.0.1:3000' ]" is responding with HTTP status code 200
running tests using command "npm run check-links"


> [email protected] serve
> docusaurus serve --no-open

[SUCCESS] Serving "build" directory at: http://localhost:3000/

> [email protected] check-links
> linkcheck http://localhost:3000 --skip-file skipped-urls.txt

Crawling...

Stats:
  243138 links
    1776 destination URLs
    2015 URLs ignored
       0 warnings
       0 errors


Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants