
Kandinsky5 lite dit clean #1088

Open

jaisurya27 wants to merge 32 commits into hao-ai-lab:main from jaisurya27:kandinsky5-lite-dit-clean

Conversation

@jaisurya27

Adds native FastVideo implementation of Kandinsky5Transformer3DModel (no Diffusers wrapper).
Adds Kandinsky5 Lite DiT config + parameter mapping and registry wiring.
Adds a local parity test against the Diffusers checkpoint.

@gemini-code-assist
Contributor

Summary of Changes

Hello @jaisurya27, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances FastVideo's capabilities by integrating the Kandinsky5 Lite DiT model through a native implementation of its 3D Transformer. This change moves away from external Diffusers wrappers, providing a more streamlined and optimized architecture within the FastVideo ecosystem. The update includes all necessary configuration and registry entries, and its correctness is validated by a comprehensive parity test against the original Diffusers version.

Highlights

  • Native Kandinsky5Transformer3DModel Implementation: A native FastVideo implementation of the Kandinsky5Transformer3DModel has been added, eliminating the need for Diffusers wrappers and integrating directly into the FastVideo framework.
  • Kandinsky5 Lite DiT Configuration and Registry: New configuration classes (Kandinsky5ArchConfig, Kandinsky5VideoConfig) were introduced, along with parameter mapping and wiring into the model registry, enabling proper setup and discovery of the Kandinsky5 model.
  • Local Parity Test: A dedicated local parity test has been included to ensure that the native FastVideo Kandinsky5 implementation produces numerically identical results to the original Diffusers checkpoint.


Changelog
  • fastvideo/configs/models/dits/__init__.py
    • Imported Kandinsky5VideoConfig.
    • Added Kandinsky5VideoConfig to the __all__ export list.
  • fastvideo/configs/models/dits/kandinsky5.py
    • Defined Kandinsky5ArchConfig with specific parameters for the Kandinsky5 Transformer, including FSDP sharding conditions and Diffusers config fields.
    • Defined Kandinsky5VideoConfig to encapsulate the architecture configuration.
  • fastvideo/models/dits/kandinsky5.py
    • Implemented the Kandinsky5Transformer3DModel class, providing a native FastVideo version of the Kandinsky5 Transformer.
    • Included helper functions for frequency generation, local patching/merging, and fractal flattening/unflattening.
    • Defined various sub-modules: Kandinsky5TimeEmbeddings, Kandinsky5TextEmbeddings, Kandinsky5VisualEmbeddings, Kandinsky5RoPE1D, Kandinsky5RoPE3D, Kandinsky5Modulation, Kandinsky5Attention, Kandinsky5FeedForward, Kandinsky5OutLayer, Kandinsky5TransformerEncoderBlock, and Kandinsky5TransformerDecoderBlock.
    • Added a materialize_non_persistent_buffers method to handle buffer initialization.
  • fastvideo/models/loader/fsdp_load.py
    • Added a conditional call to model.materialize_non_persistent_buffers after loading an FSDP model, if the method exists.
  • fastvideo/models/registry.py
    • Registered Kandinsky5Transformer3DModel in the _VIDEO_DIT_MODELS dictionary.
  • tests/local_tests/transformers/test_kandinsky5_lite_transformer_parity.py
    • Added a new test file to verify the numerical parity between the native FastVideo Kandinsky5Transformer3DModel and the Diffusers Kandinsky5Transformer3DModel.
    • Included setup for loading both models, generating random inputs, and asserting close results.
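The parity check described in the changelog comes down to comparing the two models' outputs elementwise within a tolerance. A minimal sketch of such a comparison helper (the function name and tolerances are illustrative, not the PR's actual test code):

```python
import torch

def check_parity(ref: torch.Tensor, test: torch.Tensor,
                 rtol: float = 1e-3, atol: float = 1e-3) -> float:
    """Assert two model outputs are numerically close; return the max abs diff."""
    torch.testing.assert_close(test, ref, rtol=rtol, atol=atol)
    return (test - ref).abs().max().item()

# Toy stand-in for the two model outputs: identical tensors trivially pass.
torch.manual_seed(0)
ref = torch.randn(2, 4, 8)
max_diff = check_parity(ref, ref.clone())
```

In the real test, `ref` and `test` would be the Diffusers and native FastVideo model outputs on the same random inputs.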

@gemini-code-assist left a comment

Code Review

This pull request introduces a native FastVideo implementation for the Kandinsky 5 Lite DiT model, including its configuration, registry wiring, and a parity test against the Diffusers implementation. The overall structure is good, but I've identified a critical issue in the model's class attribute initialization that would cause an AttributeError. Additionally, there are a few other points for improvement regarding hardcoded data types, unused function parameters, and magic numbers that would enhance the code's robustness and maintainability.

Comment on lines +475 to +482
    _fsdp_shard_conditions = Kandinsky5VideoConfig()._fsdp_shard_conditions
    _compile_conditions = Kandinsky5VideoConfig()._compile_conditions
    param_names_mapping = Kandinsky5VideoConfig().param_names_mapping
    reverse_param_names_mapping = Kandinsky5VideoConfig().reverse_param_names_mapping
    lora_param_names_mapping = Kandinsky5VideoConfig().lora_param_names_mapping
    _supported_attention_backends = Kandinsky5VideoConfig()._supported_attention_backends

critical

The class attributes _fsdp_shard_conditions, _compile_conditions, etc., are being initialized by accessing attributes on a default Kandinsky5VideoConfig instance. However, these attributes are defined within the arch_config of the Kandinsky5VideoConfig. This will raise an AttributeError at runtime. The correct way to access them would be through the arch_config attribute.

    arch_config_defaults = Kandinsky5VideoConfig().arch_config
    _fsdp_shard_conditions = arch_config_defaults._fsdp_shard_conditions
    _compile_conditions = arch_config_defaults._compile_conditions
    param_names_mapping = arch_config_defaults.param_names_mapping
    reverse_param_names_mapping = arch_config_defaults.reverse_param_names_mapping
    lora_param_names_mapping = arch_config_defaults.lora_param_names_mapping
    _supported_attention_backends = arch_config_defaults._supported_attention_backends

    def _apply_rotary(x: torch.Tensor, rope: torch.Tensor) -> torch.Tensor:
        x_ = x.reshape(*x.shape[:-1], -1, 1, 2).to(torch.float32)
        x_out = (rope * x_).sum(dim=-1)
        return x_out.reshape(*x.shape).to(torch.bfloat16)

high

The output of _apply_rotary is hardcoded to torch.bfloat16. This can cause type mismatches and precision issues if the model is running with a different precision (e.g., float16 or float32). It should be cast to the original input tensor's dtype to ensure correctness across different precisions.

Suggested change

    - return x_out.reshape(*x.shape).to(torch.bfloat16)
    + return x_out.reshape(*x.shape).to(x.dtype)

            shape: tuple[int, int, int, int],
            block_mask: bool = False):
        if block_mask:
            pixel_size = 8

medium

The magic number 8 for pixel_size is used here and in fractal_unflatten. It would be better to define it as a constant at the module level for clarity and maintainability, e.g., FRACTAL_PIXEL_SIZE = 8.
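A minimal sketch of the suggested refactor (the constant name and the helper function are illustrative, not code from the PR):

```python
# Module-level constant replacing the magic number 8 shared by
# fractal_flatten and fractal_unflatten (name is illustrative).
FRACTAL_PIXEL_SIZE = 8

def blocks_along(dim: int) -> int:
    # Number of pixel-size blocks along one spatial dimension
    # (assumes dim is divisible by FRACTAL_PIXEL_SIZE).
    return dim // FRACTAL_PIXEL_SIZE

n = blocks_along(64)  # 8 blocks for a 64-pixel dimension
```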

Comment on lines +347 to +348
    def forward(self, visual_embed: torch.Tensor, text_embed: torch.Tensor,
                time_embed: torch.Tensor) -> torch.Tensor:

medium

The text_embed parameter is unused in the forward method of Kandinsky5OutLayer. It should be removed to improve code clarity. The call to this method in Kandinsky5Transformer3DModel.forward at line 615 should also be updated accordingly.

    def forward(self, visual_embed: torch.Tensor, time_embed: torch.Tensor) -> torch.Tensor:

    timestep: torch.Tensor,
    encoder_hidden_states_image: torch.Tensor | list[torch.Tensor] | None = None,
    guidance=None,

medium

The guidance parameter is unused in this forward method and should be removed to improve code clarity.

    visual_embed = fractal_unflatten(visual_embed,
                                     visual_shape,
                                     block_mask=to_fractal)
    x = self.out_layer(visual_embed, text_embed, time_embed)

medium

Following the removal of the unused text_embed parameter from Kandinsky5OutLayer.forward, this call should be updated to no longer pass it.

        x = self.out_layer(visual_embed, time_embed)

Collaborator

Please use as many layers from fastvideo/layers as possible, such as ReplicatedLinear, RoPE, and the fused LayerNorm.

Author

Hi,
I have implemented the FastVideo-native version of kandinsky5-lite and made sure to include the following:

  • Migrated core projections to FastVideo's native ReplicatedLinear (time/text/visual embeddings, modulation, attention qkv/out, out layer).
  • Switched attention compute to FastVideo LocalAttention, with a fallback to torch SDPA when no forward context is present (for the standalone parity test path).
  • Kept FastVideo fused norms in the encoder/decoder blocks via LayerNormScaleShift (eps=1e-5), with float residual math to match the official behavior.
  • Kandinsky5FeedForward uses FastVideo MLP(..., bias=False, act_type="gelu").
  • Implemented the native RoPE and fractal flatten/unflatten flow, plus non-persistent buffer materialization.
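The LocalAttention-with-SDPA-fallback pattern described above might look roughly like this (the context check and parameter names are simplified assumptions, not FastVideo's exact API):

```python
import torch
import torch.nn.functional as F

def attention_with_fallback(q: torch.Tensor, k: torch.Tensor, v: torch.Tensor,
                            local_attn=None, forward_ctx=None) -> torch.Tensor:
    # Prefer the FastVideo attention path when a forward context exists;
    # otherwise fall back to plain SDPA so the standalone parity test
    # can run outside the FastVideo runtime. (Sketch only.)
    if forward_ctx is not None and local_attn is not None:
        return local_attn(q, k, v)
    return F.scaled_dot_product_attention(q, k, v)

# Standalone path: no forward context, so the SDPA fallback is taken.
q = k = v = torch.randn(1, 2, 16, 8)  # (batch, heads, seq, head_dim)
out = attention_with_fallback(q, k, v)
```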

@Eigensystem added the go (Trigger Buildkite CI) label on Feb 21, 2026
@Eigensystem
Collaborator

Please resolve the merge conflicts and pre-commit errors.

Eigensystem and others added 20 commits March 5, 2026 20:45
Co-authored-by: Ishan Vaish <vaish.ishan@gmail.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: Peiyuan Zhang <a1286225768@slurm-h200-204-227.slurm-compute.tenant-slurm.svc.cluster.local>
Co-authored-by: Peiyuan Zhang <a1286225768@slurm-login-0.slurm-login.tenant-slurm.svc.cluster.local>
Co-authored-by: RandNMR73 <notomatthew31@gmail.com>
Co-authored-by: JerryZhou54 <zhouw.jerry2017@outlook.com>
Co-authored-by: Davids048 <jundasu@ucsd.edu>
Co-authored-by: Peiyuan Zhang <a1286225768@slurm-h200-204-239.slurm-compute.tenant-slurm.svc.cluster.local>
Co-authored-by: SolitaryThinker <wlsaidhi@gmail.com>
Co-authored-by: Will Lin <wlsaidhi@gmail.com>
Co-authored-by: Will Lin <wlsaidhi@gmail.com>


10 participants