@SigureMo

We are PaddlePaddle contributors working on a PyTorch compatibility layer aimed at making it significantly easier for PyTorch ecosystem libraries to run on Paddle.

Background

We recently explored a similar integration with FlashInfer (see flashinfer-ai/flashinfer#1642). While working on that PR, we learned that FlashInfer has adopted TVM FFI (https://github.com/apache/tvm-ffi) as a framework-agnostic binding solution, which better aligns with our long-term compatibility goals.

Purpose of This PR

We would like to bring similar PaddlePaddle support to FlashMLA and are opening this PR to:

  1. Gauge interest: Would the FlashMLA team be open to supporting PaddlePaddle through one of the following approaches?
  2. Explore integration options: We're flexible on the implementation path and would like to discuss two potential approaches with you:

Approach 1: Compatibility Layer (Similar to FlashInfer PR flashinfer-ai/flashinfer#1642)

  • C++ / CUDA layer: Provide an adapter that is fully compatible with the PyTorch C++ API surface (ATen / c10 / torch), so FlashMLA's C++/CUDA code can invoke Paddle's implementation through the adapter unchanged.
  • Python layer: Provide paddle.compat.enable_torch_proxy(), which makes import torch actually load paddle, keeping the changes non-invasive (a usage sketch follows this list).
  • Opt-in mechanism: Controlled by the PADDLE_COMPATIBLE_API environment variable, ensuring default behavior remains unchanged.
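
For concreteness, here is a minimal sketch of how the opt-in flow might look from a consumer's side. Only the names paddle.compat.enable_torch_proxy and PADDLE_COMPATIBLE_API come from the description above; the exact call sequence and semantics shown here are our assumption, not a finalized API:

```python
# Hypothetical usage sketch; only the names enable_torch_proxy and
# PADDLE_COMPATIBLE_API come from this PR description, the rest is assumed.
import os

# Opt in explicitly; without this variable, default behavior stays unchanged.
os.environ["PADDLE_COMPATIBLE_API"] = "1"

import paddle

# After this call, `import torch` loads paddle's PyTorch-compatible surface.
paddle.compat.enable_torch_proxy()

import torch  # resolves to the paddle-backed proxy; FlashMLA code is untouched

x = torch.ones(2, 3)  # executed by paddle under the hood
```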

Approach 2: TVM FFI Integration (Recommended)

We note that FlashInfer has successfully migrated to TVM FFI (see flashinfer-ai/flashinfer#1641), which provides:

  • Framework-agnostic C ABI designed for kernels and DSLs
  • Zero-copy interop across frameworks via the DLPack protocol (illustrated in the sketch after this list)
  • Multi-language support (Python, C++, Rust)
  • Ability to ship one wheel for multiple frameworks
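
To make the zero-copy point concrete, here is a small illustration of the DLPack handoff that TVM FFI builds on. It uses the public torch and paddle DLPack utilities, and it assumes a CUDA device is available; it is not FlashMLA or tvm-ffi code itself:

```python
# Illustrative sketch of the zero-copy DLPack interop that TVM FFI builds on;
# plain torch/paddle utilities, not FlashMLA or tvm-ffi code itself.
import paddle
import torch
from torch.utils.dlpack import to_dlpack

# A CUDA tensor produced on the torch side (assumes a CUDA device is present).
t = torch.arange(6, dtype=torch.float32, device="cuda").reshape(2, 3)

# Hand the buffer to paddle through a standard DLPack capsule; both tensors
# now alias the same device memory, so no copy is made.
p = paddle.utils.dlpack.from_dlpack(to_dlpack(t))

t.add_(1.0)  # an in-place update on the torch side ...
print(p)     # ... is visible from paddle, confirming shared, zero-copy storage
```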

If FlashMLA is interested in TVM FFI integration, we would be very happy to help with the migration work. This approach offers better long-term maintainability and broader ecosystem compatibility compared to framework-specific adapters.

Next Steps

We're opening this as a draft PR to start the conversation. Depending on the FlashMLA team's preference, we can:

  • Proceed with the compatibility layer approach (this PR contains those changes)
  • Collaborate on TVM FFI integration instead
  • Explore alternative solutions that work better for FlashMLA's roadmap

We look forward to hearing your thoughts and are excited about the possibility of bringing FlashMLA support to the PaddlePaddle ecosystem!

Related Links

  • PaddlePaddle compatibility layer for FlashInfer: flashinfer-ai/flashinfer#1642
  • FlashInfer's TVM FFI migration: flashinfer-ai/flashinfer#1641
  • TVM FFI: https://github.com/apache/tvm-ffi

SigureMo marked this pull request as draft on October 22, 2025.