Text to Video Reference Implementation #2413
MLCommons CLA bot: All contributors have signed the MLCommons CLA ✍️ ✅
0718a2f to 4401a5a
nvzhihanj left a comment
Thank you Harshil
- Model: [Wan-AI/Wan2.2-T2V-A14B-Diffusers](https://huggingface.co/Wan-AI/Wan2.2-T2V-A14B-Diffusers)
- VBench: [GitHub](https://github.com/Vchitect/VBench)
- MLPerf: [Inference](https://github.com/mlcommons/inference)
Missing:
- Reference accuracy score section for VBench at original accuracy (BF16)
- Supported hardware (I suppose this is Hopper and pre-Hopper, based on the torch/CUDA version?)
- mlperf.conf change for performance sample count and related settings
- submission checker changes
Submission checker can go into a separate PR.
Items 1 and 2 are done. Items 3 and 4 can likely be combined with the LoadGen changes Pablo is helping us with.
@@ -0,0 +1,46 @@
FROM pytorch/pytorch:2.5.1-cuda12.1-cudnn9-devel
This image appears to be pre-Blackwell. I recommend listing the supported hardware in the README.
Noted in the README.
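A minimal sketch of how the pre-Blackwell restriction discussed above could be checked at startup. It assumes that Hopper reports CUDA compute capability 9.x and that Blackwell parts report a major version of 10 or higher. The names `MAX_SUPPORTED_MAJOR`, `is_supported_capability`, and `check_local_gpu` are hypothetical and are not part of this PR.

```python
from typing import Tuple

# Assumption: the pinned image (PyTorch 2.5.1 / CUDA 12.1) targets GPUs up to
# Hopper (compute capability 9.x); Blackwell reports major version 10 or higher.
MAX_SUPPORTED_MAJOR = 9


def is_supported_capability(capability: Tuple[int, int]) -> bool:
    """Return True if a (major, minor) compute capability is pre-Blackwell."""
    major, _minor = capability
    return major <= MAX_SUPPORTED_MAJOR


def check_local_gpu() -> bool:
    """Check GPU 0 on this machine; needs PyTorch and a visible CUDA device."""
    import torch  # imported lazily so the pure check above works without torch

    if not torch.cuda.is_available():
        return False
    return is_supported_capability(torch.cuda.get_device_capability(0))
```

Keeping the comparison in a pure function makes the rule easy to unit test on machines without a GPU; only `check_local_gpu` touches the device.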
try:
    print("Starting download...")
    snapshot_download(
Can we pin the download to a specific commit of the model? (Or at least note the commit in the README.)
Noted the commit ID in the README. There isn't much activity in the Hugging Face model repo, so we should be fine.
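For reference, a minimal sketch of what pinning the download would look like, using the `revision` argument of `huggingface_hub.snapshot_download`. The wrapper `download_pinned` and its `downloader` parameter are hypothetical and not part of this PR; the commit ID itself is not repeated here and would come from the README.

```python
from typing import Callable, Optional

REPO_ID = "Wan-AI/Wan2.2-T2V-A14B-Diffusers"


def download_pinned(
    revision: str,
    local_dir: str,
    downloader: Optional[Callable[..., str]] = None,
) -> str:
    """Download the model snapshot at one exact revision (e.g. a commit hash)."""
    if not revision:
        raise ValueError("pass the commit ID recorded in the README")
    if downloader is None:
        # Imported lazily so the wrapper can be exercised without the library.
        from huggingface_hub import snapshot_download

        downloader = snapshot_download
    # `revision` accepts a branch name, tag, or commit hash; a commit hash
    # keeps the downloaded weights fixed even if the repo changes later.
    return downloader(repo_id=REPO_ID, revision=revision, local_dir=local_dir)
```

Passing a full commit hash rather than a branch name means every run fetches the same files, which matters for reproducing the reference accuracy.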
@@ -0,0 +1,39 @@
# Text-to-Video Benchmark
Regarding the folder name: it should be text_to_video/Wan2.2-T2V-A14B so that we can add new models in the future. (We didn't catch this for text_to_image.)
Moved to a model-specific folder.
Check out README.md for instructions.