This is very much a work in progress. The idea is to set up Docker containers on multiple hosts with CUDA and MPI, then use either Docker Swarm or MPI with traditional networking to run a distributed transformer or VAE model. The CUDA kernels for the VAE and transformer models are still being written, but the Makefile should work with the given Docker image.
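As a rough illustration of the kind of cross-host step such a setup would need, here is a minimal sketch in plain C of gradient averaging. It assumes a data-parallel scheme, which this README does not confirm, and it simulates the ranks in a single process rather than calling MPI. The function name `average_gradients` and its parameters are hypothetical and not taken from this repository. On a real cluster, the same result could be obtained with `MPI_Allreduce` using `MPI_SUM`, followed by dividing by the number of ranks.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of this repository.
 * Averages per-rank gradient buffers element-wise. This is the value each
 * rank would hold after an MPI_Allreduce with MPI_SUM followed by a division
 * by the rank count, assuming a data-parallel training scheme.
 *
 * rank_grads: array of n_ranks pointers, each to n_params floats
 * out:        caller-allocated buffer of n_params floats */
void average_gradients(const float *const *rank_grads, size_t n_ranks,
                       size_t n_params, float *out)
{
    assert(n_ranks > 0); /* avoid dividing by zero */

    for (size_t i = 0; i < n_params; ++i) {
        float sum = 0.0f;
        /* Sum the i-th gradient component across all simulated ranks. */
        for (size_t r = 0; r < n_ranks; ++r) {
            sum += rank_grads[r][i];
        }
        out[i] = sum / (float)n_ranks;
    }
}
```

Keeping the reduction in a plain function like this makes it possible to check the arithmetic on a single machine before any containers or MPI launchers are involved.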
A WIP distributed VAE and transformer model using CUDA C kernels and MPI. The goal is to perform reasonable distributed training and inference on modest infrastructure with few dependencies.
shawnschulz/loki_distributed