A WIP distributed VAE and transformer model using CUDA C kernels and MPI. The goal is to perform reasonable distributed training and inference on modest infrastructure with few dependencies.

shawnschulz/loki_distributed

Loki

This is very much a work in progress. The idea is to set up Docker containers with CUDA and MPI on multiple hosts, then use either Docker Swarm or MPI over traditional networking to run a distributed transformer or VAE model. The CUDA kernels for the VAE and transformer models are still being written, but the Makefile should work with the given Docker image.
