Xiaohui Zeng
Arash Vahdat
Francis Williams
Zan Gojcic
Or Litany
Sanja Fidler
Karsten Kreis
Paper Project Page
- add the point-cloud rendering code used for the paper figures; see `utils/render_mitsuba_pc.py`
- When opening an issue, please tag @ZENGXH so that I can respond faster!
- Dependencies:
  - CUDA 11.6
- Setup the environment by installing from the conda file:

  ```bash
  conda env create --name lion_env --file=env.yaml
  conda activate lion_env

  # Install some other packages
  pip install git+https://github.com/openai/CLIP.git

  # build some packages first (optional)
  python build_pkg.py
  ```

  Tested with conda version 22.9.0.
- Using Docker
  - build the docker with `bash ./docker/build_docker.sh`
  - launch the docker with `bash ./docker/run.sh`
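Whichever route you take, it can be worth confirming that the environment actually sees the GPU and that the CLIP package imports before launching anything long-running. A minimal sketch, assuming only the PyTorch and CLIP packages installed above (the script name is hypothetical):

```python
# sanity_check.py -- hypothetical helper, not part of the repo.
# Confirms that PyTorch sees CUDA and that the CLIP package imports correctly.
import torch
import clip  # installed via `pip install git+https://github.com/openai/CLIP.git`

print("torch version:        ", torch.__version__)
print("CUDA available:       ", torch.cuda.is_available())
print("torch built for CUDA: ", torch.version.cuda)  # expected to be compatible with CUDA 11.6
print("available CLIP models:", clip.available_models())
```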
run `python demo.py`; it will load the released text2shape model from Hugging Face and generate a chair point cloud. (Note: this checkpoint is not released yet, so the files loaded in `demo.py` are not available at this point.)
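To look at the generated point cloud outside the repo's own rendering utilities, here is a minimal sketch for dumping an `(N, 3)` array to an ASCII `.ply` file viewable in MeshLab or similar tools (the helper and the placeholder points are hypothetical; adapt them to whatever `demo.py` returns or saves):

```python
# save_ply.py -- hypothetical helper for viewing generated point clouds; not part of the repo.
import numpy as np

def save_ply(points: np.ndarray, path: str) -> None:
    """Write an (N, 3) float array as an ASCII PLY point cloud."""
    header = (
        "ply\nformat ascii 1.0\n"
        f"element vertex {len(points)}\n"
        "property float x\nproperty float y\nproperty float z\n"
        "end_header\n"
    )
    with open(path, "w") as f:
        f.write(header)
        np.savetxt(f, points, fmt="%.6f")

if __name__ == "__main__":
    # placeholder cloud; replace with the points produced by demo.py
    save_ply(np.random.rand(2048, 3), "demo_chair.ply")
```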
- checkpoint can be downloaded from here
- after download, run the checksum with `python ./script/check_sum.py ./lion_ckpt.zip`
- put the downloaded file under `./lion_ckpt/`
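If you prefer to inspect the hash yourself, a minimal sketch that just prints the MD5 of the archive (`./script/check_sum.py` remains the authoritative check; this snippet does not know the expected value):

```python
# md5_of_ckpt.py -- hypothetical helper; prints the MD5 of the downloaded archive.
import hashlib
import sys

def md5sum(path: str, chunk_size: int = 1 << 20) -> str:
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(chunk_size), b""):
            digest.update(block)
    return digest.hexdigest()

if __name__ == "__main__":
    target = sys.argv[1] if len(sys.argv) > 1 else "./lion_ckpt.zip"
    print(target, md5sum(target))
```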
- ShapeNet can be downloaded here.
- Put the downloaded data as `./data/ShapeNetCore.v2.PC15k`, or edit the `pointflow` entry in `./datasets/data_path.py` for the ShapeNet dataset path.
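Before training, it can help to confirm the data is laid out the way the loaders expect. A minimal sketch, assuming the PointFlow-style layout `ShapeNetCore.v2.PC15k/<synset_id>/<split>/<model_id>.npy` with roughly 15k points per shape (check `./datasets/pointflow_datasets.py` for the exact expectations):

```python
# check_shapenet_layout.py -- hypothetical helper; loads one training shape
# to verify the dataset path and file format.
from pathlib import Path
import numpy as np

root = Path("./data/ShapeNetCore.v2.PC15k")  # or wherever the `pointflow` entry points
sample = next(root.glob("*/train/*.npy"), None)
if sample is None:
    raise SystemExit(f"no .npy point clouds found under {root}")

points = np.load(sample)  # expected shape around (15000, 3)
print(sample, points.shape, points.dtype)
```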
- run `bash ./script/train_vae.sh $NGPU` (the released checkpoint is trained with `NGPU=4` on A100)
- if you want to use comet to log the experiment, add a `.comet_api` file under the current folder and write the api key as `{"api_key": "${COMET_API_KEY}"}` in that file
- requires the VAE checkpoint trained above
- run `bash ./script/train_prior.sh $NGPU` (the released checkpoint is trained with `NGPU=8` on 2 nodes of V100s)
- this script trains the model for the single-view-reconstruction or text2shape task
- the idea is that we take the encoder and decoder trained on the data as usual (without conditioning input), and when training the diffusion prior we feed the CLIP image embedding as the conditioning input: the shape-latent prior model takes the CLIP embedding through AdaGN layers (see the CLIP-embedding sketch after this list)
- requires the VAE checkpoint trained above
- requires the rendered ShapeNet data; you can render it yourself or download it from here
  - put the rendered data as `./data/shapenet_render/` or edit the `clip_forge_image` entry in `./datasets/data_path.py`
  - the image data is read in `./datasets/pointflow_datasets.py` via the `render_img_path` variable; you may need to customize this variable depending on your folder structure
- run `bash ./script/train_prior_clip.sh $NGPU`
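To make the conditioning pathway above concrete, here is a minimal sketch of computing a CLIP image embedding for one rendered view with the CLIP package installed during setup. This is not the training code, and the model variant and image path are assumptions; the training script produces the embeddings internally:

```python
# clip_embedding_sketch.py -- hypothetical illustration of the conditioning signal.
import clip
import torch
from PIL import Image

device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = clip.load("ViT-B/32", device=device)  # model variant is an assumption

# path is a placeholder; point it at one of the rendered ShapeNet views
image = preprocess(Image.open("./data/shapenet_render/example_view.png")).unsqueeze(0).to(device)
with torch.no_grad():
    image_embedding = model.encode_image(image)  # (1, 512) for ViT-B/32

# during prior training, an embedding like this is the conditioning input fed through AdaGN layers
print(image_embedding.shape)
```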
- (tested) use comet-ml: add a `.comet_api` file under this `LION` folder; example of the `.comet_api` file:
  `{"api_key": "...", "project_name": "lion", "workspace": "..."}`
- (not tested) use wandb: add a `.wandb_api` file and set the env variable `export USE_WB=1` before training; example of the `.wandb_api` file:
  `{"project": "...", "entity": "..."}`
- (not tested) use tensorboard: set the env variable `export USE_TFB=1` before training
- see the `utils/utils.py` file for the details of the experiment logger; I usually use comet-ml for my experiments
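A minimal sketch for creating these config files programmatically (the values are placeholders; fill in your own key, project, and workspace):

```python
# write_logger_configs.py -- hypothetical helper; writes the logger config files
# described above into the current LION folder.
import json

# comet-ml (tested): read from ./.comet_api
with open(".comet_api", "w") as f:
    json.dump({"api_key": "...", "project_name": "lion", "workspace": "..."}, f)

# wandb (not tested upstream): remember to `export USE_WB=1` before training
with open(".wandb_api", "w") as f:
    json.dump({"project": "...", "entity": "..."}, f)
```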
- download the test data (Table 1) from here, unzip it, and put it as `./datasets/test_data/`
- download the released checkpoint from above

```bash
checkpoint="./lion_ckpt/unconditional/airplane/checkpoints/model.pt"
bash ./script/eval.sh $checkpoint  # will take 1-2 hours
```
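If the eval script complains about the checkpoint, a minimal sketch to confirm the file loads and to see what it contains (the key names are not guaranteed; this just prints whatever is stored):

```python
# inspect_ckpt.py -- hypothetical helper; loads the released checkpoint on CPU
# and lists its top-level keys.
import torch

ckpt_path = "./lion_ckpt/unconditional/airplane/checkpoints/model.pt"
ckpt = torch.load(ckpt_path, map_location="cpu")
if isinstance(ckpt, dict):
    print("top-level keys:", list(ckpt.keys()))
else:
    print("checkpoint object type:", type(ckpt))
```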
- ShapeNet-Vol test data:
  - please check here before using this data
  - all categories: 1000 shapes are sampled from the full validation set
  - chair, airplane, car
- Table 21 and Table 20: PointFlow test data
- download the test data from here, unzip it, and put it as `./datasets/test_data/`
- run `python ./script/compute_score.py` (Note: for the ShapeNet-Vol data and Tables 21 and 20, you need to set `norm_box=True`)
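The `norm_box` flag presumably normalizes shapes into a common bounding box before the metrics are computed; check `./script/compute_score.py` for the exact behavior. For intuition only, a minimal sketch of one common convention (centering each cloud and scaling it into a unit cube), which is an assumption rather than the repo's implementation:

```python
# norm_box_sketch.py -- hypothetical illustration only; see ./script/compute_score.py
# for what norm_box actually does.
import numpy as np

def normalize_to_unit_box(points: np.ndarray) -> np.ndarray:
    """Center an (N, 3) point cloud and scale its bounding box into [-0.5, 0.5]^3."""
    mins, maxs = points.min(axis=0), points.max(axis=0)
    center = (mins + maxs) / 2.0
    scale = (maxs - mins).max()
    return (points - center) / scale

if __name__ == "__main__":
    cloud = np.random.rand(2048, 3) * 7.0 + 3.0  # placeholder data
    normed = normalize_to_unit_box(cloud)
    print(normed.min(axis=0), normed.max(axis=0))
```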
```bibtex
@inproceedings{zeng2022lion,
  title={LION: Latent Point Diffusion Models for 3D Shape Generation},
  author={Xiaohui Zeng and Arash Vahdat and Francis Williams and Zan Gojcic and Or Litany and Sanja Fidler and Karsten Kreis},
  booktitle={Advances in Neural Information Processing Systems (NeurIPS)},
  year={2022}
}
```
