I'm sorry to trouble you again, but could you explain how to implement multi-GPU training with this code? I noticed that the training speed with 8 GPUs seems to be the same as when using just one GPU. Also, is there an implementation available that utilizes PyTorch's DistributedDataParallel?
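For context, what I had in mind is something like the minimal DistributedDataParallel sketch below (the model, dataset, and script name are placeholders, not taken from this repo), launched with `torchrun --nproc_per_node=8 train_ddp.py`:

```python
# Minimal DDP sketch -- placeholder model/data, not this repo's code.
import os
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP
from torch.utils.data import DataLoader, DistributedSampler, TensorDataset

def main():
    # torchrun sets RANK, LOCAL_RANK, and WORLD_SIZE for each process.
    dist.init_process_group(backend="nccl")
    local_rank = int(os.environ["LOCAL_RANK"])
    torch.cuda.set_device(local_rank)

    # Placeholder model and dataset; the real ones would come from this repo.
    model = torch.nn.Linear(10, 1).cuda(local_rank)
    model = DDP(model, device_ids=[local_rank])

    dataset = TensorDataset(torch.randn(1024, 10), torch.randn(1024, 1))
    sampler = DistributedSampler(dataset)  # shards the data across ranks
    loader = DataLoader(dataset, batch_size=32, sampler=sampler)

    optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
    loss_fn = torch.nn.MSELoss()

    for epoch in range(2):
        sampler.set_epoch(epoch)  # different shuffling each epoch
        for x, y in loader:
            x, y = x.cuda(local_rank), y.cuda(local_rank)
            optimizer.zero_grad()
            loss = loss_fn(model(x), y)
            loss.backward()  # gradients are all-reduced across GPUs here
            optimizer.step()

    dist.destroy_process_group()

if __name__ == "__main__":
    main()
```

Is this roughly the right pattern to adapt for this codebase, or is there an existing DDP entry point I've missed?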