
Commit 66d9566

Author: Quentin Anthony
Add distributed example without multiprocessing to README
1 parent: d5fc6b5

File tree

1 file changed: +9 −0 lines changed

imagenet/README.md

Lines changed: 9 additions & 0 deletions

````diff
@@ -45,6 +45,15 @@ Node 1:
 python main.py -a resnet50 --dist-url 'tcp://IP_OF_NODE0:FREEPORT' --dist-backend 'nccl' --multiprocessing-distributed --world-size 2 --rank 1 [imagenet-folder with train and val folders]
 ```
 
+## Distributed Data Parallel Training
+
+If you wish to disable PyTorch's multiprocessing module and manage the processes manually (e.g. with MPI), you must specify the `world-size`, `rank`, and `gpu` values yourself. For example:
+
+```bash
+python main.py ... --world-size 2 --rank 0 --gpu 0 [imagenet-folder with train and val folders] &
+python main.py ... --world-size 2 --rank 1 --gpu 1 [imagenet-folder with train and val folders] &
+```
+
 ## Usage
 
 ```
````
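The pattern the added README lines describe (one `main.py` process per GPU, each with its own `--rank`) can be sketched as a small launcher loop. This is a hypothetical helper, not part of the commit: the `...` placeholder from the README stands for the remaining `main.py` flags and dataset path and is deliberately left unfilled, so the sketch echoes the commands rather than executing them.

```shell
#!/bin/sh
# Hypothetical launcher sketch: one training process per GPU, with rank i
# pinned to GPU i. The '...' placeholder stands for the remaining main.py
# flags (e.g. -a resnet50 and the imagenet folder); commands are echoed
# instead of run so the sketch stays self-contained.
WORLD_SIZE=2

launch() {
  rank="$1"
  # Each process gets a unique rank and is pinned to the matching GPU index.
  echo "python main.py ... --world-size $WORLD_SIZE --rank $rank --gpu $rank &"
}

rank=0
while [ "$rank" -lt "$WORLD_SIZE" ]; do
  launch "$rank"
  rank=$((rank + 1))
done
```

Under an MPI launcher such as Open MPI's `mpirun`, each process can instead derive these values from the environment: Open MPI exports `OMPI_COMM_WORLD_RANK` and `OMPI_COMM_WORLD_SIZE`, so the rank and world size need not be hard-coded per process.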
