NVLink vs PCIe
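NCCL_P2P_LEVEL=NVL tells NCCL to use peer-to-peer (P2P) communication only for GPU pairs that are connected by NVLink.
NCCL_P2P_LEVEL sets the maximum topological distance at which NCCL will still use the P2P transport, so NVL places the cutoff
at the NVLink level. GPU pairs without an NVLink connection do not use direct P2P over PCIe under this setting; NCCL instead falls
back to another transport, typically shared memory staged through host memory. (Other possible values for NCCL_P2P_LEVEL include
PIX, PXB, PHB, and SYS, which correspond to the topologies we described earlier, plus LOC, which disables P2P entirely. NVL is the
setting that restricts P2P to the fastest interconnect.)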
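
As a minimal sketch of how this setting might be applied from a Python launcher, the helper below validates a level name and
writes it into the environment. The function name set_nccl_p2p_level and the P2P_LEVELS tuple are hypothetical names introduced
here for illustration; only the NCCL_P2P_LEVEL variable and its level names come from NCCL itself. The variable has to be set
before NCCL initializes (for example, before the training framework creates its first communicator) to have any effect.

```python
import os

# Level names accepted by NCCL_P2P_LEVEL, listed from the shortest allowed
# GPU-to-GPU distance (LOC: never use P2P) to the longest (SYS: across NUMA nodes).
P2P_LEVELS = ("LOC", "NVL", "PIX", "PXB", "PHB", "SYS")


def set_nccl_p2p_level(level, env=os.environ):
    """Validate `level` and store it as NCCL_P2P_LEVEL in `env`.

    Hypothetical helper for illustration; `env` defaults to the process
    environment but any dict-like mapping can be passed instead.
    """
    normalized = level.strip().upper()
    if normalized not in P2P_LEVELS:
        raise ValueError(
            f"unknown NCCL_P2P_LEVEL {level!r}; expected one of {P2P_LEVELS}"
        )
    env["NCCL_P2P_LEVEL"] = normalized
    return normalized
```

Calling set_nccl_p2p_level("NVL") at the top of a launch script, before any NCCL communicator is created, is one way to apply the
setting. Running with NCCL_DEBUG=INFO as well should make NCCL log the transport it selects for each connection, which is a
reasonable way to check whether a given GPU pair ended up on P2P or on a fallback path.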