Weichen Xu, Tianhao Fu, Jian Cao, Xinyu Zhao, Xinxin Xu, Xixin Cao, Xing Zhang
Peking University, Beijing 100871, China
In pcdet/models/dense_heads/pretrain_head_3D_seal.py, we provide the implementation of the High-level Voxel Feature Generation Module, which covers data extraction, voxelization, and the computation of the target and loss.
In pcdet/models/backbones_3d/I2Mask.py and pcdet/models/backbones_3d/dsvt_backbone_mae.py, we provide the implementation of Inter-class and Intra-class Discrimination-guided Masking (I$^2$Mask).
In pcdet/utils/cka_alhpa.py and pcdet/models/dense_heads/pretrain_head_3D_seal.py (L172), we provide the implementation of CKA-guided Hierarchical Reconstruction.
In pcdet/models/dense_heads/pretrain_head_3D_seal.py (L167, def differential_gated_progressive_learning), we provide the implementation of Differential-gated Progressive Learning.
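For readers unfamiliar with CKA (Centered Kernel Alignment), the sketch below computes the standard linear CKA similarity between two sets of features in plain Python. It is an illustration of the general measure only, not the code in pcdet/utils/cka_alhpa.py; the function names are hypothetical, and how the repository turns CKA scores into reconstruction weights is not shown here.

```python
import math


def gram(features):
    """Linear-kernel Gram matrix K[i][j] = <x_i, x_j> for n feature vectors."""
    return [[sum(a * b for a, b in zip(xi, xj)) for xj in features] for xi in features]


def center(k):
    """Double-center a square matrix: subtract row and column means, add the grand mean."""
    n = len(k)
    row_means = [sum(row) / n for row in k]
    col_means = [sum(k[i][j] for i in range(n)) / n for j in range(n)]
    grand = sum(row_means) / n
    return [[k[i][j] - row_means[i] - col_means[j] + grand for j in range(n)]
            for i in range(n)]


def frobenius_inner(a, b):
    """Sum of element-wise products of two equally sized matrices."""
    return sum(x * y for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def linear_cka(x, y):
    """Linear CKA between two feature sets with the same number of samples.

    x and y are lists of n feature vectors; their feature dimensions may differ.
    """
    if len(x) != len(y):
        raise ValueError("x and y must contain the same number of samples")
    kx, ky = center(gram(x)), center(gram(y))
    norm = math.sqrt(frobenius_inner(kx, kx) * frobenius_inner(ky, ky))
    if norm == 0.0:
        raise ValueError("CKA is undefined for constant features")
    return frobenius_inner(kx, ky) / norm
```

The score lies in [0, 1] and is unchanged by isotropic scaling or rotation of either feature set, which is why it is commonly used to compare representations across layers.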
We provide all configuration files used in the paper and appendix in tools/cfgs/gpicture_models/.
Pre-training
Waymo
Model | Pre-train Fraction | Pre-train model | Log |
---|---|---|---|
GPICTURE (DSVT) | 20% | ckpt | Log |
GPICTURE (DSVT) | 100% | ckpt | Log |
nuScenes
Model | Pre-train model | Log |
---|---|---|
GPICTURE (DSVT) | ckpt | Log |
SemanticKITTI
Model | Pre-train model | Log |
---|---|---|
GPICTURE (DSVT) | ckpt | Log |
Fine-tuning
3D Object Detection (on Waymo validation)
Model | Pre-train Fraction | mAP/H_L2 | Veh_L2 | Ped_L2 | Cyc_L2 | ckpt | Log |
---|---|---|---|---|---|---|---|
DSVT (GPICTURE) | 20% | 73.84/71.75 | 71.55/71.22 | 75.99/70.61 | 73.98/73.42 | ckpt | Log |
DSVT (GPICTURE) | 100% | 75.55/73.13 | 73.38/72.87 | 77.52/72.01 | 75.75/74.51 | ckpt | Log |
3D Object Detection (on nuScenes validation)
Model | mAP | NDS | mATE | mASE | mAOE | mAVE | mAAE | ckpt | Log |
---|---|---|---|---|---|---|---|---|---|
DSVT (GPICTURE) | 68.6 | 73.0 | 25.5 | 23.8 | 25.8 | 20.7 | 17.4 | ckpt | Log |
3D Semantic Segmentation (on nuScenes validation)
Model | mIoU | Bicycle | Bus | Car | Motorcycle | Pedestrian | Trailer | Truck | ckpt | Log |
---|---|---|---|---|---|---|---|---|---|---|
Cylinder3D-SST (GPICTURE) | 79.7 | 43.6 | 94.8 | 96.5 | 81.0 | 84.4 | 65.8 | 87.7 | ckpt | Log |
3D Semantic Segmentation (on SemanticKITTI validation)
Model | mIoU | Car | Bicycle | Motorcycle | Truck | Other-vehicle | Person | ckpt | Log |
---|---|---|---|---|---|---|---|---|---|
Cylinder3D-SST (GPICTURE) | 64.7 | 96.4 | 50.3 | 69.8 | 84.8 | 50.9 | 72.4 | ckpt | Log |
Occupancy Prediction (on nuScenes OpenOccupancy validation)
Model | mIoU | Bicycle | Bus | Car | Motorcycle | Pedestrian | Trailer | Truck | ckpt | Log |
---|---|---|---|---|---|---|---|---|---|---|
DSVT (GPICTURE) | 18.8 | 8.4 | 16.2 | 21.1 | 7.9 | 12.8 | 15.9 | 16.3 | ckpt | Log |
After downloading, please put it into the project path.
Waymo:
1. Download the Waymo dataset from the official Waymo website, making sure to download version 1.2.0 of the Perception Dataset.
2. Prepare the directory as follows:
GPICTURE
├── data
│   ├── waymo
│   │   ├── ImageSets
│   │   ├── raw_data
│   │   │   ├── segment-xxxxxxxx.tfrecord
│   │   │   ├── ...
├── pcdet
├── tools
3. Prepare the environment and install waymo-open-dataset:
pip install waymo-open-dataset-tf-2-5-0
4. Generate the complete dataset. This requires approximately 1 TB of disk space and 100 GB of RAM.
# only for single-frame setting
python -m pcdet.datasets.waymo.waymo_dataset --func create_waymo_infos \
--cfg_file tools/cfgs/dataset_configs/waymo_dataset.yaml
# for single-frame or multi-frame setting
python -m pcdet.datasets.waymo.waymo_dataset --func create_waymo_infos --cfg_file tools/cfgs/dataset_configs/waymo_dataset_multiframe.yaml
# Ignore 'CUDA_ERROR_NO_DEVICE' error as this process does not require GPU.
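Before launching the generation step, it can save time to confirm that the directory layout and free disk space match the requirements above. The following is a minimal sketch using only the Python standard library; the helper names and the WAYMO_LAYOUT list are illustrative assumptions, not part of the repository.

```python
import shutil
from pathlib import Path

# Relative paths taken from the Waymo directory tree above (hypothetical constant).
WAYMO_LAYOUT = ["data/waymo/ImageSets", "data/waymo/raw_data", "pcdet", "tools"]


def missing_paths(project_root, required):
    """Return the required relative paths that do not exist under project_root."""
    root = Path(project_root)
    return [rel for rel in required if not (root / rel).exists()]


def has_free_space(path, required_bytes):
    """Return True if the filesystem holding path has at least required_bytes free."""
    return shutil.disk_usage(path).free >= required_bytes


if __name__ == "__main__":
    root = "."  # assumption: the script is run from the GPICTURE project root
    print("missing paths:", missing_paths(root, WAYMO_LAYOUT))
    # Roughly 1 TB, following the estimate given for the generation step.
    print("enough disk space:", has_free_space(root, 10**12))
```

The same missing_paths helper can be reused for the other datasets by passing the paths from their respective directory trees.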
nuScenes:
1. Download the trainval dataset from nuScenes and prepare the directory as follows:
GPICTURE
├── data
│   ├── nuscenes
│   │   ├── v1.0-trainval
│   │   │   ├── samples
│   │   │   ├── sweeps
│   │   │   ├── maps
│   │   │   ├── v1.0-trainval
├── pcdet
├── tools
2. Prepare the environment and install nuscenes-devkit:
pip install nuscenes-devkit==1.0.5
3. Generate the complete dataset.
# for lidar-only setting
python -m pcdet.datasets.nuscenes.nuscenes_dataset --func create_nuscenes_infos --cfg_file tools/cfgs/dataset_configs/nuscenes_dataset.yaml --version v1.0-trainval
nuScenes Lidarseg:
1. Download the annotation files from nuScenes and prepare the directory as follows:
GPICTURE
├── data
│   ├── nuscenes
│   │   ├── v1.0-trainval
│   │   │   ├── samples
│   │   │   ├── sweeps
│   │   │   ├── maps
│   │   │   ├── v1.0-trainval
│   │   │   │   ├── lidarseg.json
│   │   │   │   ├── category.json
│   │   ├── lidarseg
│   │   │   ├── v1.0-trainval
├── pcdet
├── tools
nuScenes OpenOccupancy:
1. Download the annotation files from OpenOccupancy and prepare the directory as follows:
GPICTURE
├── data
│   ├── nuscenes
│   │   ├── v1.0-trainval
│   │   │   ├── samples
│   │   │   ├── sweeps
│   │   │   ├── maps
│   │   │   ├── v1.0-trainval
│   │   │   │   ├── lidarseg.json
│   │   │   │   ├── category.json
│   ├── nuScenes-Occupancy
├── pcdet
├── tools
2. Prepare the environment:
conda install -c omgarcia gcc-6 # gcc-6.2
pip install mmcv-full==1.4.0
pip install mmdet==2.14.0
pip install mmsegmentation==0.14.1
# Install mmdet3d from source code.
git clone https://github.com/open-mmlab/mmdetection3d.git
cd mmdetection3d
git checkout v0.17.1 # Other versions may not be compatible.
python setup.py install
# Install occupancy pooling.
git clone https://github.com/JeffWang987/OpenOccupancy.git
cd OpenOccupancy
export PYTHONPATH="."
python setup.py develop
SemanticKITTI:
1. Obtain the SemanticKITTI dataset from SemanticKITTI and prepare the directory as follows:
GPICTURE
├── data
│   ├── semantickitti
│   │   ├── sequences
│   │   │   ├── 00
│   │   │   │   ├── labels
│   │   │   │   ├── velodyne
│   │   │   ├── 01
│   │   │   ├── ..
│   │   │   ├── 22
├── pcdet
├── tools
2. Generate the complete dataset.
python -m pcdet.datasets.kitti.semantickitti_dataset --func create_semantickitti_infos --cfg_file tools/cfgs/dataset_configs/semantickitti_dataset.yaml
The folder structure after processing should be as follows:
GPICTURE
├── data
│   ├── semantickitti
│   │   ├── sequences
│   │   │   ├── 00
│   │   │   │   ├── labels
│   │   │   │   ├── velodyne
│   │   │   ├── 01
│   │   │   ├── ..
│   │   │   ├── 22
│   │   ├── semantickitti_infos_train.pkl
│   │   ├── semantickitti_infos_val.pkl
├── pcdet
├── tools
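To confirm that the generated info files are readable, a quick inspection along the following lines may help. This is a sketch under assumptions: the internal structure of semantickitti_infos_train.pkl is not documented here, so the helper (a hypothetical name) only reports the top-level container type and its length.

```python
import pickle


def summarize_infos(pkl_path):
    """Load a generated info file and return (container type name, length or None).

    Only unpickle files you generated yourself; pickle can execute arbitrary
    code when loading untrusted data.
    """
    with open(pkl_path, "rb") as f:
        infos = pickle.load(f)
    size = len(infos) if hasattr(infos, "__len__") else None
    return type(infos).__name__, size


if __name__ == "__main__":
    # Path taken from the processed folder structure above,
    # relative to the GPICTURE project root.
    print(summarize_infos("data/semantickitti/semantickitti_infos_train.pkl"))
```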
1. Create the environment and install PyTorch
conda create --name gpicture python=3.8
conda activate gpicture
# install pytorch
conda install pytorch==1.10.1 torchvision==0.11.2 torchaudio==0.10.1 cudatoolkit=11.1 -c pytorch -c conda-forge
# Verify that PyTorch is installed (run the following in a Python interpreter)
import torch  # If normal, remains silent
print(torch.cuda.is_available())  # If normal, prints "True"
a = torch.Tensor([1.])  # If normal, remains silent
a.cuda()  # If normal, returns "tensor([ 1.], device='cuda:0')"
from torch.backends import cudnn  # If normal, remains silent
cudnn.is_acceptable(a.cuda())  # If normal, returns "True"
2. Install OpenPCDet
# install spconv
pip install spconv-cu111
# install requirements
pip install -r requirements.txt
# setup
python setup.py develop
3. Install other packages
# install other packages
pip install torch_scatter
pip install nuscenes-devkit==1.0.5
pip install open3d
# install the Python package for evaluating the Waymo dataset
pip install waymo-open-dataset-tf-2-5-0==1.4.1
# pay attention to specific package versions.
pip install pandas==1.4.3
pip install matplotlib==3.6.2
pip install scikit-image==0.19.3
pip install async-lru==1.0.3
# install CUDA extensions
cd common_ops
pip install .
4. Install MinkowskiEngine
# install MinkowskiEngine
pip install ninja
conda install openblas-devel -c anaconda
git clone https://github.com/NVIDIA/MinkowskiEngine.git
cd MinkowskiEngine
python setup.py install --blas_include_dirs=${CONDA_PREFIX}/include --blas=openblas
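Because several packages above are pinned to specific versions, it can be useful to check the active environment against those pins after installation. The sketch below uses importlib.metadata from the standard library (Python 3.8+); the PINS dictionary copies the versions listed above, and the helper names are hypothetical.

```python
from importlib import metadata

# Versions copied from the pip commands above.
PINS = {
    "nuscenes-devkit": "1.0.5",
    "pandas": "1.4.3",
    "matplotlib": "3.6.2",
    "scikit-image": "0.19.3",
    "async-lru": "1.0.3",
}


def installed_version(name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def find_mismatches(pins, lookup=installed_version):
    """Map each package whose installed version differs to (wanted, found)."""
    mismatches = {}
    for name, wanted in pins.items():
        found = lookup(name)
        if found != wanted:
            mismatches[name] = (wanted, found)
    return mismatches


if __name__ == "__main__":
    for name, (wanted, found) in find_mismatches(PINS).items():
        print(f"{name}: expected {wanted}, found {found}")
```

The lookup parameter exists only so the comparison logic can be exercised without touching the real environment.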
1. Prepare the coords and feats inputs.
cd tools/
python train.py --cfg_file cfgs/gpicture_models/gpicture_waymo_ssl_seal_generate_input.yaml
# or
python train.py --cfg_file cfgs/gpicture_models/gpicture_nuscenes_ssl_seal_generate_input.yaml
2. Use the MinkUNet (Res16UNet34C) pre-trained by Seal to generate the Seal features.
cd tools/
python prepare_seal_output.py
We provide the configuration files used in the paper and appendix in tools/cfgs/gpicture_models/.
Pre-training
Waymo
cd tools/
python train.py --cfg_file cfgs/gpicture_models/gpicture_waymo_ssl_seal_decoder_mask.yaml
nuScenes
cd tools/
python train.py --cfg_file cfgs/gpicture_models/gpicture_nuscenes_ssl_seal_decoder_mask.yaml
SemanticKITTI
cd tools/
python train.py --cfg_file cfgs/gpicture_models/gpicture_semantickitti_ssl_seal_decoder_mask.yaml
Fine-tuning
3D Object Detection on Waymo:
cd tools/
python train.py --cfg_file cfgs/gpicture_models/gpicture_waymo_detection.yaml --pretrained_model /path/of/pretrain/model.pth
3D Object Detection on nuScenes:
cd tools/
python train.py --cfg_file cfgs/gpicture_models/gpicture_nuscenes_detection.yaml --pretrained_model /path/of/pretrain/model.pth
3D Semantic Segmentation:
# nuScenes
cd tools/
python train.py --cfg_file cfgs/gpicture_models/gpicture_nuscenes_segmentation.yaml --pretrained_model /path/of/pretrain/model.pth
# SemanticKITTI
cd tools/
python train.py --cfg_file cfgs/gpicture_models/gpicture_semantickitti_segmentation.yaml --pretrained_model /path/of/pretrain/model.pth
Occupancy Prediction:
cd tools/
python train.py --cfg_file cfgs/gpicture_models/gpicture_nuscenes_occupancy.yaml --pretrained_model /path/of/pretrain/model.pth
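When running several of the commands above in sequence, a small wrapper can reduce copy-paste mistakes. The following sketch only assembles the argument lists shown above; the dictionaries and the build_train_command helper are hypothetical conveniences, not part of the repository, and the config file names are copied from the commands in this section.

```python
CONFIG_DIR = "cfgs/gpicture_models"

# Pre-training configs, keyed by dataset.
PRETRAIN_CONFIGS = {
    "waymo": "gpicture_waymo_ssl_seal_decoder_mask.yaml",
    "nuscenes": "gpicture_nuscenes_ssl_seal_decoder_mask.yaml",
    "semantickitti": "gpicture_semantickitti_ssl_seal_decoder_mask.yaml",
}

# Fine-tuning configs, keyed by (dataset, task).
FINETUNE_CONFIGS = {
    ("waymo", "detection"): "gpicture_waymo_detection.yaml",
    ("nuscenes", "detection"): "gpicture_nuscenes_detection.yaml",
    ("nuscenes", "segmentation"): "gpicture_nuscenes_segmentation.yaml",
    ("semantickitti", "segmentation"): "gpicture_semantickitti_segmentation.yaml",
    ("nuscenes", "occupancy"): "gpicture_nuscenes_occupancy.yaml",
}


def build_train_command(config_name, pretrained_model=None):
    """Return the train.py argument list for a config, to be run from tools/."""
    cmd = ["python", "train.py", "--cfg_file", f"{CONFIG_DIR}/{config_name}"]
    if pretrained_model is not None:
        cmd += ["--pretrained_model", pretrained_model]
    return cmd


if __name__ == "__main__":
    # Print the fine-tuning command for nuScenes detection without executing it.
    cfg = FINETUNE_CONFIGS[("nuscenes", "detection")]
    print(" ".join(build_train_command(cfg, "/path/of/pretrain/model.pth")))
```

To launch a run, the returned list could be passed to subprocess.run(cmd, cwd="tools", check=True), which mirrors the cd tools/ step used throughout this section.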