PointTransformerV3¶
PointTransformerV3 is a LiDAR backbone for LiDAR 3D semantic segmentation and
3D object detection. For detection, PTv3 point features are projected into BEV
and consumed by either CenterHead or TransFusionHead.
The training path uses flash-attn for serialized attention. The export path
automatically disables flash attention and falls back to the standard attention
implementation so ONNX export remains supported.
Summary¶
| Property | Value |
|---|---|
| Task | 3D semantic segmentation, 3D object detection |
| Modality | LiDAR |
| Input | Point cloud |
| Output | Point-wise semantic labels or 3D boxes/scores/classes |
| Architecture | PointTransformerV3 with sparse convolution stem and task heads |
| Datasets | NuScenes, T4Dataset |
Available Configurations¶
| Config Name | Task | Dataset | Head | Range | Purpose |
|---|---|---|---|---|---|
segmentation3d/ptv3/voxel005_102m_nuscenes |
segmentation3d | NuScenes | Linear | 102 m | Standard NuScenes segmentation |
segmentation3d/ptv3/voxel005_128m_t4dataset_j6gen2 |
segmentation3d | T4Dataset | Linear | 128 m | Standard T4Dataset segmentation |
segmentation3d/ptv3/voxel012_122m_t4dataset_j6gen2 |
segmentation3d | T4Dataset | Linear | 122 m | Lightweight T4Dataset segmentation |
detection3d/ptv3/centerhead_voxel005_128m_t4dataset_j6gen2 |
detection3d | T4Dataset | CenterHead | 128 m | Dense CenterPoint-style detection |
detection3d/ptv3/centerhead_voxel012_122m_t4dataset_j6gen2 |
detection3d | T4Dataset | CenterHead | 122 m | Tiny-backbone detection |
detection3d/ptv3/transhead_voxel005_128m_t4dataset_j6gen2 |
detection3d | T4Dataset | TransFusion | 128 m | Query-based detection |
detection3d/ptv3/transhead_voxel012_122m_t4dataset_j6gen2 |
detection3d | T4Dataset | TransFusion | 122 m | Tiny query-based detection |
detection3d/ptv3/transhead_voxel012_122m_t4dataset_j6gen2_optuna |
detection3d | T4Dataset | TransFusion | 122 m | TransFusion hyperparameter search |
Training¶
autoware-ml train --config-name segmentation3d/ptv3/voxel005_102m_nuscenes
autoware-ml train --config-name segmentation3d/ptv3/voxel005_128m_t4dataset_j6gen2
autoware-ml train --config-name detection3d/ptv3/centerhead_voxel005_128m_t4dataset_j6gen2
autoware-ml train --config-name detection3d/ptv3/transhead_voxel012_122m_t4dataset_j6gen2
For a pipeline validation run:
autoware-ml train \
--config-name segmentation3d/ptv3/voxel005_102m_nuscenes \
+trainer.fast_dev_run=true
Evaluation¶
autoware-ml test \
--config-name segmentation3d/ptv3/voxel005_102m_nuscenes \
--weights mlruns/segmentation3d/ptv3/voxel005_102m_nuscenes/<run_id>/artifacts/checkpoints/best.ckpt
autoware-ml test \
--config-name detection3d/ptv3/transhead_voxel012_122m_t4dataset_j6gen2 \
--weights mlruns/detection3d/ptv3/transhead_voxel012_122m_t4dataset_j6gen2/<run_id>/artifacts/checkpoints/best.ckpt
Deployment¶
PointTransformerV3 ONNX export is available. The generic TensorRT stage remains disabled in Autoware-ML because PTv3 requires a runtime with matching sparse convolution plugins.
autoware-ml deploy \
--config-name segmentation3d/ptv3/voxel005_102m_nuscenes \
--weights mlruns/segmentation3d/ptv3/voxel005_102m_nuscenes/<run_id>/artifacts/checkpoints/best.ckpt \
deploy.tensorrt.enabled=false
autoware-ml deploy \
--config-name detection3d/ptv3/centerhead_voxel005_128m_t4dataset_j6gen2 \
--weights mlruns/detection3d/ptv3/centerhead_voxel005_128m_t4dataset_j6gen2/<run_id>/artifacts/checkpoints/best.ckpt \
deploy.tensorrt.enabled=false
The deployment command switches PTv3 attention blocks into non-flash export mode automatically.
The exported ONNX model returns both pred_labels and pred_probs. The
probability output is produced by a final softmax layer, while training and
evaluation continue to use logits inside the Lightning model.
Deployment uses an explicit PTv3 export wrapper and a copied backbone, so the training model is not mutated when export-only sparse-convolution and serialization settings are applied.
Implementation¶
| Path | Description |
|---|---|
autoware_ml/models/segmentation3d/ptv3.py |
PTv3 Lightning model wrapper |
autoware_ml/models/detection3d/ptv3.py |
PTv3 BEV detection model wrapper |
autoware_ml/models/detection3d/heads/centerpoint.py |
CenterHead detection head |
autoware_ml/models/detection3d/heads/transfusion.py |
TransFusion detection head |
autoware_ml/models/segmentation3d/backbones/ptv3.py |
Reusable PTv3 backbone components |
autoware_ml/utils/point_cloud/ |
Shared point-cloud utilities and serialization |
autoware_ml/ops/segment/segment_csr.py |
Segment reduction export operator |
autoware_ml/losses/segmentation3d/ |
Segmentation losses used by PTv3 |
autoware_ml/datamodule/nuscenes/segmentation3d.py |
NuScenes datamodule |
autoware_ml/datamodule/t4dataset/segmentation3d.py |
T4Dataset datamodule |
autoware_ml/datamodule/t4dataset/detection3d.py |
T4Dataset 3D detection datamodule |
autoware_ml/transforms/point_cloud/ |
Shared point-cloud transforms used by PTv3 |
autoware_ml/configs/tasks/segmentation3d/ptv3/ |
Task configurations |
autoware_ml/configs/tasks/detection3d/ptv3/ |
Detection task configurations |
Acknowledgment¶
The Autoware-ML PointTransformerV3 implementation was ported from the official PointTransformerV3 project by Pointcept.
- Repository: https://github.com/Pointcept/PointTransformerV3
- License: MIT
- Paper: Wu, Xiaoyang, et al. "Point Transformer V3: Simpler, Faster, Stronger." CVPR, 2024.