3D Object Detection for Autonomous Driving in PyTorch, trained on the KITTI dataset.
Master of Science Thesis in Electrical Engineering, Linköping University, 2018
To train models and to run pretrained models, you can use an Ubuntu 16.04 P4000 VM with 250 GB SSD on Paperspace. Below I have listed what I needed to do in order to get started, and some things I found useful.
Install docker-ce:
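A sketch using Docker's convenience install script (see the official Docker docs for the full Ubuntu 16.04 instructions):
$ curl -fsSL https://get.docker.com -o get-docker.sh
$ sudo sh get-docker.sh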
Install CUDA drivers:
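For example, via the graphics-drivers PPA (a sketch; the exact driver package is an assumption, pick a version that supports CUDA 9):
$ sudo add-apt-repository ppa:graphics-drivers/ppa
$ sudo apt-get update
$ sudo apt-get install nvidia-384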
Install nvidia-docker:
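A sketch for nvidia-docker 1.x (the release version is an assumption; check the nvidia-docker GitHub releases page):
$ wget -P /tmp https://github.com/NVIDIA/nvidia-docker/releases/download/v1.0.1/nvidia-docker_1.0.1-1_amd64.deb
$ sudo dpkg -i /tmp/nvidia-docker_1.0.1-1_amd64.deb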
Download the PyTorch 0.4 docker image:
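$ sudo docker pull pytorch/pytorch:0.4_cuda9_cudnn7 (the same image name is used in start_docker_image.sh below)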
Create start_docker_image.sh containing:
#!/bin/bash

# DEFAULT VALUES
GPUIDS="0"
NAME="paperspace_GPU"

# Run the PyTorch image with GPU access (NV_GPU selects which GPUs nvidia-docker
# makes visible), expose port 5584, and mount the host home folder at /root/
# inside the container:
NV_GPU="$GPUIDS" nvidia-docker run -it --rm \
        -p 5584:5584 \
        --name "$NAME""$GPUIDS" \
        -v /home/paperspace:/root/ \
        pytorch/pytorch:0.4_cuda9_cudnn7 bash
Inside the image, /root/ will now be mapped to /home/paperspace (i.e., $ cd -- takes you to the regular home folder).
To start the image:
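$ sudo sh start_docker_image.sh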
To commit changes to the image:
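$ sudo docker commit paperspace_GPU0 pytorch/pytorch:0.4_cuda9_cudnn7 (where "paperspace_GPU0" is the container name set by start_docker_image.sh)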
To stop the image when it’s running:
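$ sudo docker stop paperspace_GPU0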
To exit the image without killing running code:
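Press Ctrl + P followed by Ctrl + Q.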
To get back into a running image:
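$ sudo docker attach paperspace_GPU0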
To open more than one terminal window at the same time:
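$ sudo docker exec -it paperspace_GPU0 bash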
To install the needed software inside the docker image:
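For example (the exact package list is an assumption; install whatever the training/evaluation scripts report as missing):
$ apt-get update
$ pip install opencv-python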
Do the following outside of the docker image:
KITTI train:
KITTI val:
KITTI train random:
KITTI test:
pretrained_models/model_37_2_epoch_400.pth:
pretrained_models/model_32_2_epoch_299.pth:
pretrained_models/model_38_2_epoch_400.pth:
pretrained_models/model_10_2_epoch_400.pth:
SSH into the paperspace server.
$ sudo sh start_docker_image.sh
$ cd --
$ python 3DOD_thesis/Frustum-PointNet/eval_frustum_pointnet_val.py
validation loss: 0.667806
validation TNet loss: 0.0494426
validation InstanceSeg loss: 0.193783
validation BboxNet loss: 0.163053
validation BboxNet size loss: 0.0157994
validation BboxNet center loss: 0.0187426
validation BboxNet heading class loss: 0.096926
validation BboxNet heading regr loss: 0.00315847
validation heading class accuracy: 0.959445
validation corner loss: 0.0261527
validation accuracy: 0.921544
validation precision: 0.887209
validation recall: 0.949744
validation f1: 0.917124
SSH into the paperspace server.
$ sudo sh start_docker_image.sh
$ cd --
$ python 3DOD_thesis/Frustum-PointNet/eval_frustum_pointnet_val_seq.py
validation loss: 0.781812
validation TNet loss: 0.0352736
validation InstanceSeg loss: 0.292994
validation BboxNet loss: 0.158156
validation BboxNet size loss: 0.0182432
validation BboxNet center loss: 0.0204534
validation BboxNet heading class loss: 0.0838291
validation BboxNet heading regr loss: 0.00356304
validation heading class accuracy: 0.9675
validation corner loss: 0.0295388
validation accuracy: 0.865405
validation precision: 0.83858
validation recall: 0.924499
validation f1: 0.879015
SSH into the paperspace server.
$ sudo sh start_docker_image.sh
$ cd --
$ python 3DOD_thesis/Frustum-PointNet/eval_frustum_pointnet_test.py
SSH into the paperspace server.
$ sudo sh start_docker_image.sh
$ cd --
$ python 3DOD_thesis/Frustum-PointNet/eval_frustum_pointnet_test_seq.py
SSH into the paperspace server.
$ sudo sh start_docker_image.sh
$ cd --
$ python 3DOD_thesis/Frustum-PointNet/eval_frustum_pointnet_val_2ddetections.py
SSH into the paperspace server.
$ sudo sh start_docker_image.sh
$ cd --
$ python 3DOD_thesis/Extended-Frustum-PointNet/eval_frustum_pointnet_img_val.py
validation loss: 0.418462
validation TNet loss: 0.047026
validation InstanceSeg loss: 0.181566
validation BboxNet loss: 0.0217167
validation BboxNet size loss: 0.0020278
validation BboxNet center loss: 0.0168909
validation BboxNet heading class loss: 0.00148923
validation BboxNet heading regr loss: 0.000130879
validation heading class accuracy: 0.999694
validation corner loss: 0.0168153
validation accuracy: 0.927203
validation precision: 0.893525
validation recall: 0.954732
validation f1: 0.921978
SSH into the paperspace server.
$ sudo sh start_docker_image.sh
$ cd --
$ python 3DOD_thesis/Extended-Frustum-PointNet/eval_frustum_pointnet_img_val_seq.py
validation loss: 0.499888
validation TNet loss: 0.0294649
validation InstanceSeg loss: 0.281868
validation BboxNet loss: 0.0197038
validation BboxNet size loss: 0.00138443
validation BboxNet center loss: 0.0167136
validation BboxNet heading class loss: 4.17427e-05
validation BboxNet heading regr loss: 0.000156402
validation heading class accuracy: 0.998711
validation corner loss: 0.0168851
validation accuracy: 0.878334
validation precision: 0.848052
validation recall: 0.942269
validation f1: 0.8914
SSH into the paperspace server.
$ sudo sh start_docker_image.sh
$ cd --
$ python 3DOD_thesis/Extended-Frustum-PointNet/eval_frustum_pointnet_img_test.py
SSH into the paperspace server.
$ sudo sh start_docker_image.sh
$ cd --
$ python 3DOD_thesis/Extended-Frustum-PointNet/eval_frustum_pointnet_img_test_seq.py
SSH into the paperspace server.
$ sudo sh start_docker_image.sh
$ cd --
$ python 3DOD_thesis/Extended-Frustum-PointNet/eval_frustum_pointnet_img_val_2ddetections.py
SSH into the paperspace server.
$ sudo sh start_docker_image.sh
$ cd --
$ python 3DOD_thesis/Image-Only/eval_imgnet_val.py
val loss: 0.00425181
val size loss: 0.000454653
val keypoints loss: 0.000264362
val distance loss: 0.115353
val 3d size loss: 0.000439736
val 3d center loss: 0.0352361
val 3d r_y loss: 0.0983654
val 3d distance loss: 0.102937
SSH into the paperspace server.
$ sudo sh start_docker_image.sh
$ cd --
$ python 3DOD_thesis/Image-Only/eval_imgnet_val_seq.py
val loss: 0.00529856
val size loss: 0.000539969
val keypoints loss: 0.000351892
val distance loss: 0.123967
val 3d size loss: 0.000526106
val 3d center loss: 0.0398309
val 3d r_y loss: 0.000271052
val 3d distance loss: 0.11471
SSH into the paperspace server.
$ sudo sh start_docker_image.sh
$ cd --
$ python 3DOD_thesis/Image-Only/eval_imgnet_test.py
SSH into the paperspace server.
$ sudo sh start_docker_image.sh
$ cd --
$ python 3DOD_thesis/Image-Only/eval_imgnet_test_seq.py
SSH into the paperspace server.
$ sudo sh start_docker_image.sh
$ cd --
$ python 3DOD_thesis/Image-Only/eval_imgnet_val_2ddetections.py
For visualization of point clouds and 3Dbboxes in different ways, I have used Open3D on my Ubuntu 16.04 laptop.
On my laptop, the 3DOD_thesis folder is located at /home/fregu856/3DOD_thesis, which is reflected in the code.
Installing Open3D:
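For the 2018-era releases the pip package was named open3d-python (an assumption; see the Open3D docs for current instructions):
$ pip install open3d-python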
Basic Open3D usage:
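A minimal sketch, assuming the 2018-era open3d-python API (newer releases moved these functions into open3d.io and open3d.visualization); it loads one KITTI velodyne scan and opens the interactive viewer:

import numpy as np
import open3d

# Each KITTI velodyne scan is a binary file of float32 (x, y, z, reflectance) rows.
points = np.fromfile("data/kitti/object/training/velodyne/000000.bin",
                     dtype=np.float32).reshape(-1, 4)

# Wrap the xyz coordinates in an Open3D point cloud and show it.
pcd = open3d.PointCloud()
pcd.points = open3d.Vector3dVector(points[:, 0:3])
open3d.draw_geometries([pcd])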
Run a pretrained Frustum-PointNet, Extended-Frustum-PointNet or Image-Only model on KITTI val.
Place the created eval_dict_val.pkl in the correct location (see line 253 in visualize_eval_val.py).
$ cd 3DOD_thesis
$ python visualization/visualize_eval_val.py
This will visualize the ground truth and predicted 3Dbboxes, one example at a time; the type of visualization is specified in the code.
Run a pretrained Frustum-PointNet, Extended-Frustum-PointNet or Image-Only model on KITTI test.
Place the created eval_dict_test.pkl in the correct location (see line 279 in visualize_eval_test.py).
$ cd 3DOD_thesis
$ python visualization/visualize_eval_test.py
This will visualize the predicted 3Dbboxes, one example at a time; the type of visualization is specified in the code.
Run a pretrained Frustum-PointNet, Extended-Frustum-PointNet or Image-Only model on a sequence from the KITTI training set.
Place the created eval_dict_val_seq_{sequence number}.pkl in the correct location (see line 256 in visualize_eval_val_seq.py).
$ cd 3DOD_thesis
$ python visualization/visualize_eval_val_seq.py
This will create a visualization video; the type of visualization is specified in the code (see the commented-out sections). By default, it creates a video visualizing both the ground truth and predicted 3Dbboxes in both the point clouds and the images. Youtube video (yellow/red bboxes: predicted, pink/blue bboxes: ground truth).
Run a pretrained Frustum-PointNet, Extended-Frustum-PointNet or Image-Only model on sequences from KITTI test.
Place the created eval_dict_test_seq_{sequence number}.pkl files in the correct location (see line 282 in visualize_eval_test_seq.py).
$ cd 3DOD_thesis
$ python visualization/visualize_eval_test_seq.py
This will create one visualization video per sequence; the type of visualization is specified in the code (see the commented-out sections). By default, it creates videos visualizing the predicted 3Dbboxes in both the point clouds and the images, as well as the input 2Dbboxes in the images. See the Youtube video at the top of the page.
Very similar to visualize_eval_val.py, but also visualizes the results of the intermediate steps in the Frustum-PointNet/Extended-Frustum-PointNet architecture.
Run a pretrained Frustum-PointNet or Extended-Frustum-PointNet model on KITTI val and save intermediate results for visualization (uncomment the lines at line 146 in eval_frustum_pointnet_val.py, or line 148 in eval_frustum_pointnet_img_val.py).
Place the created eval_dict_val.pkl in the correct location (see line 278 in visualize_eval_val_extra.py).
$ cd 3DOD_thesis
$ python visualization/visualize_eval_val_extra.py
This will visualize the results of the intermediate steps, one vehicle at a time. When all the vehicles in the current example have been visualized, it continues with the next example.
Simple script for visualizing all the point clouds you have located at 3DOD_thesis/data/kitti/object/training/velodyne.
$ cd 3DOD_thesis
$ python visualization/visualize_lidar.py
This will visualize each of these point clouds, one at a time.
For computing evaluation metrics, I have used a slightly modified version of eval_kitti on my Ubuntu 16.04 laptop.
On my laptop, the 3DOD_thesis folder is located at /home/fregu856/3DOD_thesis, which is reflected in the code.
Run a pretrained Frustum-PointNet, Extended-Frustum-PointNet or Image-Only model on KITTI val, taking ground truth 2Dbboxes as input.
Place the created eval_dict_val.pkl in the correct location (see line 78 in create_txt_files_val.py).
$ cd 3DOD_thesis
$ python evaluation/create_txt_files_val.py
$ cd eval_kitti/build
$ ./evaluate_object val_Frustum-PointNet_1 val (where "val_Frustum-PointNet_1" is experiment_name, set on line 55 in create_txt_files_val.py)
$ cd -
$ cd eval_kitti
$ python parser.py val_Frustum-PointNet_1 val (where "val_Frustum-PointNet_1 val" should be the same as above)
car easy detection 0.842861
car moderate detection 0.811715454545
car hard detection 0.834955454545
----------------
car easy detection_ground 0.884758
car moderate detection_ground 0.815156363636
car hard detection_ground 0.837436363636
----------------
car easy detection_3d 0.707517272727
car moderate detection_3d 0.716832727273
car hard detection_3d 0.679985181818
Run a pretrained Frustum-PointNet, Extended-Frustum-PointNet or Image-Only model on KITTI val, taking 2D detections as input.
Place the created eval_dict_val_2ddetections.pkl in the correct location (see line 78 in create_txt_files_val_2ddetections.py).
$ cd 3DOD_thesis
$ python evaluation/create_txt_files_val_2ddetections.py
$ cd eval_kitti/build
$ ./evaluate_object val_2ddetections_Frustum-PointNet_1 val (where "val_2ddetections_Frustum-PointNet_1" is experiment_name, set on line 55 in create_txt_files_val_2ddetections.py)
$ cd -
$ cd eval_kitti
$ python parser.py val_2ddetections_Frustum-PointNet_1 val (where "val_2ddetections_Frustum-PointNet_1 val" should be the same as above)
car easy detection 0.890627727273
car moderate detection 0.844203727273
car hard detection 0.756144545455
----------------
car easy detection_ground 0.927797272727
car moderate detection_ground 0.861135272727
car hard detection_ground 0.774095636364
----------------
car easy detection_3d 0.848968818182
car moderate detection_3d 0.736132272727
car hard detection_3d 0.703275272727
Run a pretrained Frustum-PointNet, Extended-Frustum-PointNet or Image-Only model on KITTI test.
Place the created eval_dict_test.pkl in the correct location (see line 78 in create_txt_files_test.py).
$ cd 3DOD_thesis
$ python evaluation/create_txt_files_test.py
This will create all the .txt files (placed in 3DOD_thesis/eval_kitti/build/results/test_Frustum-PointNet_1/data) needed to submit to the KITTI 3D object detection leaderboard, see submission instructions.