Introduction

Autonomous driving has attracted tremendous attention in the last few years. Among its many enabling technologies, environmental perception is the most relevant to the vision community. We therefore host this challenge to understand the current status of computer vision algorithms in solving environmental perception problems for autonomous driving. For the challenge, we have prepared a number of large-scale datasets with fine annotation. Based on these datasets, we have defined a set of realistic problems, and we encourage new algorithms and pipelines to be invented for autonomous driving, rather than merely applied to it.

Prizes

A total of 10,000 USD in cash prizes will be awarded to top performers. Each of the four tasks carries 2,500 USD in prizes:

● 1st place - $1,200

● 2nd place - $800

● 3rd place - $500

Each winner must submit a paper describing their approach after the competition closes.

Datasets

We have collected and annotated two large-scale datasets.


The first is provided by Berkeley DeepDrive (BDD). The BDD database includes 100K unique HD 720p videos and is currently the most diverse driving video dataset. All videos come with GPS/IMU information for driving-behavior study, and each video is tagged with weather, scene type, and time of day. The BDD team also extracts key frames from each video and labels bounding boxes for all road objects, lane markings, drivable areas, and instance segmentation. More information can be found on the BDD database website.


The second dataset, ApolloScape, is provided by Baidu. ApolloScape contains survey-grade dense 3D points and registered multi-view RGB images at video rate; every pixel and every 3D point is semantically labelled. In addition, a precise pose is provided for each image.

Tasks
Task 1: Drivable Area Segmentation
Our first task is drivable area segmentation: the system must find the road area on which the vehicle is currently driving or could potentially drive.
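For orientation, drivable-area predictions of this kind are usually scored by intersection-over-union against the ground-truth mask. Below is a minimal sketch, assuming binary per-pixel masks; the challenge's exact evaluation protocol may differ.

```python
import numpy as np

def drivable_iou(pred: np.ndarray, gt: np.ndarray) -> float:
    """IoU between predicted and ground-truth binary drivable-area masks."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    inter = np.logical_and(pred, gt).sum()
    union = np.logical_or(pred, gt).sum()
    return 1.0 if union == 0 else float(inter) / float(union)
```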
Task 2: Road Object Detection
This task is to detect the objects most relevant to driving policy. Specifically, the following classes are to be detected with bounding boxes: vehicles, persons, and traffic signs/signals.
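Detection benchmarks of this kind typically score submissions by matching predicted boxes to ground truth at an IoU threshold. The sketch below shows the standard greedy matching at IoU 0.5, with hypothetical dict-based inputs; the challenge's exact metric may differ.

```python
def box_iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2) tuples."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0

def match_detections(preds, gts, iou_thr=0.5):
    """Greedily match score-sorted predictions to ground truth.

    preds: dicts with "box", "score", "class"; gts: dicts with "box", "class".
    Returns one True (true positive) / False (false positive) flag per prediction.
    """
    matched, flags = set(), []
    for p in sorted(preds, key=lambda p: -p["score"]):
        best, best_iou = None, iou_thr
        for i, g in enumerate(gts):
            if i in matched or g["class"] != p["class"]:
                continue
            iou = box_iou(p["box"], g["box"])
            if iou >= best_iou:
                best, best_iou = i, iou
        flags.append(best is not None)
        if best is not None:
            matched.add(best)
    return flags
```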
Task 3: Domain Adaptation of Semantic Segmentation
Combined, the BDD and ApolloScape datasets have the advantage of covering diverse weather, time-of-day, and geographic conditions. In this task, participants are given annotations under one condition and are required to semantically segment test images captured under different conditions. Two types of adaptation are evaluated: one across time/weather conditions, and one across geography, where training and testing data come from California (USA) and Beijing (China).
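Whatever the adaptation direction, segmentation quality on the target domain is conventionally summarized as mean IoU over classes. A minimal sketch, assuming integer label maps where ignored pixels carry an out-of-range label; the exact class list and ignore rules of the challenge are not specified here.

```python
import numpy as np

def mean_iou(preds, gts, num_classes):
    """Mean IoU over classes, accumulated across target-domain images.

    preds, gts: iterables of same-shape integer label maps. Ground-truth
    pixels outside [0, num_classes) are ignored; predictions are assumed
    to be valid class indices.
    """
    conf = np.zeros((num_classes, num_classes), dtype=np.int64)
    for p, g in zip(preds, gts):
        valid = (g >= 0) & (g < num_classes)
        conf += np.bincount(
            num_classes * g[valid].astype(np.int64) + p[valid],
            minlength=num_classes ** 2,
        ).reshape(num_classes, num_classes)
    inter = np.diag(conf)
    union = conf.sum(axis=0) + conf.sum(axis=1) - inter
    per_class = inter / np.maximum(union, 1)
    return float(per_class[union > 0].mean())
```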
Task 4: Instance-level Video Segmentation
In this task, participants are given a set of video sequences with fine per-pixel labeling; in particular, instances of moving objects such as vehicles and pedestrians are also labeled. The goal is to evaluate the state of the art in video-based scene parsing, a task that has not been evaluated previously due to the lack of fine labeling. Some very challenging environments have been captured: the average number of moving instances per frame can exceed 50, whereas at most 15 cars/pedestrians are labelled in the KITTI dataset.
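Submissions for this task were collected through Kaggle (see the leaderboard below). Kaggle segmentation competitions commonly serialize each instance mask with run-length encoding; the following is a hypothetical encoder assuming the usual column-major, 1-indexed convention, not necessarily the exact format used here.

```python
import numpy as np

def rle_encode(mask: np.ndarray) -> str:
    """Run-length encode a binary mask: column-major scan, 1-indexed
    "start length" pairs, as in many Kaggle segmentation competitions."""
    pixels = np.concatenate([[0], mask.flatten(order="F").astype(np.uint8), [0]])
    changes = np.flatnonzero(pixels[1:] != pixels[:-1]) + 1  # 1-indexed run edges
    starts, lengths = changes[::2], changes[1::2] - changes[::2]
    return " ".join(f"{s} {l}" for s, l in zip(starts, lengths))
```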
Leaderboard

# | Team Name | Entries | Public | Private
1 | Megvii | 202 | 0.282817 | 0.339863
2 | Super_Camera | 45 | 0.268582 | 0.302198
3 | Autopilot | 141 | 0.282817 | 0.339863
4 | SZU_N606 | 202 | 0.282817 | 0.339863
5 | ChiZhouMeiziHao | 30 | 0.213272 | 0.244754
6 | xiaoming | 11 | 0.166215 | 0.218652
7 | P_Beta | 35 | 0.170469 | 0.215466
8 | xiteng | 18 | 0.174742 | 0.204058
9 | iRedRum | 35 | 0.155525 | 0.203276
10 | zwhh | 25 | 0.199501 | 0.203169
11 | William Wang | 11 | 0.199501 | 0.203169
12 | Mhttx | 42 | 0.19723 | 0.19818
13 | G934 | 3 | 0.163351 | 0.198003
14 | InnerPeace | 13 | 0.188305 | 0.197099
15 | dishen | 28 | 0.1984 | 0.193481
16 | tkuanlun350 | 23 | 0.148988 | 0.189139
17 | Azat Akhtyamov | 73 | 0.165229 | 0.183628
18 | SeetaTech | 42 | 0.174234 | 0.181225
19 | svendroste | 43 | 0.163178 | 0.180645
20 | AlbertoCastaño | 13 | 0.133594 | 0.177348
21 | Holy Fit | 25 | 0.141591 | 0.171056
22 | kekedan | 31 | 0.114786 | 0.156473
23 | xiaoyucoco1314 | 22 | 0.115044 | 0.154638
24 | lbin | 58 | 0.105048 | 0.151376
25 | mingye | 35 | 0.107725 | 0.150658
26 | See-- | 21 | 0.105065 | 0.140776
27 | Undecided | 38 | 0.0873609 | 0.131462
28 | Eyon | 22 | 0.0998715 | 0.129144
29 | Insight | 41 | 0.113201 | 0.118306
30 | w_________g | 16 | 0.074446 | 0.110963
31 | GodEye | 16 | 0.074446 | 0.110954
32 | bnrc | 29 | 0.074446 | 0.110945
33 | makesense | 7 | 0.0678023 | 0.108551
34 | Mickey | 2 | 0.0673512 | 0.096149
35 | CS231n_Dapeng_Yulun | 20 | 0.0516672 | 0.0907401
36 | XiaokangWang | 12 | 0.0654022 | 0.0894723
37 | didimer | 2 | 0.0510106 | 0.0875467
38 | doudou | 4 | 0.0617735 | 0.0871252
39 | river4321 | 18 | 0.0574753 | 0.0869555
40 | RST | 15 | 0.0478252 | 0.0866909
41 | Amber Wang | 8 | 0.0541869 | 0.0858122
42 | Stuttgart_wad | 22 | 0.0593356 | 0.0839665
43 | Big Ideas | 8 | 0.0427501 | 0.0799891
44 | Amit Kumar Jaiswal | 3 | 0.0454816 | 0.0786631
45 | natasa | 69 | 0.0526353 | 0.0777436
46 | Eugene Khvedchenya | 11 | 0.0701647 | 0.0771423
47 | PERO | 8 | 0.0507787 | 0.0757901
48 | kelticss | 6 | 0.0372512 | 0.0705438
49 | Stefanie04736 | 29 | 0.0448093 | 0.0674261
50 | bdhurl | 12 | 0.0551469 | 0.062406
51 | Alex Parinov | 6 | 0.0261443 | 0.0610415
52 | Wei Dong | 13 | 0.0381429 | 0.0568739
53 | Forever Young | 93 | 0.0637397 | 0.0531487
54 | iou | 2 | 0.0254168 | 0.0520568
55 | jackkwok | 1 | 0.0235808 | 0.0496242
56 | Zephyr-D | 1 | 0.0254154 | 0.0471649
57 | Marouane Sefiani | 25 | 0.0253831 | 0.0470956
58 | Данил Ахметов | 4 | 0.0254201 | 0.0470883
59 | HEY | 6 | 0.0360775 | 0.0467348
60 | Panthers | 6 | 0.023792 | 0.0456473
61 | ILMGroup | 70 | 0.0114947 | 0.0362712
62 | ILM | 4 | 0.0114947 | 0.0362699
63 | Konstantin Lopuhin | 1 | 0.0258792 | 0.0355246
64 | GomathyShankar | 8 | 0.0161874 | 0.0312356
65 | ZFTurbo | 4 | 0.0201789 | 0.0273948
66 | Vahid Kh | 17 | 0.0223618 | 0.0262878
67 | MediaLab | 2 | 0.0168185 | 0.0156839
68 | King-Fish | 28 | 0.00473903 | 0.00820239
69 | Johan Ahlqvist | 13 | 0.00445622 | 0.00650419
70 | Amor | 17 | 0.00466249 | 0.00560301
71 | tzt | 8 | 0.00726843 | 0.00546945
72 | yzo | 18 | 0.00803572 | 0.00245352
73 | Christian M | 18 | 0.0107221 | 0.00168529
74 | Minesh A. Jethva | 1 | 0 | 0.00115897
75 | abdelatif | 3 | 0 | 0.00112655
76 | soneo | 3 | 0 | 0.00108781
77 | Chris Jurkowski | 1 | 0 | 0.000894704
78 | Niranjan Singh | 1 | 0 | 0.000894704
79 | Victor Zhao | 1 | 0 | 0.000894704
80 | Swastik Biswas | 4 | 0 | 0.000894704
81 | Zhe Wu | 4 | 0 | 0.000894704
82 | Niranjan Nakkala | 3 | 0 | 0.000893572
83 | Nandanam | 1 | 0 | 0.000818143
84 | Waseem | 74 | 0.00515904 | 0.000248456
85 | Wei Zheng | 3 | 0.000235849 | 0.000213373
86 | niuwagege | 41 | 0 | 0.0000452846
87 | aashish malik | 7 | 0 | 0.0000452846
88 | Dan | 1 | 0 | 0.0000452846
89 | Marek | 5 | 0 | 0.0000452846
90 | Manny Bhidya | 2 | 0 | 0
91 | MykolaSharhan | 1 | 0 | 0
92 | TetyanaYatsenko | 3 | 0 | 0
93 | Mathurin Aché | 1 | 0 | 0
94 | the1owl | 3 | 0 | 0
95 | Arunkumar Ramanan | 1 | 0 | 0
96 | JohnM | 2 | 0 | 0
97 | Biao | 1 | 0 | 0
98 | 林湧森 (Dyson Lin) | 1 | 0 | 0
99 | zhuxin05 | 1 | 0 | 0
100 | Schmendrick | 1 | 0 | 0
101 | MachineE! | 1 | 0 | 0
102 | Patrick DeKelly | 1 | 0 | 0
103 | Vladimir Osin | 1 | 0 | 0
104 | NaomiFridman | 17 | 0 | 0
105 | Justin | 1 | 0 | 0
106 | ceci n'est pas une image | 9 | 0 | 0
107 | Kevin Mader | 11 | 0 | 0
108 | Bojan Tunguz | 1 | 0 | 0
109 | zmingen | 1 | 0 | 0
110 | WeijieYu | 10 | 0 | 0
111 | lseiyjg | 1 | 0 | 0
112 | Shih-Chang Chen | 1 | 0 | 0
113 | Luis Alberto Rosero | 1 | 0 | 0
114 | liguangchuang | 1 | 0 | 0
115 | Telcontar120 | 1 | 0 | 0
116 | joongjum | 1 | 0 | 0
117 | Kiwi | 1 | 0 | 0
118 | jiancao | 2 | 0 | 0
119 | eric | 1 | 0 | 0
120 | sabari nathan | 2 | 0 | 0
121 | Alexander Teplyuk | 1 | 0 | 0
122 | gunanR | 1 | 0 | 0
123 | lang | 3 | 0 | 0
124 | Sujeeth S | 1 | 0 | 0
125 | NewWorld | 1 | 0 | 0
126 | Cognitio | 14 | 0 | 0
127 | specialist | 1 | 0 | 0
128 | Data Enthu | 1 | 0 | 0
129 | Bharat Garg | 10 | 0 | 0
130 | Olga Ivanova | 4 | 0 | 0
131 | Vidhika Lonare | 1 | 0 | 0
132 | MD. JAMIL-UR RAHMAN | 1 | 0 | 0
133 | Hanlun | 1 | 0 | 0
134 | Zubin Pahuja | 1 | 0 | 0
135 | teetu | 9 | 0 | 0
136 | 2wsx2wsx2wsx | 1 | 0 | 0
137 | metalearn | 1 | 0 | 0
138 | Alexander Popov | 1 | 0 | 0
139 | ssb | 1 | 0 | 0
140 | victorsdnu | 5 | 0 | 0
141 | musaprg | 1 | 0 | 0

Teams without a leaderboard rank:

Team Name | Entries
brandonma | 0
Competition Wizard Sample Submission Playground | 0
Data Craze | 1
dession | 0
DIDI-AI | 0
driveME | 7
Eugenio | 0
kepiscea | 1
Kuntal Sardar | 1
NV-ADLR | 0
Qichuan Geng | 3
Rashmi Margani | 0
royal | 1
shansiliu | 4
Sogo_MM | 17
zdtu | 0
zgljl2012 | 1
ZJU-GIVE | 0
Winners

Task 1: Drivable Area Segmentation.

Contact: Xingang Pan
Team Name: IBN_PSA/P
Score: 86.18
Organization: CUHK, SenseTime, Tencent
Method Description: Our method is based on the Instance-Batch Normalization Network (IBN-Net) and the Point-wise Spatial Attention Network (PSANet).

Contact: Peter Kontschieder
Team Name: Mapillary Research
Score: 86.04
Organization: Mapillary Research
Method Description: In-Place Activated BatchNorm for Memory-Optimized Training of DNNs. http://research.mapillary.com/publication/cvpr18a/

Contact: Qiaoyi Li
Team Name: DiDi AI Labs
Score: 84.01
Organization: DiDi AI Labs
Method Description: Based on DeepLabv3+.
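The winning entry builds on IBN-Net, whose core idea is to instance-normalize part of a layer's channels (for appearance invariance) while batch-normalizing the rest (to retain content discrimination). Below is a PyTorch sketch of that layer, following the published IBN-Net design rather than the team's exact implementation.

```python
import torch
import torch.nn as nn

class IBN(nn.Module):
    """IBN layer: InstanceNorm on the first half of the channels,
    BatchNorm on the second half (as in the IBN-Net paper)."""

    def __init__(self, planes: int):
        super().__init__()
        self.half = planes // 2
        self.IN = nn.InstanceNorm2d(self.half, affine=True)
        self.BN = nn.BatchNorm2d(planes - self.half)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        a, b = torch.split(x, [self.half, x.size(1) - self.half], dim=1)
        return torch.cat([self.IN(a.contiguous()), self.BN(b)], dim=1)
```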

Task 2: Road Object Detection.

Contact: Tao Wei
Team Name: Sogou_MM
Score: 33.1
Organization: Sogou
Method Description: In this challenge we chose Faster R-CNN as our main framework, since two-stage frameworks generally achieve better precision. We first analysed the dataset to determine the range of object scales and aspect ratios, and then tuned the pipeline as follows. We use ResNet-101 as the backbone network, as it is deep enough yet can still be trained in a short time. We replace the RoIPooling layer with an RoIAlign layer, which benefits localization precision. We apply multi-scale training and multi-scale testing, which make the model more robust to scale variance. Finally, we use a multi-model ensemble: different models perform differently on the same data, so fusing the results of several models gives a large improvement. There is still much we have not tried, so we believe the results can be improved further.

Contact: Qijie Zhao
Team Name: VDIG0
Score: 29.69
Organization: Institute of Computer Science & Technology, Peking University (VDIG)
Method Description: CFENet: Exploiting a Real Effective Single Shot Object Detector with Comprehensive Feature Enhancement module.

Contact: Sebastian Bayer
Team Name: seb
Score: 20.66
Organization: Karlsruhe Institute of Technology
Method Description: A Mask R-CNN network with a ResNet-101 backbone. The backbone was initialized with ImageNet-pretrained ResNet-50 weights. The backbone uses batch norm and everything else group norm. To increase scores for rarer classes, images containing trains were sampled 20 times as often during training, and images containing motorbikes, bicycles, or riders were sampled twice as often as other images. The model was trained on 4 GPUs with 2 images per GPU. All weights except for the heads were frozen for the first 1.5 epochs of training, followed by 0.5 epochs of warmup with the learning rate reduced by a factor of 100. After this warmup phase, training resumed normally, with the learning rate decreased by a factor of 10 shortly before the end of training.
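Two ingredients recur in these descriptions: multi-scale testing and fusing the outputs of several models. A numpy sketch of that fusion step follows, with a hypothetical detect_fn standing in for any of the ensembled detectors; real pipelines vary in the exact fusion rule (plain NMS here).

```python
import numpy as np

def nms(boxes, scores, iou_thr=0.5):
    """Plain non-maximum suppression; boxes are (x1, y1, x2, y2) rows."""
    order = np.argsort(-scores)
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(int(i))
        rest = order[1:]
        xx1 = np.maximum(boxes[i, 0], boxes[rest, 0])
        yy1 = np.maximum(boxes[i, 1], boxes[rest, 1])
        xx2 = np.minimum(boxes[i, 2], boxes[rest, 2])
        yy2 = np.minimum(boxes[i, 3], boxes[rest, 3])
        inter = np.clip(xx2 - xx1, 0, None) * np.clip(yy2 - yy1, 0, None)
        area_i = (boxes[i, 2] - boxes[i, 0]) * (boxes[i, 3] - boxes[i, 1])
        areas = (boxes[rest, 2] - boxes[rest, 0]) * (boxes[rest, 3] - boxes[rest, 1])
        iou = inter / (area_i + areas - inter)
        order = rest[iou < iou_thr]
    return keep

def multiscale_detect(detect_fn, image, scales=(0.5, 1.0, 1.5)):
    """Run a detector at several image scales, map boxes back to the
    original frame, and fuse everything with NMS."""
    all_boxes, all_scores = [], []
    for s in scales:
        boxes, scores = detect_fn(image, s)  # boxes in the scaled frame
        all_boxes.append(np.asarray(boxes, dtype=float) / s)
        all_scores.append(np.asarray(scores, dtype=float))
    boxes = np.concatenate(all_boxes)
    scores = np.concatenate(all_scores)
    keep = nms(boxes, scores)
    return boxes[keep], scores[keep]
```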

Task 3: Domain Adaptation of Semantic Segmentation.

Contact: Aysegul Dundar
Team Name: NvDA
Score: 62.4
Organization: Nvidia
Method Description: DeepLabV3 as the base network; MUNIT for translating input images from the BDD domain to the Apollo domain to train DeepLabV3 for the Apollo domain; class-balanced pseudo-labeling for iteratively estimating pixel labels in the Apollo domain for network fine-tuning.

Contact: Zhengping Che
Team Name: DiDi AI Labs
Score: 57.67
Organization: DiDi AI Labs
Method Description: DeepLabv3+, adversarial training, multi-scale training, and ensembling.

Contact: Yang Zou
Team Name: CMU-GM
Score: 54.59
Organization: Carnegie Mellon University and General Motors
Method Description: We use ResNet-38 [1], pretrained on ImageNet, as our base model. We first train it for segmentation on the labeled images in bdd100k/seg/images/train. Thereafter we mix unlabeled Apollo training images (https://www.kaggle.com/c/cvpr-2018-autonomous-driving/data) with bdd100k/seg/images/train and train the model in a semi-supervised way, using class-balanced self-training with spatial priors to improve it. More details will be on our poster at WAD.
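Both the NvDA and CMU-GM entries rely on class-balanced pseudo-labeling (self-training): confident predictions on unlabeled target-domain images become training labels, with per-class thresholds so that frequent, easy classes do not crowd out rare ones. The per-image sketch below only illustrates the balancing idea; the published CBST method chooses thresholds over the whole target set and adds spatial priors.

```python
import numpy as np

def class_balanced_pseudo_labels(probs: np.ndarray, portion: float = 0.5,
                                 num_classes: int = 19) -> np.ndarray:
    """Turn softmax output on one target-domain image into pseudo-labels.

    probs: (C, H, W) class probabilities. For each predicted class, only
    the most confident `portion` of its pixels are kept, so frequent easy
    classes cannot dominate. Unselected pixels get label -1 (ignored).
    """
    conf = probs.max(axis=0)
    labels = probs.argmax(axis=0)
    pseudo = np.full(labels.shape, -1, dtype=np.int64)
    for c in range(num_classes):
        m = labels == c
        if not m.any():
            continue
        thr = np.quantile(conf[m], 1.0 - portion)  # per-class threshold
        pseudo[m & (conf >= thr)] = c
    return pseudo
```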

Task 4: Instance-level Video Segmentation.

Contact | Team Name | Score
zengarden | Megvii | 0.339862734079361
Tommy.Zhuang | Megvii | 0.339862734079361
sx | Super_Camera | 0.302198469638824
NV-ADLR | NV-ADLR | 0.267991662025452