Autonomous driving datasets: Cityscapes and KITTI

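KITTI and Cityscapes are two major benchmark datasets for evaluating vision algorithms in autonomous driving. KITTI covers a variety of driving scenarios and focuses on object detection and tracking; Cityscapes concentrates on image segmentation of urban scenes and provides a rich set of annotated classes.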


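The KITTI dataset was created jointly by the Karlsruhe Institute of Technology in Germany and the Toyota Technological Institute at Chicago. It is one of the largest benchmark datasets for evaluating computer vision algorithms in autonomous driving scenarios. It is used to evaluate techniques such as object detection (motor vehicles, non-motorized vehicles, pedestrians and so on), object tracking and road segmentation in an on-vehicle setting.
KITTI contains real image data captured in urban, rural and highway scenes, with up to 15 cars and 30 pedestrians per image and varying degrees of occlusion. Object detection comprises three tracks (car, pedestrian and cyclist), object tracking comprises two tracks (car and pedestrian), and road segmentation comprises four tracks: urban unmarked, urban marked, urban multiple marked, and urban road, which is the average over the three preceding scenes.
Overall, the raw data is grouped into the categories 'Road', 'City', 'Residential', 'Campus' and 'Person'. For 3D object detection, the labels are subdivided into Car, Van, Truck, Pedestrian, Person_sitting, Cyclist, Tram and Misc.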

The label files contain the following information, which can be read and
written using the matlab tools (readLabels.m, writeLabels.m) provided within
this devkit. All values (numerical or strings) are separated via spaces,
each row corresponds to one object. The 15 columns represent:

#Values    Name      Description
----------------------------------------------------------------------------
  1    type         Describes the type of object: 'Car', 'Van', 'Truck',
                    'Pedestrian', 'Person_sitting', 'Cyclist', 'Tram',
                    'Misc' or 'DontCare'
  1    truncated    Float from 0 (non-truncated) to 1 (truncated), where
                    truncated refers to the object leaving image boundaries
  1    occluded     Integer (0,1,2,3) indicating occlusion state:
                    0 = fully visible, 1 = partly occluded
                    2 = largely occluded, 3 = unknown
  1    alpha        Observation angle of object, ranging [-pi..pi]
  4    bbox         2D bounding box of object in the image (0-based index):
                    contains left, top, right, bottom pixel coordinates
  3    dimensions   3D object dimensions: height, width, length (in meters)
  3    location     3D object location x,y,z in camera coordinates (in meters)
  1    rotation_y   Rotation ry around Y-axis in camera coordinates [-pi..pi]
  1    score        Only for results: Float, indicating confidence in
                    detection, needed for p/r curves, higher is better.
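
The column layout above can be turned into a small parser. The following is a minimal Python sketch, not part of the KITTI devkit (which ships MATLAB tools): the names `KittiObject`, `parse_kitti_label_line` and `parse_kitti_labels` are hypothetical, and the sample row in the usage lines is made up for illustration. Adding up the `#Values` column gives 15 fields for a ground-truth row and 16 when the `score` field of a result file is present, so the sketch accepts both.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class KittiObject:
    type: str
    truncated: float
    occluded: int
    alpha: float
    bbox: Tuple[float, ...]        # left, top, right, bottom (pixels)
    dimensions: Tuple[float, ...]  # height, width, length (meters)
    location: Tuple[float, ...]    # x, y, z in camera coordinates (meters)
    rotation_y: float
    score: Optional[float] = None  # only present in result files


def parse_kitti_label_line(line: str) -> KittiObject:
    """Parse one space-separated label row into a KittiObject."""
    fields = line.split()
    if len(fields) not in (15, 16):
        raise ValueError(f"expected 15 or 16 fields, got {len(fields)}")
    values = [float(v) for v in fields[1:]]  # everything after the type string
    return KittiObject(
        type=fields[0],
        truncated=values[0],
        occluded=int(values[1]),
        alpha=values[2],
        bbox=tuple(values[3:7]),
        dimensions=tuple(values[7:10]),
        location=tuple(values[10:13]),
        rotation_y=values[13],
        score=values[14] if len(values) == 15 else None,
    )


def parse_kitti_labels(text: str) -> List[KittiObject]:
    """Parse the full contents of a label file, skipping blank lines."""
    return [parse_kitti_label_line(row) for row in text.splitlines() if row.strip()]


# Usage with a made-up row (not taken from the dataset):
sample = "Car 0.00 0 -1.57 100.0 120.0 200.0 220.0 1.50 1.60 3.90 1.00 1.70 20.00 -1.50"
obj = parse_kitti_label_line(sample)
print(obj.type, obj.bbox, obj.location)
```

Keeping `score` optional lets the same function read both ground-truth files and detection results.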

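The Cityscapes dataset, backed primarily by Mercedes-Benz, is an image segmentation dataset for autonomous driving environments. It is used to evaluate how well vision algorithms understand the semantics of urban scenes. Cityscapes contains street scenes from 50 cities with varying scenes, backgrounds and seasons, and provides 5000 finely annotated images, 20000 coarsely annotated images and 30 annotated classes. Algorithm performance is scored with the PASCAL VOC style intersection-over-union (IoU) metric. Cityscapes defines two evaluation settings, fine and coarse: the former provides the 5000 finely annotated images, and the latter provides those 5000 finely annotated images plus the 20000 coarsely annotated images.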
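
For a single class, the PASCAL VOC style IoU is TP / (TP + FP + FN), where TP, FP and FN count true positive, false positive and false negative pixels. The following is a minimal NumPy sketch of that formula, not the official Cityscapes evaluation script. The function names are hypothetical, and it assumes that predictions already lie in `[0, num_classes)` and that ground-truth pixels equal to `ignore_label` (255 here) are excluded.

```python
import numpy as np


def confusion_matrix(gt, pred, num_classes, ignore_label=255):
    """Rows are ground-truth classes, columns are predicted classes."""
    gt = np.asarray(gt).ravel()
    pred = np.asarray(pred).ravel()
    keep = gt != ignore_label  # drop pixels that must not be evaluated
    # Encode each (gt, pred) pair as one integer and count occurrences.
    idx = gt[keep].astype(np.int64) * num_classes + pred[keep].astype(np.int64)
    counts = np.bincount(idx, minlength=num_classes ** 2)
    return counts.reshape(num_classes, num_classes)


def per_class_iou(conf):
    """IoU = TP / (TP + FP + FN) per class; NaN for classes that never occur."""
    tp = np.diag(conf).astype(np.float64)
    fp = conf.sum(axis=0) - tp  # predicted as the class, but labeled otherwise
    fn = conf.sum(axis=1) - tp  # labeled as the class, but predicted otherwise
    denom = tp + fp + fn
    iou = np.full(conf.shape[0], np.nan)
    valid = denom > 0
    iou[valid] = tp[valid] / denom[valid]
    return iou


# Toy example with 3 classes and one ignored pixel.
gt = np.array([[0, 0, 1], [1, 2, 255]])
pred = np.array([[0, 1, 1], [1, 2, 0]])
iou = per_class_iou(confusion_matrix(gt, pred, num_classes=3))
print(iou, np.nanmean(iou))
```

Accumulating one confusion matrix over all images and computing IoU once at the end avoids averaging per-image scores, which would weight small images and rare classes differently.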

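The Cityscapes label definition lists more than 30 entries. The table below has 35 rows (IDs 0 to 33 plus license plate): the 30 classes quoted above plus helper labels such as unlabeled, ego vehicle, rectification border and out of roi.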

List of cityscapes labels:

# Please adapt the train IDs as appropriate for your approach.
# Note that you might want to ignore labels with ID 255 during training.
# Further note that the current train IDs are only a suggestion. You can use whatever you like.
# Make sure to provide your results using the original IDs and not the training IDs.
# Note that many IDs are ignored in evaluation and thus you never need to predict these!
 

                     name |  id | trainId |       category | categoryId | hasInstances | ignoreInEval|        color
    --------------------------------------------------------------------------------------------------
                unlabeled |   0 |     255 |           void |          0 |            0 |            1 |         (0, 0, 0)
              ego vehicle |   1 |     255 |           void |          0 |            0 |            1 |         (0, 0, 0)
     rectification border |   2 |     255 |           void |          0 |            0 |            1 |         (0, 0, 0)
               out of roi |   3 |     255 |           void |          0 |            0 |            1 |         (0, 0, 0)
                   static |   4 |     255 |           void |          0 |            0 |            1 |         (0, 0, 0)
                  dynamic |   5 |     255 |           void |          0 |            0 |            1 |      (111, 74, 0)
                   ground |   6 |     255 |           void |          0 |            0 |            1 |       (81, 0, 81)
                     road |   7 |       0 |           flat |          1 |            0 |            0 |    (128, 64, 128)
                 sidewalk |   8 |       1 |           flat |          1 |            0 |            0 |    (244, 35, 232)
                  parking |   9 |     255 |           flat |          1 |            0 |            1 |   (250, 170, 160)
               rail track |  10 |     255 |           flat |          1 |            0 |            1 |   (230, 150, 140)
                 building |  11 |       2 |   construction |          2 |            0 |            0 |      (70, 70, 70)
                     wall |  12 |       3 |   construction |          2 |            0 |            0 |   (102, 102, 156)
                    fence |  13 |       4 |   construction |          2 |            0 |            0 |   (190, 153, 153)
               guard rail |  14 |     255 |   construction |          2 |            0 |            1 |   (180, 165, 180)
                   bridge |  15 |     255 |   construction |          2 |            0 |            1 |   (150, 100, 100)
                   tunnel |  16 |     255 |   construction |          2 |            0 |            1 |    (150, 120, 90)
                     pole |  17 |       5 |         object |          3 |            0 |            0 |   (153, 153, 153)
                polegroup |  18 |     255 |         object |          3 |            0 |            1 |   (153, 153, 153)
            traffic light |  19 |       6 |         object |          3 |            0 |            0 |    (250, 170, 30)
             traffic sign |  20 |       7 |         object |          3 |            0 |            0 |     (220, 220, 0)
               vegetation |  21 |       8 |         nature |          4 |            0 |            0 |    (107, 142, 35)
                  terrain |  22 |       9 |         nature |          4 |            0 |            0 |   (152, 251, 152)
                      sky |  23 |      10 |            sky |          5 |            0 |            0 |    (70, 130, 180)
                   person |  24 |      11 |          human |          6 |            1 |            0 |     (220, 20, 60)
                    rider |  25 |      12 |          human |          6 |            1 |            0 |       (255, 0, 0)
                      car |  26 |      13 |        vehicle |          7 |            1 |            0 |       (0, 0, 142)
                    truck |  27 |      14 |        vehicle |          7 |            1 |            0 |        (0, 0, 70)
                      bus |  28 |      15 |        vehicle |          7 |            1 |            0 |      (0, 60, 100)
                  caravan |  29 |     255 |        vehicle |          7 |            1 |            1 |        (0, 0, 90)
                  trailer |  30 |     255 |        vehicle |          7 |            1 |            1 |       (0, 0, 110)
                    train |  31 |      16 |        vehicle |          7 |            1 |            0 |      (0, 80, 100)
               motorcycle |  32 |      17 |        vehicle |          7 |            1 |            0 |       (0, 0, 230)
                  bicycle |  33 |      18 |        vehicle |          7 |            1 |            0 |     (119, 11, 32)
            license plate |  -1 |      -1 |        vehicle |          7 |            0 |            1 |       (0, 0, 142)
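
The `id` to `trainId` mapping in the table can be applied to a whole label image with a lookup table. The following is a minimal NumPy sketch under two assumptions: the label image holds the original IDs as unsigned 8-bit values (so the `-1` license plate ID never appears), and every ID without a train ID maps to 255. The names `ID_TO_TRAIN_ID`, `build_lut` and `ids_to_train_ids` are hypothetical and are not taken from the official scripts.

```python
import numpy as np

# id -> trainId pairs copied from the table above (rows with trainId != 255).
ID_TO_TRAIN_ID = {
    7: 0, 8: 1, 11: 2, 12: 3, 13: 4, 17: 5, 19: 6, 20: 7, 21: 8, 22: 9,
    23: 10, 24: 11, 25: 12, 26: 13, 27: 14, 28: 15, 31: 16, 32: 17, 33: 18,
}


def build_lut(ignore_label=255):
    """Build a 256-entry lookup table; unmapped IDs fall back to ignore_label."""
    lut = np.full(256, ignore_label, dtype=np.uint8)
    for label_id, train_id in ID_TO_TRAIN_ID.items():
        lut[label_id] = train_id
    return lut


def ids_to_train_ids(label_ids):
    """Convert an array of original IDs (uint8) to train IDs."""
    label_ids = np.asarray(label_ids, dtype=np.uint8)
    return build_lut()[label_ids]  # fancy indexing applies the table per pixel


# Toy 2x3 "image": unlabeled, road, sidewalk / car, bicycle, caravan.
ids = np.array([[0, 7, 8], [26, 33, 29]], dtype=np.uint8)
print(ids_to_train_ids(ids))
```

As the comments above the table note, results should be submitted with the original IDs, so a prediction made in train IDs needs the inverse mapping before upload.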
            
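### Cityscapes classes and their RGB values

Cityscapes is widely used for semantic segmentation and contains many object classes found in urban scenes. Each class has a color code, expressed as an RGB value, which makes visual inspection easier. The standard RGB values for some common Cityscapes classes are[^2]:

| Class | RGB value |
|------------------|----------------|
| road | (128, 64, 128) |
| sidewalk | (244, 35, 232) |
| building | (70, 70, 70) |
| vegetation | (107, 142, 35) |
| sky | (70, 130, 180) |
| person | (220, 20, 60) |
| car | (0, 0, 142) |

The table lists only some of the common classes and their standard RGB codes. For the complete list, refer to the official documentation or inspect the annotation images under the `gtFine` folder to see how the colors are stored as RGBA[^1].

When a custom dataset is needed in practice, it can be annotated with a tool such as labelme and converted to the Cityscapes format for model training[^3]. To extend this further or to change the color mapping of the class labels, modify the corresponding configuration file in the preprocessing stage.

```python
import numpy as np
from PIL import Image


def load_cityscapes_label_colors():
    """Return a mapping from some Cityscapes class names to RGB values."""
    colors = {
        'road': (128, 64, 128),
        'sidewalk': (244, 35, 232),
        'building': (70, 70, 70),
        'vegetation': (107, 142, 35),
        'sky': (70, 130, 180),
        'person': (220, 20, 60),
        'car': (0, 0, 142),
    }
    return colors


# Example: read a color annotation PNG (the file must exist locally).
image_path = "aachen_000000_000019_gtFine_color.png"
img_array = np.array(Image.open(image_path))
# The shape is expected to be (H, W, 4) when the PNG carries an alpha channel.
print(f"Image Shape: {img_array.shape}")
```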
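
Going the other way, a map of predicted train IDs can be colored for inspection with a palette. The following is a minimal NumPy sketch that assumes the colors of the 19 train IDs from the label table earlier in this article and paints every other value, including the ignore value 255, black. The names `TRAIN_ID_COLORS` and `colorize` are hypothetical.

```python
import numpy as np

# Colors for train IDs 0..18, copied from the Cityscapes label table.
TRAIN_ID_COLORS = np.array([
    (128, 64, 128),   # 0 road
    (244, 35, 232),   # 1 sidewalk
    (70, 70, 70),     # 2 building
    (102, 102, 156),  # 3 wall
    (190, 153, 153),  # 4 fence
    (153, 153, 153),  # 5 pole
    (250, 170, 30),   # 6 traffic light
    (220, 220, 0),    # 7 traffic sign
    (107, 142, 35),   # 8 vegetation
    (152, 251, 152),  # 9 terrain
    (70, 130, 180),   # 10 sky
    (220, 20, 60),    # 11 person
    (255, 0, 0),      # 12 rider
    (0, 0, 142),      # 13 car
    (0, 0, 70),       # 14 truck
    (0, 60, 100),     # 15 bus
    (0, 80, 100),     # 16 train
    (0, 0, 230),      # 17 motorcycle
    (119, 11, 32),    # 18 bicycle
], dtype=np.uint8)


def colorize(train_ids):
    """Map an (H, W) uint8 array of train IDs to an (H, W, 3) RGB image."""
    train_ids = np.asarray(train_ids, dtype=np.uint8)
    palette = np.zeros((256, 3), dtype=np.uint8)  # default: black
    palette[:len(TRAIN_ID_COLORS)] = TRAIN_ID_COLORS
    return palette[train_ids]


# Toy 2x2 map: road, car / ignored, sky.
rgb = colorize(np.array([[0, 13], [255, 10]], dtype=np.uint8))
print(rgb.shape)
```

The resulting array can be passed to `PIL.Image.fromarray` to save it as an image.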