# Rosbag2 Anonymizer

## Overview

Autoware provides a tool (`autoware_rosbag2_anonymizer`) to anonymize ROS 2 bag files. This tool is useful when you want to share your data with the Autoware community while keeping it privacy-compliant.

With this tool you can blur any object (faces, license plates, etc.) in your bag files and obtain a new bag file with the blurred images.
## Installation

### Clone the repository

```bash
git clone https://github.com/autowarefoundation/autoware_rosbag2_anonymizer.git
cd autoware_rosbag2_anonymizer
```
### Download the pre-trained models

```bash
wget https://dl.fbaipublicfiles.com/segment_anything/sam_vit_h_4b8939.pth
wget https://huggingface.co/ShilongLiu/GroundingDINO/resolve/main/GroundingDINO_SwinB.cfg.py
wget https://huggingface.co/ShilongLiu/GroundingDINO/resolve/main/groundingdino_swinb_cogcoor.pth
wget https://github.com/autowarefoundation/autoware_rosbag2_anonymizer/releases/download/v0.0.0/yolov8x_anonymizer.pt
wget https://github.com/autowarefoundation/autoware_rosbag2_anonymizer/releases/download/v0.0.0/yolo_config.yaml
```
### Install the ROS 2 mcap dependencies if you will use mcap files

**Warning:** Make sure you have installed ROS 2 on your system.

```bash
sudo apt install ros-humble-rosbag2-storage-mcap
```
### Install the autoware_rosbag2_anonymizer tool

Before installing the tool, you should update the pip package manager:

```bash
python3 -m pip install pip -U
```

Then install the tool with the following command:

```bash
python3 -m pip install .
```
## Configuration
Define the prompts in the `validation.json` file. The tool uses these prompts to detect objects. You can add prompts as dictionaries under the `prompts` key. Each dictionary should have two keys:

- `prompt`: the prompt used to detect the object. Objects matching this prompt are blurred during anonymization.
- `should_inside`: a list of prompts that the detected object should be located inside. If the object is not inside any of these prompts, the tool does not blur it.
```json
{
    "prompts": [
        {
            "prompt": "license plate",
            "should_inside": ["car", "bus", "..."]
        },
        {
            "prompt": "human face",
            "should_inside": ["person", "human body", "..."]
        }
    ]
}
```
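The `should_inside` rule can be pictured as a containment test between detection boxes. Below is a minimal sketch, assuming simple `(x1, y1, x2, y2)` pixel boxes; the function names and the exact overlap metric here are illustrative, not the tool's actual implementation:

```python
def intersection_area(a, b):
    # Boxes are (x1, y1, x2, y2) in pixels.
    w = min(a[2], b[2]) - max(a[0], b[0])
    h = min(a[3], b[3]) - max(a[1], b[1])
    return max(0, w) * max(0, h)

def inside_ratio(obj, container):
    # Fraction of the object's area covered by the container box.
    area = (obj[2] - obj[0]) * (obj[3] - obj[1])
    return intersection_area(obj, container) / area if area else 0.0

def should_blur(obj_box, container_boxes, threshold=0.9):
    # Blur only if the object lies (mostly) inside at least one
    # validation-prompt box, mirroring the `should_inside` rule.
    return any(inside_ratio(obj_box, c) >= threshold for c in container_boxes)

plate = (40, 60, 80, 75)          # hypothetical "license plate" detection
car = (0, 0, 100, 100)            # hypothetical "car" detection
print(should_blur(plate, [car]))  # True: the plate box lies inside the car box
```

With this scheme, a "license plate" detection floating in the sky (not inside any "car" or "bus" box) would be rejected as a false positive and left unblurred.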
Set your configuration in the configuration files under the `config` folder according to your use case. The instructions below walk through each configuration file.
**`config/anonymize_with_unified_model.yaml`**

```yaml
rosbag:
  input_bags_folder: "/path/to/input_bag_folder" # Path to the input folder which contains ROS 2 bag files
  output_bags_folder: "/path/to/output_folder" # Path to the output ROS 2 bag folder
  output_save_compressed_image: True # Save images as compressed images (True or False)
  output_storage_id: "sqlite3" # Storage id for the output bag file (`sqlite3` or `mcap`)

grounding_dino:
  box_threshold: 0.1 # Threshold for the bounding box (float)
  text_threshold: 0.1 # Threshold for the text (float)
  nms_threshold: 0.1 # Threshold for the non-maximum suppression (float)

open_clip:
  score_threshold: 0.7 # Validity threshold for the OpenCLIP model (float)

yolo:
  confidence: 0.15 # Confidence threshold for the YOLOv8 model (float)

bbox_validation:
  iou_threshold: 0.9 # Threshold for the intersection over union (float); if the IoU is greater than this threshold, the object is treated as inside the validation prompt

blur:
  kernel_size: 31 # Kernel size for the Gaussian blur (int)
  sigma_x: 11 # Sigma x for the Gaussian blur (int)
```
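The `kernel_size` and `sigma_x` parameters describe a standard Gaussian blur. As a rough, pure-Python sketch of what these numbers mean (the tool itself almost certainly delegates the actual filtering to an image-processing library):

```python
import math

def gaussian_kernel_1d(kernel_size=31, sigma=11.0):
    # Sample a 1-D Gaussian centered on the middle tap, then normalize
    # so the weights sum to 1 (blurring preserves overall brightness).
    center = kernel_size // 2
    weights = [math.exp(-((i - center) ** 2) / (2 * sigma ** 2))
               for i in range(kernel_size)]
    total = sum(weights)
    return [w / total for w in weights]

kernel = gaussian_kernel_1d(31, 11.0)
print(round(sum(kernel), 6))  # 1.0 (weights are normalized)
```

A larger `kernel_size` or `sigma_x` spreads weight further from the center tap, so convolving the image with the kernel mixes in more distant pixels and produces a stronger blur over the detected region.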
**`config/yolo_create_dataset.yaml`**

```yaml
rosbag:
  input_bags_folder: "/path/to/input_bag_folder" # Path to the input ROS 2 bag files folder

dataset:
  output_dataset_folder: "/path/to/output/dataset" # Path to the output dataset folder
  output_dataset_subsample_coefficient: 25 # Subsample coefficient for the dataset (int)

grounding_dino:
  box_threshold: 0.1 # Threshold for the bounding box (float)
  text_threshold: 0.1 # Threshold for the text (float)
  nms_threshold: 0.1 # Threshold for the non-maximum suppression (float)

open_clip:
  score_threshold: 0.7 # Validity threshold for the OpenCLIP model (float)

bbox_validation:
  iou_threshold: 0.9 # Threshold for the intersection over union (float); if the IoU is greater than this threshold, the object is treated as inside the validation prompt
```
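The subsample coefficient keeps the dataset from being flooded with near-identical consecutive frames. Conceptually, it amounts to keeping every N-th image; this sketch is only an illustration of that idea, not the tool's actual code:

```python
def subsample(frames, coefficient=25):
    # Keep every `coefficient`-th frame, matching the intuition behind
    # output_dataset_subsample_coefficient (names here are illustrative).
    return frames[::coefficient]

frames = list(range(100))          # stand-ins for 100 decoded images
print(len(subsample(frames, 25)))  # 4 frames kept out of 100
```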
**`config/yolo_train.yaml`**

```yaml
dataset:
  input_dataset_yaml: "path/to/data.yaml" # Path to the config file of the dataset, which is created in the previous step

yolo:
  epochs: 100 # Number of epochs for the YOLOv8 model (int)
  model: "yolov8x.pt" # Base model for YOLOv8 (`yolov8x.pt`, `yolov8l.pt`, `yolov8m.pt`, `yolov8n.pt`)
```
**`config/yolo_anonymize.yaml`**

```yaml
rosbag:
  input_bag_path: "/path/to/input_bag/bag.mcap" # Path to the input ROS 2 bag file with `mcap` or `sqlite3` extension
  output_bag_path: "/path/to/output_bag_file" # Path to the output ROS 2 bag folder
  output_save_compressed_image: True # Save images as compressed images (True or False)
  output_storage_id: "sqlite3" # Storage id for the output bag file (`sqlite3` or `mcap`)

yolo:
  model: "path/to/yolo/model" # Path to the trained YOLOv8 model file (`.pt` extension) (you can download the pre-trained model from the releases page)
  config_path: "path/to/input/data.yaml" # Path to the config file of the dataset, which is created in the previous step
  confidence: 0.15 # Confidence threshold for the YOLOv8 model (float)

blur:
  kernel_size: 31 # Kernel size for the Gaussian blur (int)
  sigma_x: 11 # Sigma x for the Gaussian blur (int)
```
## Usage

The tool provides two options to anonymize images in ROS 2 bag files.

**Warning:** If your ROS 2 bag files contain custom message types from Autoware or any other package, you should source their workspaces. You can source the Autoware workspace with the following command:

```bash
source /path/to/your/workspace/install/setup.bash
```
### Option 1: Anonymize with the unified model

You provide a single rosbag, and the tool anonymizes the images in it with the unified model, which combines GroundingDINO, OpenCLIP, YOLOv8, and SegmentAnything. If you don't want to use the pre-trained YOLOv8 model, you can train your own by following the instructions in the second option.

Set your configuration in the `config/anonymize_with_unified_model.yaml` file.

```bash
python3 main.py config/anonymize_with_unified_model.yaml --anonymize_with_unified_model
```
### Option 2: Anonymize with a YOLOv8 model trained on a dataset created by the unified model

#### Step 1: Create a dataset

Create an initial dataset with the unified model. You can provide multiple ROS 2 bag files to create the dataset. After running the following command, the tool creates a dataset in YOLO format.

Set your configuration in the `config/yolo_create_dataset.yaml` file.

```bash
python3 main.py config/yolo_create_dataset.yaml --yolo_create_dataset
```
#### Step 2: Label the missing labels manually

The dataset created in the first step will be missing some labels, so you should add them manually with an annotation tool.
#### Step 3: Split the dataset

Split the dataset into training and validation sets, providing the path to the dataset folder created in the first step.

```bash
autoware-rosbag2-anonymizer-split-dataset /path/to/dataset/folder
```
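The split step can be pictured as a shuffled train/validation partition of the image list. A minimal sketch under assumed semantics; the actual CLI may use a different ratio, seed, or on-disk layout:

```python
import random

def split_dataset(samples, val_ratio=0.2, seed=42):
    # Shuffle deterministically, then cut off the validation share.
    # The 80/20 ratio here is an assumption, not the tool's documented default.
    rng = random.Random(seed)
    shuffled = samples[:]
    rng.shuffle(shuffled)
    n_val = int(len(shuffled) * val_ratio)
    return shuffled[n_val:], shuffled[:n_val]  # (train, val)

train, val = split_dataset([f"img_{i}.jpg" for i in range(10)])
print(len(train), len(val))  # 8 2
```

Shuffling before splitting matters here because consecutive rosbag frames are highly correlated; a chronological cut would give the validation set frames nearly identical to training ones.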
#### Step 4: Train the YOLOv8 model

Train the YOLOv8 model with the dataset created in the first step.

Set your configuration in the `config/yolo_train.yaml` file.

```bash
python3 main.py config/yolo_train.yaml --yolo_train
```
#### Step 5: Anonymize images in ROS 2 bag files

Anonymize images in ROS 2 bag files with the trained YOLOv8 model. Use the following command if you want to run only the YOLOv8 model; however, we recommend the unified model for better results. You can use your trained YOLOv8 model within the unified model by following Option 1.

Set your configuration in the `config/yolo_anonymize.yaml` file.

```bash
python3 main.py config/yolo_anonymize.yaml --yolo_anonymize
```
## Troubleshooting

- Error 1: `torch.OutOfMemoryError: CUDA out of memory`

  ```
  torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 1024.00 MiB. GPU 0 has a total capacity of 10.87 GiB of which 1010.88 MiB is free. Including non-PyTorch memory, this process has 8.66 GiB memory in use. Of the allocated memory 8.21 GiB is allocated by PyTorch, and 266.44 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation. See documentation for Memory Management (https://pytorch.org/docs/stable/notes/cuda.html#environment-variables)
  ```

  This error occurs when the GPU does not have enough memory to run the model. You can set the following environment variable to avoid it:

  ```bash
  export PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True
  ```
## Share your anonymized data

After anonymizing your data, you can share it with the Autoware community. To do so, create an issue and a pull request to the Autoware documentation repository.
## Citation
```bibtex
@article{liu2023grounding,
  title={Grounding dino: Marrying dino with grounded pre-training for open-set object detection},
  author={Liu, Shilong and Zeng, Zhaoyang and Ren, Tianhe and Li, Feng and Zhang, Hao and Yang, Jie and Li, Chunyuan and Yang, Jianwei and Su, Hang and Zhu, Jun and others},
  journal={arXiv preprint arXiv:2303.05499},
  year={2023}
}

@article{kirillov2023segany,
  title={Segment Anything},
  author={Kirillov, Alexander and Mintun, Eric and Ravi, Nikhila and Mao, Hanzi and Rolland, Chloe and Gustafson, Laura and Xiao, Tete and Whitehead, Spencer and Berg, Alexander C. and Lo, Wan-Yen and Doll{\`a}r, Piotr and Girshick, Ross},
  journal={arXiv:2304.02643},
  year={2023}
}

@software{ilharco_gabriel_2021_5143773,
  author    = {Ilharco, Gabriel and Wortsman, Mitchell and Wightman, Ross and Gordon, Cade and Carlini, Nicholas and Taori, Rohan and Dave, Achal and Shankar, Vaishaal and Namkoong, Hongseok and Miller, John and Hajishirzi, Hannaneh and Farhadi, Ali and Schmidt, Ludwig},
  title     = {OpenCLIP},
  month     = jul,
  year      = 2021,
  note      = {If you use this software, please cite it as below.},
  publisher = {Zenodo},
  version   = {0.1},
  doi       = {10.5281/zenodo.5143773},
  url       = {https://doi.org/10.5281/zenodo.5143773}
}
```