# Rosbag2 Anonymizer

## Overview

Autoware provides a tool (`autoware_rosbag2_anonymizer`) to anonymize ROS 2 bag files. This tool is useful when you want to share your data with the Autoware community while keeping it privacy-compliant.

With this tool you can blur any object (faces, license plates, etc.) in your bag files and obtain a new bag file with the blurred images.
## Installation

### Clone the repository

```bash
git clone https://github.com/autowarefoundation/autoware_rosbag2_anonymizer.git
cd autoware_rosbag2_anonymizer
```
### Download the pre-trained models

```bash
wget https://dl.fbaipublicfiles.com/segment_anything/sam_vit_h_4b8939.pth
wget https://huggingface.co/ShilongLiu/GroundingDINO/resolve/main/GroundingDINO_SwinB.cfg.py
wget https://huggingface.co/ShilongLiu/GroundingDINO/resolve/main/groundingdino_swinb_cogcoor.pth
wget https://github.com/autowarefoundation/autoware_rosbag2_anonymizer/releases/download/v0.0.0/yolov8x_anonymizer.pt
wget https://github.com/autowarefoundation/autoware_rosbag2_anonymizer/releases/download/v0.0.0/yolo_config.yaml
```
### Install the ROS 2 mcap dependencies if you will use mcap files

**Warning:** Make sure you have installed ROS 2 on your system.

```bash
sudo apt install ros-humble-rosbag2-storage-mcap
```
### Install the autoware_rosbag2_anonymizer tool

Before installing the tool, you should update the pip package manager:

```bash
python3 -m pip install pip -U
```

Then install the tool with the following command:

```bash
python3 -m pip install .
```
## Configuration
Define the prompts in the `validation.json` file. The tool uses these prompts to detect objects. You can add prompts as dictionaries under the `prompts` key. Each dictionary should have two keys:

- `prompt`: the prompt used to detect the object. Objects matching this prompt are blurred during anonymization.
- `should_inside`: a list of prompts that the detected object should be located inside. If the object is not inside any of these prompts, the tool does not blur it.
```json
{
    "prompts": [
        {
            "prompt": "license plate",
            "should_inside": ["car", "bus", "..."]
        },
        {
            "prompt": "human face",
            "should_inside": ["person", "human body", "..."]
        }
    ]
}
```
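The `should_inside` rule can be pictured as a containment test between detection boxes. Below is a minimal sketch, assuming simple `(x1, y1, x2, y2)` pixel boxes; the function names and the exact overlap metric here are illustrative, not the tool's actual implementation:

```python
def intersection_area(a, b):
    # Boxes are (x1, y1, x2, y2) in pixels.
    w = min(a[2], b[2]) - max(a[0], b[0])
    h = min(a[3], b[3]) - max(a[1], b[1])
    return max(0, w) * max(0, h)

def inside_ratio(obj, container):
    # Fraction of the object's area covered by the container box.
    area = (obj[2] - obj[0]) * (obj[3] - obj[1])
    return intersection_area(obj, container) / area if area else 0.0

def should_blur(obj_box, container_boxes, threshold=0.9):
    # Blur only if the object lies (mostly) inside at least one
    # validation-prompt box, mirroring the `should_inside` rule.
    return any(inside_ratio(obj_box, c) >= threshold for c in container_boxes)

plate = (40, 60, 80, 75)          # hypothetical "license plate" detection
car = (0, 0, 100, 100)            # hypothetical "car" detection
print(should_blur(plate, [car]))  # True: the plate box lies inside the car box
```

With this scheme, a "license plate" detection floating in the sky (not inside any "car" or "bus" box) would be rejected as a false positive and left unblurred.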
Set your configuration in the configuration files under the `config` folder according to your use case. The instructions below walk through each configuration file.
**`config/anonymize_with_unified_model.yaml`**

```yaml
rosbag:
  input_bags_folder: "/path/to/input_bag_folder" # Path to the input folder which contains ROS 2 bag files
  output_bags_folder: "/path/to/output_folder" # Path to the output ROS 2 bag folder
  output_save_compressed_image: True # Save images as compressed images (True or False)
  output_storage_id: "sqlite3" # Storage id for the output bag file (`sqlite3` or `mcap`)

grounding_dino:
  box_threshold: 0.1 # Threshold for the bounding box (float)
  text_threshold: 0.1 # Threshold for the text (float)
  nms_threshold: 0.1 # Threshold for the non-maximum suppression (float)

open_clip:
  score_threshold: 0.7 # Validity threshold for the OpenCLIP model (float)

yolo:
  confidence: 0.15 # Confidence threshold for the YOLOv8 model (float)

bbox_validation:
  iou_threshold: 0.9 # Threshold for the intersection over union (float); if the IoU is greater than this threshold, the object is treated as inside the validation prompt

blur:
  kernel_size: 31 # Kernel size for the Gaussian blur (int)
  sigma_x: 11 # Sigma x for the Gaussian blur (int)
```
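The `kernel_size` and `sigma_x` parameters describe a standard Gaussian blur. As a rough, pure-Python sketch of what these numbers mean (the tool itself almost certainly delegates the actual filtering to an image-processing library):

```python
import math

def gaussian_kernel_1d(kernel_size=31, sigma=11.0):
    # Sample a 1-D Gaussian centered on the middle tap, then normalize
    # so the weights sum to 1 (blurring preserves overall brightness).
    center = kernel_size // 2
    weights = [math.exp(-((i - center) ** 2) / (2 * sigma ** 2))
               for i in range(kernel_size)]
    total = sum(weights)
    return [w / total for w in weights]

kernel = gaussian_kernel_1d(31, 11.0)
print(round(sum(kernel), 6))  # 1.0 (weights are normalized)
```

A larger `kernel_size` or `sigma_x` spreads weight further from the center tap, so convolving the image with the kernel mixes in more distant pixels and produces a stronger blur over the detected region.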
**`config/yolo_create_dataset.yaml`**

```yaml
rosbag:
  input_bags_folder: "/path/to/input_bag_folder" # Path to the input ROS 2 bag files folder

dataset:
  output_dataset_folder: "/path/to/output/dataset" # Path to the output dataset folder
  output_dataset_subsample_coefficient: 25 # Subsample coefficient for the dataset (int)

grounding_dino:
  box_threshold: 0.1 # Threshold for the bounding box (float)
  text_threshold: 0.1 # Threshold for the text (float)
  nms_threshold: 0.1 # Threshold for the non-maximum suppression (float)

open_clip:
  score_threshold: 0.7 # Validity threshold for the OpenCLIP model (float)

bbox_validation:
  iou_threshold: 0.9 # Threshold for the intersection over union (float); if the IoU is greater than this threshold, the object is treated as inside the validation prompt
```
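The subsample coefficient keeps the dataset from being flooded with near-identical consecutive frames. Conceptually, it amounts to keeping every N-th image; this sketch is only an illustration of that idea, not the tool's actual code:

```python
def subsample(frames, coefficient=25):
    # Keep every `coefficient`-th frame, matching the intuition behind
    # output_dataset_subsample_coefficient (names here are illustrative).
    return frames[::coefficient]

frames = list(range(100))          # stand-ins for 100 decoded images
print(len(subsample(frames, 25)))  # 4 frames kept out of 100
```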
**`config/yolo_train.yaml`**

```yaml
dataset:
  input_dataset_yaml: "path/to/data.yaml" # Path to the config file of the dataset, which is created in the previous step

yolo:
  epochs: 100 # Number of epochs for the YOLOv8 model (int)
  model: "yolov8x.pt" # Base model for YOLOv8 (`yolov8x.pt`, `yolov8l.pt`, `yolov8m.pt`, `yolov8n.pt`)
```
**`config/yolo_anonymize.yaml`**

```yaml
rosbag:
  input_bag_path: "/path/to/input_bag/bag.mcap" # Path to the input ROS 2 bag file with `mcap` or `sqlite3` extension
  output_bag_path: "/path/to/output_bag_file" # Path to the output ROS 2 bag folder
  output_save_compressed_image: True # Save images as compressed images (True or False)
  output_storage_id: "sqlite3" # Storage id for the output bag file (`sqlite3` or `mcap`)

yolo:
  model: "path/to/yolo/model" # Path to the trained YOLOv8 model file (`.pt` extension) (you can download the pre-trained model from the releases page)
  config_path: "path/to/input/data.yaml" # Path to the config file of the dataset, which is created in the previous step
  confidence: 0.15 # Confidence threshold for the YOLOv8 model (float)

blur:
  kernel_size: 31 # Kernel size for the Gaussian blur (int)
  sigma_x: 11 # Sigma x for the Gaussian blur (int)
```
## Usage

The tool provides two options to anonymize images in ROS 2 bag files.

**Warning:** If your ROS 2 bag files contain custom message types from Autoware or any other package, you should source their workspaces. You can source the Autoware workspace with the following command:

```bash
source /path/to/your/workspace/install/setup.bash
```
### Option 1: Anonymize with the unified model

You provide a single rosbag, and the tool anonymizes the images in it with the unified model, which combines GroundingDINO, OpenCLIP, YOLOv8, and SegmentAnything. If you don't want to use the pre-trained YOLOv8 model, you can train your own by following the instructions in the second option.

Set your configuration in the `config/anonymize_with_unified_model.yaml` file.

```bash
python3 main.py config/anonymize_with_unified_model.yaml --anonymize_with_unified_model
```
### Option 2: Anonymize with a YOLOv8 model trained on a dataset created by the unified model

#### Step 1: Create a dataset

Create an initial dataset with the unified model. You can provide multiple ROS 2 bag files to create the dataset. After running the following command, the tool creates a dataset in YOLO format.

Set your configuration in the `config/yolo_create_dataset.yaml` file.

```bash
python3 main.py config/yolo_create_dataset.yaml --yolo_create_dataset
```
#### Step 2: Label the missing labels manually

The dataset created in the first step will be missing some labels, so you should add them manually with an annotation tool.
#### Step 3: Split the dataset

Split the dataset into training and validation sets, providing the path to the dataset folder created in the first step.

```bash
autoware-rosbag2-anonymizer-split-dataset /path/to/dataset/folder
```
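The split step can be pictured as a shuffled train/validation partition of the image list. A minimal sketch under assumed semantics; the actual CLI may use a different ratio, seed, or on-disk layout:

```python
import random

def split_dataset(samples, val_ratio=0.2, seed=42):
    # Shuffle deterministically, then cut off the validation share.
    # The 80/20 ratio here is an assumption, not the tool's documented default.
    rng = random.Random(seed)
    shuffled = samples[:]
    rng.shuffle(shuffled)
    n_val = int(len(shuffled) * val_ratio)
    return shuffled[n_val:], shuffled[:n_val]  # (train, val)

train, val = split_dataset([f"img_{i}.jpg" for i in range(10)])
print(len(train), len(val))  # 8 2
```

Shuffling before splitting matters here because consecutive rosbag frames are highly correlated; a chronological cut would give the validation set frames nearly identical to training ones.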
#### Step 4: Train the YOLOv8 model

Train the YOLOv8 model with the dataset created in the first step.

Set your configuration in the `config/yolo_train.yaml` file.

```bash
python3 main.py config/yolo_train.yaml --yolo_train
```
#### Step 5: Anonymize images in ROS 2 bag files

Anonymize images in ROS 2 bag files with the trained YOLOv8 model. Use the following command if you want to run only the YOLOv8 model; however, we recommend the unified model for better results. You can use your trained YOLOv8 model within the unified model by following Option 1.

Set your configuration in the `config/yolo_anonymize.yaml` file.

```bash
python3 main.py config/yolo_anonymize.yaml --yolo_anonymize
```
## Troubleshooting

- Error 1: `torch.OutOfMemoryError: CUDA out of memory`

  ```
  torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 1024.00 MiB. GPU 0 has a total capacity of 10.87 GiB of which 1010.88 MiB is free. Including non-PyTorch memory, this process has 8.66 GiB memory in use. Of the allocated memory 8.21 GiB is allocated by PyTorch, and 266.44 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation. See documentation for Memory Management (https://pytorch.org/docs/stable/notes/cuda.html#environment-variables)
  ```

  This error occurs when the GPU does not have enough memory to run the model. You can set the following environment variable to avoid it:

  ```bash
  export PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True
  ```
## Share your anonymized data

After anonymizing your data, you can share it with the Autoware community. To do so, create an issue and a pull request to the Autoware documentation repository.
## Citation
```bibtex
@article{liu2023grounding,
  title={Grounding dino: Marrying dino with grounded pre-training for open-set object detection},
  author={Liu, Shilong and Zeng, Zhaoyang and Ren, Tianhe and Li, Feng and Zhang, Hao and Yang, Jie and Li, Chunyuan and Yang, Jianwei and Su, Hang and Zhu, Jun and others},
  journal={arXiv preprint arXiv:2303.05499},
  year={2023}
}

@article{kirillov2023segany,
  title={Segment Anything},
  author={Kirillov, Alexander and Mintun, Eric and Ravi, Nikhila and Mao, Hanzi and Rolland, Chloe and Gustafson, Laura and Xiao, Tete and Whitehead, Spencer and Berg, Alexander C. and Lo, Wan-Yen and Doll{\`a}r, Piotr and Girshick, Ross},
  journal={arXiv:2304.02643},
  year={2023}
}

@software{ilharco_gabriel_2021_5143773,
  author    = {Ilharco, Gabriel and Wortsman, Mitchell and Wightman, Ross and Gordon, Cade and Carlini, Nicholas and Taori, Rohan and Dave, Achal and Shankar, Vaishaal and Namkoong, Hongseok and Miller, John and Hajishirzi, Hannaneh and Farhadi, Ali and Schmidt, Ludwig},
  title     = {OpenCLIP},
  month     = jul,
  year      = 2021,
  note      = {If you use this software, please cite it as below.},
  publisher = {Zenodo},
  version   = {0.1},
  doi       = {10.5281/zenodo.5143773},
  url       = {https://doi.org/10.5281/zenodo.5143773}
}
```