YuqiYang213/TaskDiffusion

[ICLR2025] Multi-Task Dense Predictions via Unleashing the Power of Diffusion

Abstract

We provide the code for TaskDiffusion, a novel multi-task dense prediction framework based on diffusion models. Our code supports the PASCAL-Context and NYUD-v2 datasets with ViT backbones.

  • TaskDiffusion builds a novel decoder module based on diffusion models that captures the underlying conditional distribution of the predictions.
  • To further unlock the potential of diffusion models for multi-task dense prediction, TaskDiffusion introduces a novel joint denoising diffusion process that captures task relations during denoising.
  • Our proposed TaskDiffusion achieves new state-of-the-art (SOTA) performance with superior efficiency on PASCAL-Context and NYUD-v2.

Please check the paper for more details.

[Figure: Framework overview of the proposed TaskDiffusion for multi-task scene understanding.]

Installation

1. Environment

You can use the following commands to prepare your environment.

conda create -n taskdiffusion python=3.7
conda activate taskdiffusion
pip install tqdm Pillow==9.5 easydict pyyaml imageio scikit-image tensorboard six
pip install opencv-python==4.7.0.72 setuptools==59.5.0

pip install torch==1.10.0+cu111 torchvision==0.11.0+cu111 torchaudio==0.10.0 -f https://download.pytorch.org/whl/torch_stable.html
pip install timm==0.5.4 einops==0.4.1
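
As a quick sanity check after installation, the pinned versions above can be verified at runtime. The sketch below is illustrative and not part of the repo; the version strings are copied from the pip commands, and the helper names are my own:

```python
import importlib

# Pinned versions copied from the pip commands above. Local build tags
# such as '+cu111' are ignored when comparing.
EXPECTED = {
    "torch": "1.10.0",
    "torchvision": "0.11.0",
    "timm": "0.5.4",
    "einops": "0.4.1",
}

def version_matches(installed, expected):
    """True if the installed version equals the pinned one,
    ignoring local build tags like '+cu111'."""
    return installed.split("+")[0] == expected

def check_environment(expected=EXPECTED):
    """Return {package: problem} for anything missing or mismatched;
    an empty dict means the environment looks right."""
    problems = {}
    for name, want in expected.items():
        try:
            mod = importlib.import_module(name)
        except ImportError:
            problems[name] = "not installed"
            continue
        have = getattr(mod, "__version__", "unknown")
        if not version_matches(have, want):
            problems[name] = have
    return problems
```

Running `check_environment()` inside the activated conda environment should return an empty dict if the installs above succeeded.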

2. Data

You can download the PASCAL-Context and NYUD-v2 datasets from ATRC's repository as PASCALContext.tar.gz and NYUDv2.tar.gz, then extract them:

PASCAL-Context

tar xfvz PASCALContext.tar.gz

NYUD-v2

tar xfvz NYUDv2.tar.gz

Attention: you need to set the db_root variable in configs/mypath.py to the root directory of your own datasets.
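
To catch path mistakes early, you can check that db_root actually contains the extracted datasets. This is a minimal sketch, not repo code; the folder names are assumed to match the tarballs above, so adjust them if your layout differs:

```python
from pathlib import Path

# Folder names assumed to match the extracted tarballs above
# (PASCALContext.tar.gz, NYUDv2.tar.gz); adjust if yours differ.
EXPECTED_DIRS = ("PASCALContext", "NYUDv2")

def check_db_root(db_root, expected=EXPECTED_DIRS):
    """Map each expected dataset folder to whether it exists under db_root."""
    root = Path(db_root)
    return {name: (root / name).is_dir() for name in expected}
```

Call `check_db_root(db_root)` with the value you set in configs/mypath.py; any False entry means that dataset was not extracted where the code will look for it.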

3. Training

You can train your own model using the following commands.

PASCAL-Context:

bash run_TaskDiffusion_pascal.sh

NYUD-v2:

bash run_TaskDiffusion_nyud.sh

If you want to train a model based on ViT-Base, modify the --config_exp argument in the corresponding .sh file.

You can also modify the output directory in ./configs.

4. Evaluate the model

The training script itself includes evaluation. To run inference with pre-trained models, use the following commands.

PASCAL-Context:

bash infer_TaskDiffusion_pascal.sh

NYUD-v2:

bash infer_TaskDiffusion_nyud.sh

For boundary evaluation, you can use the evaluation tools in this repo, following TaskPrompter.

Pre-trained models

We provide the pretrained models on PASCAL-Context and NYUD-v2.

Download pre-trained models

| Version | Dataset | Download | Depth (RMSE) | Segmentation (mIoU) | Human parsing (mIoU) | Saliency (maxF) | Normals (mErr) | Boundary (odsF) |
|---|---|---|---|---|---|---|---|---|
| TaskDiffusion (ViT-L) | PASCAL-Context | Link (Extraction code: j9u5) | - | 81.21 | 69.62 | 84.94 | 13.55 | 74.89 |
| TaskDiffusion w/ MLoRE (ViT-L) | PASCAL-Context | Link (Extraction code: gwhp) | - | 81.58 | 71.30 | 85.05 | 13.43 | 76.07 |
| TaskDiffusion (ViT-B) | PASCAL-Context | Link (Extraction code: xidm) | - | 78.83 | 67.40 | 85.31 | 13.38 | 74.68 |
| TaskDiffusion (ViT-L) | NYUD-v2 | Link (Extraction code: ngfp) | 0.5020 | 55.65 | - | - | 18.43 | 78.64 |
| TaskDiffusion w/ MLoRE (ViT-L) | NYUD-v2 | Link (Extraction code: fx2m) | 0.5033 | 56.66 | - | - | 18.13 | 78.89 |

Infer with the pre-trained models

To evaluate the pre-trained models, change the --trained_model MODEL_PATH argument in the corresponding infer .sh file to load the desired model.
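
To avoid typos when pointing the script at a checkpoint, a small helper can compose and validate the invocation. This is purely illustrative: only the --trained_model flag comes from this README, while the helper name and the paths you pass in are placeholders of my own:

```python
from pathlib import Path

def build_infer_command(script, model_path):
    """Compose the inference invocation with the --trained_model flag.
    Fails fast if the checkpoint file is missing, instead of letting
    the script die mid-run."""
    ckpt = Path(model_path)
    if not ckpt.is_file():
        raise FileNotFoundError(f"checkpoint not found: {ckpt}")
    return ["bash", script, "--trained_model", str(ckpt)]
```

For example, `build_infer_command("infer_TaskDiffusion_pascal.sh", "path/to/checkpoint.pth")` returns the argument list you would pass to subprocess.run, or raises immediately if the download is missing.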

Cite

If you find our work helpful, please cite our paper:

@inproceedings{yangmulti,
  title={Multi-Task Dense Predictions via Unleashing the Power of Diffusion},
  author={Yang, Yuqi and Jiang, Peng-Tao and Hou, Qibin and Zhang, Hao and Chen, Jinwei and Li, Bo},
  booktitle={The Thirteenth International Conference on Learning Representations},
  year={2025}
}

Contact

If you have any questions, please feel free to contact me (yangyq2000 AT mail DOT nankai DOT edu DOT cn).

Acknowledgement

This repository is built upon the codebases kindly provided by TaskPrompter and InvPT.
