DcaseNet

Author's repository for reproducing DcaseNet, an integrated pre-trained DNN that performs acoustic scene classification, audio tagging, and sound event detection. Implemented using PyTorch.

Project README

Overview

This GitHub project includes the PyTorch implementation for reproducing the experiments and DNN models used in the paper "DcaseNet: An integrated pretrained deep neural network for detecting and classifying acoustic scenes and events", accepted for presentation at IEEE ICASSP 2021.

DcaseNet is a DNN that performs acoustic scene classification (ASC), audio tagging (TAG), and sound event detection (SED) simultaneously. It adopts two-phase training: in the first phase, the three tasks are trained jointly; in the second, the model is fine-tuned for each individual task.
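
To make the shared-encoder, multi-head idea concrete, here is a minimal PyTorch sketch. The layer sizes, pooling strategy, and class counts are illustrative assumptions only and do not reproduce the exact DcaseNet architecture from the paper.

import torch
import torch.nn as nn

class DcaseNetSketch(nn.Module):
    def __init__(self, n_scenes=10, n_tags=80, n_events=14):
        super().__init__()
        # Shared convolutional encoder over a log-mel spectrogram input
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 64, 3, padding=1), nn.BatchNorm2d(64), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(64, 128, 3, padding=1), nn.BatchNorm2d(128), nn.ReLU(),
            nn.AdaptiveAvgPool2d((1, None)),   # collapse the frequency axis
        )
        # Clip-level heads for ASC and TAG, frame-level head for SED
        self.asc_head = nn.Linear(128, n_scenes)
        self.tag_head = nn.Linear(128, n_tags)
        self.sed_head = nn.Linear(128, n_events)

    def forward(self, x):                      # x: (batch, 1, n_mels, n_frames)
        h = self.encoder(x)                    # (batch, 128, 1, n_frames')
        h = h.squeeze(2).transpose(1, 2)       # (batch, n_frames', 128)
        clip = h.mean(dim=1)                   # temporal average pooling
        return {
            "asc": self.asc_head(clip),        # scene logits per clip
            "tag": self.tag_head(clip),        # tag logits per clip
            "sed": self.sed_head(h),           # event logits per frame
        }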

Usage

Environment Setting

We used the Nvidia GPU Cloud (NGC) for our experiments. Training was done on one Nvidia Titan RTX GPU. Our settings are available in launch_nvidia-gpu-cloud.sh.

Train

  1. Download the three datasets: DCASE 2020 Challenge Task 1-a, DCASE 2019 Challenge Task 2, and DCASE 2020 Challenge Task 3, and configure their directories.
  2. (Optional) Enter the virtual environment using NGC.
  3. Set the parameters in train.sh.
  4. Run train.sh (a rough sketch of the joint-training step is shown below).
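
As a rough illustration of the joint phase that train.sh drives, the sketch below computes the loss of one task per batch and updates the shared encoder through a single optimizer. The batch format, loss choices, and function names are hypothetical, not the repository's actual code.

import torch
import torch.nn as nn

model = DcaseNetSketch()                       # the sketch model defined above
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
criteria = {
    "asc": nn.CrossEntropyLoss(),              # single-label scene classification
    "tag": nn.BCEWithLogitsLoss(),             # multi-label audio tagging
    "sed": nn.BCEWithLogitsLoss(),             # frame-wise event activity
}

def joint_step(batch, task):
    # Each batch comes from one of the three datasets, so only that task's
    # loss is computed; the gradient still updates the shared encoder.
    out = model(batch["feat"])
    loss = criteria[task](out[task], batch["label"])
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()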

If you prefer to use the pre-trained joint DcaseNet and perform fine-tuning only, remove the 'Joint' experiment from train.sh and copy the Joint weights into your 'save_dir'.
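
A hypothetical sketch of this fine-tuning path, assuming the joint weights copied into 'save_dir' are a standard PyTorch state_dict (the file name joint_weights.pt is made up): load them and keep training with only one task's objective, SED in this example.

import torch
import torch.nn as nn

model = DcaseNetSketch()
state = torch.load("save_dir/joint_weights.pt", map_location="cpu")  # hypothetical file name
model.load_state_dict(state)                   # start from the jointly trained model

optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)   # smaller LR for fine-tuning
sed_criterion = nn.BCEWithLogitsLoss()

def finetune_step(batch):
    out = model(batch["feat"])
    loss = sed_criterion(out["sed"], batch["label"])  # only the SED objective
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()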

Evaluation

  1. Download the three datasets: DCASE 2020 Challenge Task 1-a, DCASE 2019 Challenge Task 2, and DCASE 2020 Challenge Task 3, and configure their directories.
  2. Set the parameters in evaluate_trained_models.sh.
  3. Run evaluate_trained_models.sh (a rough sketch of what evaluation computes is shown below).
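
For a rough idea of what evaluation amounts to (the actual metric wiring lives in evaluate_trained_models.sh), here is a hypothetical helper that measures clip-level ASC accuracy for a fine-tuned checkpoint; the checkpoint name and data loader are assumptions.

import torch

def asc_accuracy(model, loader):
    # Clip-level accuracy for the ASC head; `loader` is assumed to yield
    # batches with a "feat" spectrogram tensor and an integer "label".
    model.eval()
    correct, total = 0, 0
    with torch.no_grad():
        for batch in loader:
            pred = model(batch["feat"])["asc"].argmax(dim=1)
            correct += (pred == batch["label"]).sum().item()
            total += batch["label"].numel()
    return correct / total

# Usage (with your own DataLoader over the ASC evaluation split):
# model = DcaseNetSketch()
# model.load_state_dict(torch.load("save_dir/finetune_ASC.pt", map_location="cpu"))
# print(asc_accuracy(model, eval_loader))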

Windows

DCASENetShellScriptBuilder is a simple GUI program that generates a script that can be run on Windows. After a few checkboxes are configured and the dataset directories are set, the generated script performs training and evaluation. The program was contributed by yeongsoo, and no further maintenance will be done.

The program has three rows of options:

  1. The tasks on which to conduct joint training (if none are checked, the DcaseNet pre-trained on all three tasks is used).
  2. The tasks on which to perform fine-tuning; checking more than one task trains a separate DcaseNet for each fine-tuning task (checking at least one task is recommended).
  3. The tasks on which to perform evaluation (recommended to match the row above).

Below these rows are text boxes where one can set the directories of the downloaded datasets and the directory where trained models are saved. Note that when setting a dataset directory, the code in this repo expects the folder produced by unzipping the dataset.

(Screenshot: the DCASENetShellScriptBuilder GUI)

Email [email protected] for other details :-).

BibTeX

This repository provides the code for reproducing the paper below.

@inproceedings{jung2021dcasenet,
  title={DCASENet: An integrated pretrained deep neural network for detecting and classifying acoustic scenes and events},
  author={Jung, Jee-weon and Shim, Hye-jin and Kim, Ju-ho and Yu, Ha-Jin},
  booktitle={Proc. ICASSP},
  pages={621--625},
  year={2021},
  organization={IEEE}
}

TO-DO

Log

  • 2020.09.24. : Initial commit
  • 2020.10.18. : Overall validation & refactoring (thanks to yeongsoo)
  • 2020.11.04. : Added filetrees & finished refactoring