FunASR APP

Applications based on speech-related models from FunASR (ModelScope).

FunASR-APP

FunASR-APP is a speech application toolkit built on FunASR's open-source speech models. Its primary goal is to package those models into ready-to-use applications that are easy to run and to integrate into other projects.

ClipVideo

As the first application in FunASR-APP, ClipVideo lets users clip .mp4 video files or .wav audio files by choosing text segments from the recognition results generated by the Paraformer-long model.

With ClipVideo you can produce video clips with the following steps (in the Gradio service):

  • Step1: Upload your video file (or try the example videos below)
  • Step2: Copy the text segments you need to 'Text to Clip'
  • Step3: Adjust subtitle settings (if needed)
  • Step4: Click 'Clip' or 'Clip and Generate Subtitles'
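To make the idea of clipping by text segment more concrete, here is a minimal sketch of how a chosen text could be mapped back to a time span. It is not ClipVideo's actual implementation: the `Token` layout (one recognized character with start and end times in milliseconds), the function name `find_clip_span`, and the reading of `start_ost` and `end_ost` as millisecond offsets added to the clip boundaries are all assumptions made for illustration.

```python
from typing import List, Optional, Tuple

# Hypothetical layout: one (character, start_ms, end_ms) triple per recognized
# character. The real recognition output may be structured differently.
Token = Tuple[str, int, int]


def find_clip_span(tokens: List[Token], dest_text: str,
                   start_ost: int = 0, end_ost: int = 0) -> Optional[Tuple[int, int]]:
    """Return (start_ms, end_ms) for the first occurrence of dest_text, or None."""
    if not dest_text:
        return None
    # Each token holds exactly one character, so string indices equal token indices.
    full_text = "".join(ch for ch, _, _ in tokens)
    idx = full_text.find(dest_text)
    if idx < 0:
        return None
    first = tokens[idx]
    last = tokens[idx + len(dest_text) - 1]
    # Apply the (assumed millisecond) offsets; never start before time zero.
    return max(0, first[1] + start_ost), last[2] + end_ost


if __name__ == "__main__":
    demo = [("你", 0, 200), ("好", 200, 400), ("世", 400, 600), ("界", 600, 800)]
    print(find_clip_span(demo, "好世", end_ost=100))
```

A plain substring search like this only finds exact matches; punctuation or recognition differences between the copied text and the transcript would need extra normalization.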

Usage

git clone https://github.com/alibaba-damo-academy/FunASR-APP.git
cd FunASR-APP
# install modelscope
pip install "modelscope[audio_asr]" -f https://modelscope.oss-cn-beijing.aliyuncs.com/releases/repo.html
# python environments
pip install -r ClipVideo/requirments.txt

(Optional) If you want to clip video files with embedded subtitles

  1. ffmpeg and ImageMagick are required
  • On Ubuntu
apt-get -y update && apt-get -y install ffmpeg imagemagick
sed -i 's/none/read,write/g' /etc/ImageMagick-6/policy.xml
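  • On macOS
brew install imagemagick
# BSD sed on macOS needs an explicit (here empty) backup suffix after -i;
# adjust the version in the path to match your installation
sed -i '' 's/none/read,write/g' /usr/local/Cellar/imagemagick/7.1.1-8_1/etc/ImageMagick-7/policy.xml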
  2. Download the font file to ClipVideo/font
wget https://isv-data.oss-cn-hangzhou.aliyuncs.com/ics/MaaS/ClipVideo/STHeitiMedium.ttc -O ClipVideo/font/STHeitiMedium.ttc
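Before clipping with embedded subtitles, it can help to confirm that the optional pieces above are in place. The following is a small sketch, not part of ClipVideo: the function name `check_subtitle_prereqs` is hypothetical, and it assumes the tools are on `PATH` and that the font was saved to the path used in the `wget` command, relative to the current directory.

```python
import shutil
from pathlib import Path
from typing import Dict


def check_subtitle_prereqs(
    font_path: str = "ClipVideo/font/STHeitiMedium.ttc",
) -> Dict[str, bool]:
    """Report which optional subtitle dependencies appear to be available."""
    return {
        "ffmpeg": shutil.which("ffmpeg") is not None,
        # ImageMagick 7 installs `magick`; ImageMagick 6 installs `convert`.
        "imagemagick": any(
            shutil.which(name) is not None for name in ("magick", "convert")
        ),
        "font": Path(font_path).is_file(),
    }


if __name__ == "__main__":
    for name, ok in check_subtitle_prereqs().items():
        print(f"{name}: {'found' if ok else 'missing'}")
```

This only checks that the files and executables exist; it does not verify that the ImageMagick policy edit took effect.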

Experience ClipVideo in ModelScope

You can try ClipVideo in a ModelScope Space: link.

Use ClipVideo as Gradio Service

You can run your own ClipVideo service, identical to the one in the ModelScope Space, as follows:

python clipvideo/gradio_service.py

Then visit localhost:7860 to open the Gradio service, and use ClipVideo by following the four steps listed in the ClipVideo section above.
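If the page does not load, a quick way to tell whether the service is listening is to try a TCP connection to the port. This is a generic sketch using only the Python standard library; the helper name `is_port_open` is hypothetical, and the default port is simply the 7860 mentioned above.

```python
import socket


def is_port_open(host: str = "localhost", port: int = 7860,
                 timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


if __name__ == "__main__":
    state = "reachable" if is_port_open() else "not reachable"
    print(f"Gradio service on localhost:7860 is {state}")
```

A successful connection only shows that something is listening on the port, not that it is the ClipVideo service.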

Use ClipVideo in command line

ClipVideo also supports recognition and clipping from the command line:

# working in ClipVideo/
# step1: Recognize
python clipvideo/videoclipper.py --stage 1 \
                       --file examples/2022云栖大会_片段.mp4 \
                       --output_dir ./output
# now you can find recognition results and entire SRT file in ./output/
# step2: Clip
python clipvideo/videoclipper.py --stage 2 \
                       --file examples/2022云栖大会_片段.mp4 \
                       --output_dir ./output \
                       --dest_text '我们把它跟乡村振兴去结合起来,利用我们的设计的能力' \
                       --start_ost 0 \
                       --end_ost 100 \
                       --output_file './output/res.mp4'
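The two stages above can also be driven from a script. The sketch below wraps them with `subprocess`, using only the flags shown in the commands; the helper names `build_command` and `recognize_then_clip` are hypothetical, and it assumes the working directory is ClipVideo/, as in the comments above.

```python
import subprocess
import sys
from typing import List, Optional

SCRIPT = "clipvideo/videoclipper.py"  # script path taken from the commands above


def build_command(stage: int, file: str, output_dir: str,
                  dest_text: Optional[str] = None,
                  start_ost: int = 0, end_ost: int = 0,
                  output_file: Optional[str] = None) -> List[str]:
    """Build the argument list for one videoclipper.py stage."""
    cmd = [sys.executable, SCRIPT, "--stage", str(stage),
           "--file", file, "--output_dir", output_dir]
    if stage == 2:
        if dest_text is None or output_file is None:
            raise ValueError("stage 2 needs dest_text and output_file")
        cmd += ["--dest_text", dest_text,
                "--start_ost", str(start_ost),
                "--end_ost", str(end_ost),
                "--output_file", output_file]
    return cmd


def recognize_then_clip(file: str, output_dir: str, dest_text: str,
                        output_file: str, cwd: str = ".") -> None:
    """Run stage 1 (recognize), then stage 2 (clip); raise if either fails."""
    subprocess.run(build_command(1, file, output_dir), check=True, cwd=cwd)
    subprocess.run(
        build_command(2, file, output_dir, dest_text=dest_text,
                      end_ost=100, output_file=output_file),
        check=True, cwd=cwd,
    )
```

Passing the arguments as a list avoids shell quoting problems with non-ASCII file names and text segments such as those in the example.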

FunASR aims to build a bridge between academic research and industrial applications of speech recognition. By supporting training and fine-tuning of the industrial-grade speech recognition models released on ModelScope, it lets researchers and developers study and deploy speech recognition models more conveniently and helps the speech recognition ecosystem grow. ASR for Fun!

Open Source Agenda is not affiliated with "FunASR APP" Project. README Source: alibaba-damo-academy/FunASR-APP
License
MIT