Use yt-dlp to download video and upload to the Internet Archive with metadata.
tubeup uses yt-dlp to download a YouTube video (or a video from any other provider supported by yt-dlp), and then uploads it with all metadata to the Internet Archive using the Python module internetarchive.
It was designed by the Bibliotheca Anonoma to archive single videos, playlists (see the warning below about uploading entire channels) or accounts to the Internet Archive.
We strongly recommend running this script on Linux or another POSIX system (such as macOS), preferably on a rented VPS rather than your personal machine or phone.
Recommended system specifications:
pipx installed
If your storage is too small, symlink it to something larger.
Install ffmpeg, pipx (typically packaged as python3-pipx, or python-pipx on Arch), and git.
For Debian/Ubuntu:
sudo apt install ffmpeg python3-pipx git
Then run:
pipx ensurepath
pipx install tubeup --include-deps
If you don't already have an Internet Archive account, register for one to give the script upload privileges.
Configure internetarchive with your Internet Archive account:
ia configure
You will be prompted for your login credentials for the Internet Archive account you use.
Once configured to upload, you're ready to go.
tubeup <url>
Each archived video gets its own Archive.org item. Check out what you've uploaded at http://archive.org/details/@YOURUSERNAME.
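When several URLs need archiving, the single tubeup command can be driven from a small script. The following is a minimal sketch and not part of tubeup: the helper names, the one-URL-per-line list format with `#` comments, and the choice to start one tubeup process per URL are all assumptions made for illustration. The sketch only records each exit code, since what tubeup returns on failure is not stated here.

```python
import subprocess


def parse_url_list(text):
    """Return the non-empty, non-comment lines of a URL list."""
    urls = []
    for line in text.splitlines():
        line = line.strip()
        # Assumed list format: blank lines and lines starting with '#' are skipped.
        if line and not line.startswith("#"):
            urls.append(line)
    return urls


def build_command(url):
    """Build the argument list for one run of `tubeup <url>`."""
    return ["tubeup", url]


def archive_all(urls):
    """Run tubeup once per URL and return a dict of URL -> exit code."""
    exit_codes = {}
    for url in urls:
        # check=False so that one failing run does not stop the remaining ones.
        result = subprocess.run(build_command(url), check=False)
        exit_codes[url] = result.returncode
    return exit_codes
```

A caller would read a text file, pass its contents to `parse_url_list`, and hand the result to `archive_all`. Running one process per URL keeps the runs independent, at the cost of starting tubeup repeatedly.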
Periodically, before running tubeup, upgrade it and its dependencies by running:
pipx upgrade-all
Dockerized tubeup is provided by etnguyen03/docker-tubeup. Instructions are provided.
Update the system: sudo apt update ; sudo apt upgrade
Install pipx and ffmpeg.
Configure internetarchive for your Archive.org account.
Usage:
  tubeup <url>... [--username <user>] [--password <pass>]
                  [--metadata=<key:value>...]
                  [--cookies=<filename>]
                  [--proxy <prox>]
                  [--quiet] [--debug]
                  [--use-download-archive]
                  [--output <output>]
                  [--ignore-existing-item]
  tubeup -h | --help
  tubeup --version
Arguments:
  <url>                         yt-dlp compatible URL to download.
                                Check yt-dlp documentation for a list
                                of compatible websites.
  --metadata=<key:value>        Custom metadata to add to the archive.org
                                item.
Options:
  -h --help                     Show this screen.
  -p --proxy <prox>             Use a proxy while uploading.
  -u --username <user>          Provide a username, for sites like Nico Nico Douga.
  -p --password <pass>          Provide a password, for sites like Nico Nico Douga.
  -a --use-download-archive     Record the video url to the download archive.
                                This will download only videos not listed in
                                the archive file. Record the IDs of all
                                downloaded videos in it.
  -q --quiet                    Just print errors.
  -d --debug                    Print all logs to stdout.
  -o --output <output>          yt-dlp output template.
  -i --ignore-existing-item     Don't check if an item already exists on archive.org
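The --use-download-archive behaviour described in the options can be pictured as a record of video IDs that is consulted before each download. The sketch below only illustrates that idea; it is not tubeup's or yt-dlp's code, and the class name, the in-memory set, and the ID strings are assumptions made for the example.

```python
class DownloadArchive:
    """Toy model of a download archive: a record of already-downloaded IDs."""

    def __init__(self, recorded_ids=()):
        self._ids = set(recorded_ids)

    def should_download(self, video_id):
        """Only videos not listed in the archive are downloaded."""
        return video_id not in self._ids

    def record(self, video_id):
        """Record the ID of a video that has been downloaded."""
        self._ids.add(video_id)


def process(video_ids, archive):
    """Return the IDs that would be downloaded, recording each one as it goes."""
    downloaded = []
    for video_id in video_ids:
        if archive.should_download(video_id):
            downloaded.append(video_id)
            archive.record(video_id)
    return downloaded
```

Processing the same list a second time returns nothing, which is the point of the flag: repeated runs over a playlist or channel skip what has already been fetched.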
You can specify custom metadata with the --metadata flag.
For example, this script will upload your video to the Community Video collection by default.
You can specify a different collection with the --metadata flag:
tubeup --metadata=collection:opensource_audio <url>
Any arbitrary metadata can be added to the item, with a few exceptions. See the archive.org metadata documentation to learn more.
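If tubeup is launched from a script, the --metadata=<key:value> arguments can be generated from a dictionary. This is a small sketch under stated assumptions: the function names are hypothetical, and the sketch does no validation or escaping because how tubeup parses keys or values that themselves contain a colon is not covered here.

```python
def metadata_args(metadata):
    """Turn a dict into a list of --metadata=key:value arguments."""
    # One argument per pair; the usage text allows the flag to be repeated.
    return [f"--metadata={key}:{value}" for key, value in metadata.items()]


def tubeup_command(url, metadata):
    """Build the argument list for tubeup with custom metadata, URL last."""
    return ["tubeup", *metadata_args(metadata), url]
```

For instance, `tubeup_command(url, {"collection": "opensource_audio"})` produces the same arguments as the command-line example with the collection flag, ready to pass to `subprocess.run`.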
Archive.org users can upload to four open collections:
Community Audio, identifier opensource_audio.
Community Software, identifier opensource_software.
Community Texts, identifier opensource.
Community Video, identifier opensource_movies.
Note that care should be taken when uploading entire channels. Read the appropriate section in the guide for creating collections, and contact the collections staff if you're uploading a channel or multiple channels on one subject (gaming or horticulture, for example). Internet Archive collections staff will either create a collection for you or merge any already-uploaded items, based on the YouTube uploader name, into a new collection.
Dumping entire channels into Community Video is abusive and may get your account locked. Talk to the Internet Archive admins before doing large uploads; it's better to ask for guidance or help first than to run afoul of the rules.
If you do not own a collection, you will need to be added as an admin for that collection before you can upload to it. Talk to the collection owner or staff if you need assistance with this.
yt-dlp cannot do simultaneous downloads, and it cannot prioritize live video over live chat on YouTube. For YouTube, which is what most people use it for, this could only be fixed by disabling live chat ripping so that video ripping can start. Even if that were acceptable, with a flag on our end that disables chat in order to get video (again, it is not), it would be canceled out by the next problem.
yt-dlp has an unacceptably high failure rate when --live-from-start is used: sometimes the result doesn't mux, in Twitch's case the result is incomplete, and the flag isn't supported by all extractors. The yt-dlp maintainers consider this flag experimental and have said it is unsuitable for archival purposes.
Do not use Tubeup to archive live YouTube (or any other site) video. We will not and cannot fix it, it's not even our problem, and any solutions are unpalatable since they involve more code complexity to maintain, on top of having to disable live chat for one extractor only for live video.
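A script that feeds URLs to tubeup could check for live streams first and skip them. The sketch below is one possible way to do that and is not part of tubeup: it assumes the `is_live` and `live_status` fields that yt-dlp reports in its info dictionary, treats upcoming and just-ended (`post_live`) streams as live by choice, and handles single-video URLs only. The function names are hypothetical.

```python
def is_live_stream(info):
    """Return True if a yt-dlp info dict describes a live or not-yet-final stream."""
    if info.get("is_live"):
        return True
    # Assumption: upcoming and just-ended streams are also better left alone.
    return info.get("live_status") in {"is_live", "is_upcoming", "post_live"}


def fetch_info(url):
    """Fetch metadata only (no download) through yt-dlp's Python API."""
    import yt_dlp  # imported lazily so is_live_stream works without yt-dlp

    with yt_dlp.YoutubeDL({"quiet": True}) as ydl:
        return ydl.extract_info(url, download=False)
```

A wrapper would call `fetch_info(url)` and only start tubeup when `is_live_stream` returns False, leaving the stream to be archived once it is available as a regular video.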
yt-dlp/internetarchive library calls, cleansing item output, subtitles collection, and numerous small improvements over time.
Copyright (C) 2024 Bibliotheca Anonoma
This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License along with this program. If not, see http://www.gnu.org/licenses/.