Once your YouTube video collection grows, it becomes hard to search for and find a specific video. That's where Tube Archivist comes in: by indexing your video collection with metadata from YouTube, you can organize, search and enjoy your archived YouTube videos offline, without hassle, through a convenient web interface.
Take a look at the example docker-compose.yml file provided. Use the latest or the named semantic version tag. The unstable tag is for intermediate testing and, as the name implies, is unstable: it should not be used on your main installation, only in a testing environment.
For minimal system requirements, the Tube Archivist stack needs around 2GB of available memory for a small testing setup and around 4GB of available memory for a mid to large sized installation.
Tube Archivist depends on three main components split up into separate docker containers:
The main Python application that displays and serves your video collection, built with Django.
REDIS_HOST are needed to tell Tube Archivist where Elasticsearch and Redis respectively are located.
HOST_GID allows Tube Archivist to chown the video files to the main host system user instead of the container user. Those two variables are optional; not setting them will disable that functionality. That might be needed if the underlying filesystem doesn't support
TA_HOST to match the system running Tube Archivist. This can be a domain like example.com, a subdomain like ta.example.com or an IP address like 192.168.1.20. Add it without the protocol and without the port. You can add multiple hostnames separated by a space. Any wrong configuration here will result in a Bad Request (400) response.
TA_PASSWORD to create the initial credentials.
ELASTIC_PASSWORD is the password for Elasticsearch. The environment variable ELASTIC_USER is optional, should you want to change the username from the default elastic.
TZ environment variable, defaults to UTC.
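As a rough sketch, the variables named above might appear in the environment block of the tubearchivist service as follows. Every value is a placeholder, the Redis hostname and the time zone are assumptions for illustration, and the list is not complete: refer to the provided docker-compose.yml for the full set.

```yaml
services:
  tubearchivist:
    environment:
      REDIS_HOST: "archivist-redis"      # hypothetical hostname of the Redis container
      HOST_GID: 1000                     # optional; placeholder group ID of the host user
      TA_HOST: "ta.example.com"          # no protocol, no port
      TA_PASSWORD: "change-me"           # placeholder initial password
      ELASTIC_PASSWORD: "change-me-too"  # placeholder Elasticsearch password
      TZ: "America/New_York"             # assumed example; defaults to UTC when unset
```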
If you have a collision on port 8000, the best solution is to use Docker's HOST_PORT and CONTAINER_PORT distinction. For example, to change the interface to port 9000, use 9000:8000 in your docker-compose file.
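A minimal sketch of that mapping, assuming the application service is named tubearchivist in your docker-compose file:

```yaml
services:
  tubearchivist:
    ports:
      # HOST_PORT:CONTAINER_PORT
      # The interface is reachable on host port 9000,
      # while the container keeps listening on 8000.
      - "9000:8000"
```

Only the left-hand side changes, so nothing inside the container needs to be reconfigured.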
Should that not be an option, the Tube Archivist container takes these two additional environment variables:
Changing either of these two environment variables will modify the files nginx.conf and uwsgi.ini at startup using sed in your container.
You can configure LDAP with the following environment variables:
true) Set to anything besides empty string to use LDAP authentication instead of local user authentication.
ldap://ldap-server:389) Set to the URI of your LDAP server.
true) Set to anything besides empty string to disable certificate checking when connecting over LDAPS.
uid=search-user,ou=users,dc=your-server) DN of the user that is able to perform searches on your LDAP account.
yoursecretpassword) Password for the search user.
ou=users,dc=your-server) Search base for user filter.
(objectClass=user)) Filter for valid users. Login usernames are automatically matched using uid and do not need to be specified in this filter.
When LDAP authentication is enabled, Django passwords (e.g. the password defined in TA_PASSWORD) will not allow you to log in; only the LDAP server is used.
Note: Tube Archivist depends on Elasticsearch 8.
Use bbilly1/tubearchivist-es to automatically get the recommended version, or use the official image with the version tag in the docker-compose file.
Stores video metadata and makes everything searchable. Also keeps track of the download queue.
Follow the documentation for additional installation details.
Functions as a cache and temporary link between the application and the file system. Used to store and display messages and configuration variables.
For some architectures it might be required to run Redis JSON on a nonstandard port. For example, to change the Redis port to 6380, set the following values:
REDIS_PORT=6380 to the tubearchivist service.
command: --port 6380 --loadmodule /usr/lib/redis/modules/rejson.so
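Put together, the two settings could look roughly like this in a docker-compose file. The Redis service name archivist-redis is a hypothetical placeholder; use whatever name your compose file gives that service.

```yaml
services:
  tubearchivist:
    environment:
      REDIS_PORT: 6380          # must match the port passed to Redis below
  archivist-redis:              # hypothetical service name
    command: --port 6380 --loadmodule /usr/lib/redis/modules/rejson.so
```

Both values have to agree, otherwise the application will look for Redis on a port it is not listening on.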
You will see the current version number of Tube Archivist in the footer of the interface so you can compare it with the latest release to make sure you are running the latest and greatest.
bbilly1/tubearchivist-es to automatically get the recommended version.
bbilly1/rejson, an unofficial rebuild for arm64.
Elasticsearch in Docker requires the host machine's kernel setting vm.max_map_count to be set to at least 262144.
To temporarily set the value, run:
sudo sysctl -w vm.max_map_count=262144
How to apply the change permanently depends on your host operating system:
vm.max_map_count = 262144 to the file /etc/sysctl.conf.
vm.max_map_count = 262144.
If you see a message similar to "failed to obtain node locks, tried [/usr/share/elasticsearch/data] and maybe these locations are not writable" when initially starting Elasticsearch, that probably means the container is not allowed to write files to the volume.
To fix that issue, shut down the container and on your host machine run:
chown 1000:0 -R /path/to/mount/point
This will match the permissions with the UID and GID of the Elasticsearch process within the container and should fix the issue.
The Elasticsearch index will turn read-only if the disk usage of the container goes above 95%, and will stay that way until the usage drops below 90% again. You will see error messages like disk usage exceeded flood-stage watermark.
Similarly, Tube Archivist will become all sorts of messed up when running out of disk space. There are some error messages in the logs when that happens, but it's best to make sure you have enough disk space before starting to download.
We have come far; nonetheless, we are not short of ideas on how to improve and extend this project. Issues waiting for you to tackle, in no particular order:
The best donation to Tube Archivist is your time. Take a look at the contribution page to get started.
The second best way to support the development is to provide for caffeinated beverages:
Big thank you to DigitalOcean for generously donating credit for the tubearchivist.com VPS and build server.