WIP tag-based file organizer & search
I am currently running a read-only demonstration copy of Etiquette at https://etiquette.voussoir.net where you can browse around.
Etiquette is a tag-based file organization system with a web interface, built with Flask and SQLite3. Tag-based systems solve problems that a traditional folder hierarchy can't: which folder should a file go in if it equally belongs in both? and how do I make my files searchable without littering the filenames themselves with keywords?
Etiquette is unique because the tags themselves are hierarchical. If you tag one of your vacation photos with family.parents.dad, it will automatically appear in searches for family.parents and family as well. A traditional folder system, here called albums, is available to bundle files that always belong together without creating a bespoke tag to represent that bundle. Regardless, the files on disk are never modified.
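To illustrate the hierarchical search behavior described above, here is a minimal sketch. It is not Etiquette's implementation: the helper names (ancestors, search) and the in-memory dict of photos are hypothetical, and it assumes the hierarchy can be read directly from the dotted tag name.

```python
def ancestors(tag: str) -> list[str]:
    """Return the tag itself plus every ancestor, most specific first."""
    parts = tag.split(".")
    return [".".join(parts[:i]) for i in range(len(parts), 0, -1)]


def search(photos: dict[str, set[str]], query: str) -> list[str]:
    """Return the names of photos tagged with the query tag or any descendant of it."""
    return sorted(
        name
        for name, tags in photos.items()
        if any(query in ancestors(tag) for tag in tags)
    )


# Hypothetical sample data: filename -> set of tags applied to it.
photos = {
    "beach.jpg": {"family.parents.dad"},
    "picnic.jpg": {"family.siblings"},
    "sunset.jpg": {"nature"},
}

print(search(photos, "family.parents"))
print(search(photos, "family"))
```

The point of the sketch is that the photo only stores its most specific tag; the broader matches are derived at search time.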
As you'll see below, Etiquette has a core backend package and multiple frontends that use it. These frontend applications will use import etiquette to access the backend code. Therefore, the etiquette package needs to be in the right place for Python to find it for import.
Run pip install -r requirements.txt --upgrade.
Make a new folder somewhere on your computer, and add this folder to your PYTHONPATH environment variable. For example, I might use D:\pythonpath or ~/pythonpath. Close and re-open your Command Prompt / Terminal so it reloads the environment variables.
Inside that folder, add a symlink pointing to the etiquette folder:
The repository you are looking at now is D:\Git\Etiquette or ~/Git/Etiquette. You can see the folder called etiquette.
Windows: mklink /d fakepath realpath
for example mklink /d "D:\pythonpath\etiquette" "D:\Git\Etiquette\etiquette"
Linux: ln --symbolic realpath fakepath
for example ln --symbolic ~/Git/Etiquette/etiquette ~/pythonpath/etiquette
Run python -c "import etiquette; print(etiquette)" to confirm.
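If the confirmation step fails, it can help to ask Python where it would look. The following is a small sketch using only the standard library; the function name is_importable is my own, and nothing here is part of Etiquette.

```python
import importlib.util
import sys


def is_importable(name: str) -> bool:
    """Return True if Python can find a top-level package or module called `name`."""
    return importlib.util.find_spec(name) is not None


if is_importable("etiquette"):
    # origin is the file Python would load, typically the package's __init__.py.
    print("found:", importlib.util.find_spec("etiquette").origin)
else:
    print("etiquette was not found. Python searched these folders:")
    for folder in sys.path:
        print("   ", folder)
```

If your PYTHONPATH folder is missing from the printed list, the environment variable was probably not reloaded by your terminal.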
In order to prevent the accidental creation of Etiquette databases, you must first use etiquette_cli.py init to create your database.
cd to the folder where you'd like to create the Etiquette database.
Run python frontends/etiquette_cli.py --help to learn about the available commands.
Run python frontends/etiquette_cli.py init to create a database in the current directory.
Note: Do not cd into the frontends folder. Stay in the folder that contains your _etiquette database and specify the full path of the frontend launcher. For example:
Windows:
D:\somewhere> python D:\Git\Etiquette\frontends\etiquette_cli.py
Linux:
/somewhere $ python /Git/Etiquette/frontends/etiquette_cli.py
It is expected that you create a shortcut file or launch script so you don't have to type the whole filepath every time.
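As one possible shape for such a launch script, here is a minimal Python sketch. The REPO location is an assumption based on the example paths above, and build_command and run are hypothetical helper names; adjust both to your own setup.

```python
import pathlib
import subprocess
import sys

# Assumption: the repository was cloned to ~/Git/Etiquette. Change this to match your machine.
REPO = pathlib.Path.home() / "Git" / "Etiquette"
CLI = REPO / "frontends" / "etiquette_cli.py"


def build_command(args: list[str]) -> list[str]:
    """Build the argv that runs etiquette_cli.py with the current Python interpreter."""
    return [sys.executable, str(CLI), *args]


def run(args: list[str]) -> int:
    """Run the CLI without changing directory, so it sees the _etiquette database in the cwd."""
    return subprocess.call(build_command(args))


print(build_command(["--help"]))
```

Saved somewhere on your PATH, a script like this could end with run(sys.argv[1:]) so that every argument is forwarded to the real frontend.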
Use etiquette_cli init to create the database in the desired directory.
Run python frontends/etiquette_flask/etiquette_flask_dev.py [port] to launch the Flask server. Port defaults to 5000 if not provided.
Open your web browser to localhost:<port>.
Note: Do not cd into the frontends folder. Stay in the folder that contains your _etiquette database and specify the full path of the frontend launcher. For example:
Windows:
D:\somewhere> python D:\Git\Etiquette\frontends\etiquette_flask\etiquette_flask_dev.py 5001
Linux:
/somewhere $ python /Git/Etiquette/frontends/etiquette_flask/etiquette_flask_dev.py 5001
Add --help to learn the arguments.
It is expected that you create a shortcut file or launch script so you don't have to type the whole filepath every time.
You already know that the frontend code imports the backend code. But now, gunicorn needs to import the frontend code.
Use etiquette_cli init to create the database in the desired directory.
Add a symlink to the frontends/etiquette_flask folder into the folder you added to your PYTHONPATH earlier.
ln --symbolic realpath fakepath
for example ln --symbolic ~/Git/Etiquette/frontends/etiquette_flask ~/pythonpath/etiquette_flask
Add a symlink to frontends/etiquette_flask/etiquette_flask_prod.py into the folder you added to your PYTHONPATH, or into the folder from which you will run gunicorn.
ln --symbolic realpath fakepath
for example ln --symbolic ~/Git/Etiquette/frontends/etiquette_flask/etiquette_flask_prod.py ~/pythonpath/etiquette_flask_prod.py
or
ln --symbolic ~/Git/Etiquette/frontends/etiquette_flask/etiquette_flask_prod.py ./etiquette_flask_prod.py
where ./ is the location from which you will run gunicorn.
If you are using a proxy like NGINX, make sure you are setting X-Forwarded-For so that Etiquette sees the user's real IP, and not the proxy's own (127.0.0.1) IP. For example:
location / {
    ...
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    ...
}
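To show why the header matters, here is a rough sketch of how an application behind a single proxy might choose the client address. This is not Etiquette's code: the function client_ip, the plain dict of headers, and the trusted_proxies default are all assumptions made for illustration.

```python
def client_ip(headers: dict[str, str], remote_addr: str,
              trusted_proxies: tuple[str, ...] = ("127.0.0.1",)) -> str:
    """Pick the client address, believing X-Forwarded-For only when a trusted proxy sent it."""
    forwarded = headers.get("X-Forwarded-For", "").strip()
    if remote_addr not in trusted_proxies or not forwarded:
        # Direct connection, or the proxy did not set the header: use the socket address.
        return remote_addr
    # $proxy_add_x_forwarded_for appends the address NGINX saw to the end of the list,
    # so with one proxy the last entry is the one that proxy vouches for.
    return forwarded.split(",")[-1].strip()


print(client_ip({"X-Forwarded-For": "203.0.113.7"}, "127.0.0.1"))
print(client_ip({}, "127.0.0.1"))
```

Without the header, every request appears to come from 127.0.0.1, which is the situation the NGINX setting above avoids.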
To run non-daemonized, on a specific port, with logging to the terminal, I use:
~/cmd/python ~/cmd/gunicorn_py etiquette_flask_prod:site --bind "0.0.0.0:6667" --access-logfile "-" --access-logformat "%(h)s | %(t)s | %(r)s | %(s)s %(b)s"
It is expected that you create a shortcut file or launch script so you don't have to type the whole filepath every time.
Use etiquette_cli init to create the database in the desired directory.
Run python frontends/etiquette_repl.py to launch the Python interpreter with the PhotoDB pre-loaded into a variable called P. Try things like P.new_photo or P.digest_directory.
Note: Do not cd into the frontends folder. Stay in the folder that contains your _etiquette database and specify the full path of the frontend launcher. For example:
Windows:
D:\somewhere> python D:\Git\Etiquette\frontends\etiquette_repl.py
Linux:
/somewhere $ python /Git/Etiquette/frontends/etiquette_repl.py
It is expected that you create a shortcut file or launch script so you don't have to type the whole filepath every time.
Let's say you store your photos in D:\Documents\Photos, and you want to tag the files with Etiquette. You can get started with these steps:
cd to that location. cd D:\Documents\Photos is probably fine.
Run etiquette_cli.py init to create the database. A folder called _etiquette will appear.
Run etiquette_cli.py digest . --ratelimit 1 --glob-filenames *.jpg to add the files into the database. You can use etiquette_cli.py digest --help to learn about this command.
Run etiquette_flask_dev.py 5000 to start the webserver on port 5000.
Open your web browser to localhost:5000 and begin browsing.
When adding new files to the database or reloading their metadata, Etiquette will create SHA256 hashes of the files. If you are using Etiquette to organize large media files, this may take a while. I was hesitant to add hashing and incur this slowdown, but the hashes greatly improve Etiquette's ability to detect when a file has been renamed or moved, which is important when you have invested your valuable time into adding tags to them. I hope that the hash time is perceived as a worthwhile tradeoff.
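For a sense of what the hashing cost looks like, here is a sketch of chunked SHA256 hashing with an optional read-rate cap. It is only an illustration of the general technique, not Etiquette's own code: sha256_file and its bytes_per_second parameter are hypothetical names, loosely modeled on the idea behind throttling hash IO.

```python
import hashlib
import os
import tempfile
import time


def sha256_file(path, chunk_size=2**20, bytes_per_second=None):
    """Hash a file chunk by chunk, optionally sleeping to cap the read rate."""
    hasher = hashlib.sha256()
    with open(path, "rb") as handle:
        while True:
            started = time.monotonic()
            chunk = handle.read(chunk_size)
            if not chunk:
                break
            hasher.update(chunk)
            if bytes_per_second:
                # Sleep off whatever remains of this chunk's time budget.
                budget = len(chunk) / bytes_per_second
                elapsed = time.monotonic() - started
                time.sleep(max(0.0, budget - elapsed))
    return hasher.hexdigest()


# Small demonstration on a throwaway file.
with tempfile.TemporaryDirectory() as folder:
    sample = os.path.join(folder, "sample.bin")
    with open(sample, "wb") as handle:
        handle.write(b"etiquette" * 1000)
    digest = sha256_file(sample, chunk_size=1024)
    throttled = sha256_file(sample, chunk_size=1024, bytes_per_second=10**6)

print(digest == throttled)
```

Because the hash depends only on the file's bytes, a renamed or moved file produces the same digest, which is what makes hash-based rename detection possible.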
I highly recommend storing batch/bash scripts of your favorite etiquette_cli invocations, so that you can quickly sync the database with the state of the disk in the future. Here are some suggestions for what you might like to include in such a script:
digest: Storing all your digest invocations in a single file makes ingesting new files very easy. For your digests, I recommend including --ratelimit to stop Photos from having the exact same created timestamp, and --hash-bytes-per-second to reduce IO load. In addition, you don't want to forget your favorite --glob-filenames patterns.
reload-metadata: In order for Etiquette's hash-based rename detection to work properly, the file hashes need to be up to date. If you're using Etiquette to track files which may be modified, you may want to get in the habit of reloading metadata regularly. By default, this will only reload metadata for files whose mtime and/or byte size have changed, so it should not be very expensive. You may add --hash-bytes-per-second to reduce IO load.
purge-deleted-files & purge-empty-albums: You should only do this after a digest, because if a file has been moved / renamed you want the digest to pick up on that before purging it as a dead filepath. The Photo purge should come first, so that an album containing entirely deleted photos will be empty when it comes time for the Album purge.
You may notice that Etiquette doesn't have a version number anywhere. That's because I don't think it's ready for one. I am using this project to learn and practice, and breaking changes are very common.
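For the sync scripts recommended earlier, the suggested ordering (digest first, then the Photo purge, then the Album purge) could be captured in a small Python wrapper like the sketch below. Several things are assumptions: the CLI path is a placeholder, the digest arguments are copied from the getting-started example, and I am assuming the two purge commands need no further arguments (check --help). reload-metadata and --hash-bytes-per-second are left out because their exact arguments are not shown in this document.

```python
import subprocess
import sys

# Assumption: replace this with the full path to frontends/etiquette_cli.py on your machine.
CLI = "etiquette_cli.py"


def sync_commands(glob: str = "*.jpg") -> list[list[str]]:
    """Return the CLI invocations in the order they should run."""
    return [
        # Ingest first, so moved or renamed files are recognized before anything is purged.
        [sys.executable, CLI, "digest", ".", "--ratelimit", "1", "--glob-filenames", glob],
        # Photo purge before Album purge, so albums emptied by the first are caught by the second.
        [sys.executable, CLI, "purge-deleted-files"],
        [sys.executable, CLI, "purge-empty-albums"],
    ]


def sync() -> None:
    """Run each step in order, stopping at the first failure."""
    for command in sync_commands():
        subprocess.run(command, check=True)


for command in sync_commands():
    print(" ".join(command[1:]))
```

Calling sync() from the folder that contains your _etiquette database would run the three steps in sequence; check=True makes a failed digest abort the run before any purge happens.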
Here is a brief overview of the project to help you learn your way around:
etiquette
    objects
    photodb
frontends
    etiquette_flask
    etiquette_repl
    etiquette_cli
utilities
Photo.merge to combine duplicate entries.
deleted flag, to make easy restoration possible. Also consider regrouping the children of restored Groupables if those children haven't already been reassigned somewhere else.
photo.get_tags() on each one is not. In order to batch this we would have to have a separate function that fetches a whole bunch of tags and assigns them to the photo object).
Here are some thoughts about the kinds of features that need to exist within the permission system. I don't know how I'll actually manage it just yet. Possibly a permissions table in the database with user_id | permission where permission is some reliably-formatted string.
can_upload
can_tag_own
can_tag_photo:<photo_id>
can_tag
can_edit_album_own
can_edit_album:<album_id>
can_edit_album
can_create_tag
can_delete_tag
can_delete_tag_own
can_delete_tag_in_use
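As a thought experiment, the user_id | permission table idea could look something like the following sketch. This is purely hypothetical and not part of Etiquette: the schema, the helper names (make_db, grant, has_any), and the sample user ids are all invented for illustration, using an in-memory SQLite database.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Create an in-memory database with a user_id | permission table."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE permissions ("
        "user_id TEXT NOT NULL, permission TEXT NOT NULL, "
        "UNIQUE(user_id, permission))"
    )
    return db


def grant(db: sqlite3.Connection, user_id: str, permission: str) -> None:
    """Give a user a permission string; granting it twice is harmless."""
    db.execute("INSERT OR IGNORE INTO permissions VALUES (?, ?)", (user_id, permission))


def has_any(db: sqlite3.Connection, user_id: str, *permissions: str) -> bool:
    """Return True if the user holds at least one of the given permission strings."""
    if not permissions:
        return False
    placeholders = ", ".join("?" for _ in permissions)
    row = db.execute(
        f"SELECT 1 FROM permissions WHERE user_id = ? AND permission IN ({placeholders}) LIMIT 1",
        (user_id, *permissions),
    ).fetchone()
    return row is not None


db = make_db()
grant(db, "alice", "can_edit_album:42")
grant(db, "bob", "can_edit_album")

# A check for album 42 accepts either the blanket permission or the album-specific one.
print(has_any(db, "alice", "can_edit_album", "can_edit_album:42"))
print(has_any(db, "alice", "can_edit_album", "can_edit_album:43"))
```

Keeping the permission as a single reliably-formatted string means a scoped check is just a lookup of two candidate strings, at the cost of needing a strict naming convention.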
https://git.voussoir.net/voussoir/etiquette
https://github.com/voussoir/etiquette