Tagem

A broad family of utilities for organising files based on hierarchical tagging, ranging from a web server to a computer vision dataset creation pipeline.

Project README

tagem

Description

A single page application, with associated command-line utilities, for rapidly categorising and accessing files based on assignable attributes such as (hierarchical) tags, named variables, file sizes, hashes, and audio duration.

Features

  • Supports most common file formats
  • Hashing of local files.
    • Hashes include MD5, SHA256, and DCT (visual hashing of images and video).
    • These hashes can be used in qry to facilitate fast manual de-duplication.
    • Hashing of remote files is planned.
  • Text editor
    • More of a text creator at the moment, as editing existing files is currently restricted.
  • Ordering, filtering etc. of results in the tables on the page.
  • qry: A simple query language that allows for short and human-friendly queries that automatically translate to complex SQL queries
    • Combine ANDs and ORs (intersections and unions) of many different filters (for attributes like size, views, likes, tags; hashes in common with other files; etc).
    • It can search for all types of things, not just files but also the tags themselves.
    • See the full documentation.
  • Hierarchical tags
    • Any tag can have any number of parent tags and any number of child tags.
  • Everything can be tagged
    • Eras, files, directories, devices, and even tags themselves (as parent tags)
    • For instance, the directory https://www.youtube.com/watch?v= could be tagged Video, and that tag will be applied to all files within.
  • Support for remote files
    • Remote files are as accessible as local files (except for some sites that tell the browser not to display them within iframes - though there's a relatively simple workaround for that).
    • You can add files from the server's attached storage devices, and also from remote websites (including an option for downloading with youtube-dl). Local copies of remote files are treated as backups, and are listed on the remote file's page.
    • With the view filesystem option, this means that - provided the server has access to a script written for the specific website - a website's contents could be easily viewable in the table view.
  • Eras
    • Tagged time intervals of audio and video files.
    • These can be searched for, and used in playlists interchangeably with files themselves.
    • Eras can be downloaded (from a local file or remote URL) into their own file.
      • NOTE: This currently requires ffmpeg to be installed alongside this server. However, it will eventually be combined into the server itself.
  • Playlists
    • Playlists can be created on the fly out of any selection of files and/or eras (in any combination).
  • Support for other databases
    • Files can be associated with posts from other databases, so long as those databases follow a strict structure.
    • For instance, a Reddit post could be scraped and associated with the URL of the linked article.
    • Each external database can, if it includes the necessary tables, display a lot more information than just the comments under a post, even listing all the posts (translated to our files) that a single user has commented on.
    • An example script for scraping Reddit posts is included in this project
  • Tag thumbnails
    • These thumbnails are inherited from their parents, unless the child has a thumbnail of its own.
  • file2 values
    • Files can be assigned arbitrary values, currently integers and datetimes.
    • For instance, you could have a Score attribute for each user to assign to files.
  • Permissions system
    • Different users can be assigned different blocklists of tags, and will not be able to view any era/file/directory/device with such a tag, or a descendant of such a tag.
    • Different users can have different allowed actions, such as viewing files, editing tags, creating eras, assigning tags, and adding files.
    • A big caveat here is that the login system is currently only a placeholder - it does not yet even ask for a password.
  • Low footprint
    • Almost all executed JavaScript was written by hand - only one 3rd party library is loaded
    • On Firefox, each page consumes 10-15MB - comparable to a Google search results page
    • The CSS is designed to avoid unnecessarily moving parts
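
The hierarchical tag and blocklist rules listed above can be pictured with a small in-memory sketch. This is not the project's implementation (the server keeps its tags in an SQL database); the names `TagGraph`, `add_parent`, `descendants` and `is_visible`, and the example tags, are hypothetical and exist only for illustration.

```python
class TagGraph:
    """Toy tag hierarchy: any tag may have many parents and many children."""

    def __init__(self):
        # Maps a parent tag to the set of its direct child tags.
        self.children = {}

    def add_parent(self, child, parent):
        self.children.setdefault(parent, set()).add(child)

    def descendants(self, tag):
        """Return the tag itself plus every tag reachable through child links."""
        seen = set()
        stack = [tag]
        while stack:
            current = stack.pop()
            if current in seen:
                continue  # already visited via another parent
            seen.add(current)
            stack.extend(self.children.get(current, ()))
        return seen


def is_visible(graph, item_tags, blocklist):
    """An item is hidden if it carries a blocked tag or a descendant of one."""
    blocked = set()
    for tag in blocklist:
        blocked |= graph.descendants(tag)
    return blocked.isdisjoint(item_tags)


graph = TagGraph()
graph.add_parent("Music Video", "Video")
graph.add_parent("Music Video", "Music")   # a tag with two parents
graph.add_parent("Live Concert", "Music Video")
```

With this graph, a user whose blocklist contains Music would not see a file tagged Live Concert, because that tag descends from Music through Music Video.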

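As a rough illustration of how file hashes help with de-duplication, the sketch below groups local files by their SHA256 digest using only the Python standard library. It is an assumption-laden example, not the project's own hashing code or its qry syntax; the helper names `sha256_of` and `duplicate_groups` are hypothetical.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large files need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def duplicate_groups(paths):
    """Return lists of paths whose contents share the same SHA256 digest."""
    by_hash = defaultdict(list)
    for path in paths:
        by_hash[sha256_of(path)].append(Path(path))
    # Only digests seen more than once indicate candidate duplicates.
    return [group for group in by_hash.values() if len(group) > 1]
```

Each returned group is a set of candidates for manual review. A cryptographic hash only catches byte-identical copies, which is why a visual hash such as DCT is listed separately for images and video.
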
Demonstration

A cut-down version of this app is hosted here. GitHub does not allow it to be interactive, so most features are disabled - it is basically just a demonstration of the front-end.

A sample of features in the demo:

  • The ability to create and view playlists of 'eras'.
  • If you "log in", you can view an example administrator dashboard.

See the user guide (linked below) for some examples of features.

User Guide

For those using the web app

See USER_GUIDE.md

Server Admin Guide

For those running the server

See ADMIN_GUIDE.md

See ADMIN_ADVANCED_USAGE.md for more complex features and use cases.

Installation

Server

See INSTALL.md.

Scripts

You'll probably want to add the scripts directory to your PATH environment variable, or perhaps just copy the scripts to /usr/local/bin.

The Reddit userscript can be added the usual way you add userscripts.

Roadmap

See the list of officially planned features.

Contributing

I'm very happy to consider pull requests, particularly (but definitely not limited to) front-end development.

Translations of the documentation are of course also welcome, alongside bug reports and general feedback.

Back End

See COMPILING.md, CONTRIBUTING.md, and DESIGN_DECISIONS.md.

Front End

See CONTRIBUTING.md, and FRONTEND.md.

Stats

See STATS.md for some Git contribution graphs.

Background

If you feel like there aren't enough blogs on the internet, here's another. It's a look at how this project evolved from some unlikely decisions, as I'm personally interested in how the Butterfly Effect occurs in software development.

FAQ

What files does it support?

It should support viewing any kind of file that your browser does. In practice, detecting the file type can be an issue: detection is fully accurate for files the server downloads and pretty good for video and audio, but if you rename a PNG file to a JPEG, there is currently no way for it to tell that the file is actually a PNG.
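
For context, one common way to catch a misnamed image is to compare the extension against the file's leading "magic" bytes. The sketch below is not something the project currently does; it only illustrates the idea, and the names `sniff_image_type` and `looks_misnamed` are hypothetical. The PNG, JPEG and GIF signatures used are the standard ones.

```python
from typing import Optional

# Leading byte signatures for a few common image formats.
SIGNATURES = (
    (b"\x89PNG\r\n\x1a\n", "png"),
    (b"\xff\xd8\xff", "jpeg"),
    (b"GIF87a", "gif"),
    (b"GIF89a", "gif"),
)

# Extensions that conventionally map to each detected type.
EXTENSIONS = {"png": {".png"}, "jpeg": {".jpg", ".jpeg"}, "gif": {".gif"}}


def sniff_image_type(data: bytes) -> Optional[str]:
    """Guess the image type from the first bytes, or None if unrecognised."""
    for signature, kind in SIGNATURES:
        if data.startswith(signature):
            return kind
    return None


def looks_misnamed(filename: str, data: bytes) -> bool:
    """True when the content is a known image type that the extension contradicts."""
    kind = sniff_image_type(data)
    if kind is None:
        return False  # unknown content: nothing to compare against
    lowered = filename.lower()
    return not any(lowered.endswith(ext) for ext in EXTENSIONS[kind])
```

Only the first few bytes of a file are needed, so a check like this is cheap for local files; for remote files it would require fetching at least the start of the content.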

Similar Projects

  • Etiquette
    • The most similar project, as it also features hierarchical tagging
    • Flask + SQLite app written in Python
  • Video Hub App
    • For managing video files
    • Electron app written in Typescript

License

This code is licensed only under the GPL-3 license.

Open Source Agenda is not affiliated with "Tagem" Project. README Source: NotCompsky/tagem
Stars: 95
Open Issues: 4
Last Commit: 3 years ago