Tagging files and folders using hashtags and symlinks

November 18, 2012 - 4 min read
Old and archived

This post is old and archived but here for historical reasons. The taggo project is now for python 3 only, and there have been a rewrite. I do somewhat still (in 2021) update it from time to time. It still solves a real problem for me.

There is lots of tools out there that let you organise files (specially your picture archive). However, they are all depending on some sort of database, one master computer to add the tags from, and you can't browse the organised files in their organised structure from all devices.

I made this project because I had this exact problem organising my own pictures. I wanted something which:

  • You could tag pictures as close to where you look at them (IE, the file browser itself)
  • Is platform independent.
    • Like real platform independent! I wanted to browse this tags on my TV!
  • Not another thing to backup. I am already backing up the pictures themselves.
  • Support organising whole folders, not just single files.

There is probably many more than myself that is annoyed by this problem, therefor I will share my solution, which is a python script that goes trough all the files in a directory, looks at the filenames and looks for hashtags. This is put into my storage NAS's crontab and runs every hour.

Example

File structure

xeor@omi { ~/Documents/my_pictures }$ find .
.
./2012 #Business trip to #USA
./2012 #Business trip to #USA/dcim0123 #People-Lars.jpg
./2012 #Business trip to #USA/dcim0124.jpg
./2012 #Business trip to #USA/dcim0125.jpg
./2012 #Business trip to #USA/dcim0126 #Conference.jpg

Running taggo

taggo run_once

Tags created

xeor@omi { ~/Documents/tags }$ find . 
.
./Business
./Business/root - 2012 #Business trip to #USA
./Conference
./Conference/2012 #Business trip to #USA - dcim0126 #Conference.jpg
./People
./People/Lars
./People/Lars/2012 #Business trip to #USA - dcim0123 #People-Lars.jpg
./USA
./USA/root - 2012 #Business trip to #USA

Explaination

As you can see on the file structure, we created one folder and 4 files. The folder itself 2012 #Business trip to #USA have two tags, #Business and #USA (as you probably already knew :) ) The dcim0123 file have a tag like #People-Lars, which means that taggo should threat it as a sub tag.

The list of tags created is now just a bunch of symlinks to the original files. ./USA/root - 2012 #Business trip to #USA is a link to the folder called 2012 #Business trip to #USA, the same with ./Business/root - 2012 #Business trip to #USA. For our sub tag, you can see that it is in the directory People/Lars; ./People/Lars/2012 #Business trip to #USA - dcim0123 #People-Lars.jpg.

Configuration

In the file called taggo.cfg you can define stuff like tag indicator (the hashtag), sub tag separator, what filename the symlinked filenames should get (default is %(rel_folders)s - %(basename)s), what to replace / with in tag filenames, content folder and tag folder.

Taggo will automatically create the taggo.cfg file when you run it the first time. (Just do a ./taggo)

Usage

Using taggo is simple, just put it in any directory and put something like 22 * * /usr/bin/python /path/to/taggo run_once in the crontab. It will make sure that new symlinks is created.

If you rename a file, the symlink will die. But when you use the run_once parameter, it will automatically delete the invalid symlinks. I have been very careful when creating the delete function. It will only delete symlinks where the paths they point to does not exists. and to delete the empty directories, we are using os.rmdir, which is a python function that is made to delete empty directories only.

To find and use the project, check out the Github link at the top of this article.