Buku - How to bookmark using the command line like a boss

Intro

Buku is an awesome command line bookmarks manager. Let’s face it: GUIs, especially for bookmarking, are a joke. Clicking with the mouse? Laughable. They make it difficult to search, filter, merge, sync, extract, etc. GUIs are second class almost every time when it comes to working with plain text.

In this post we’re going to make buku even more awesome by:

Using fzf to fuzzy find existing tags and add them to new bookmarks with a few keystrokes. This helps prevent adding tags that are near-duplicates of existing ones, and it also makes bookmarking with Termux on Android easy.
Removing the buku auto tagging feature, which was introduced some time ago and which I hate. It’s not more convenient; it just adds clutter.

Buku easy tagging with fzf

#!/usr/bin/env sh

tagsFile=/tmp/tags.txt

if [ $# -ne 1 ]; then
    # Keep $0 out of the format string and report the misuse on stderr.
    printf 'Usage: %s [url]\n' "$0" >&2
    exit 1
fi

# Start every run with an empty tags file (this also creates it if missing).
: > "$tagsFile"

listTags=$(buku --np --stag | 
    sed 's/^\s.*[0-9].*\. \(.*\) (.*$/\1/')
printf %s "$listTags" | 
    fzf --bind "enter:execute([[ -n {} ]] && printf {}, >> '$tagsFile' || printf {q}, >> '$tagsFile'),enter:+clear-query"

buku -a "$1" "$(cat "$tagsFile")"

How it works

Fzf can be somewhat cryptic with the way it works. I need it to repeatedly allow selecting from the list of tags, while also allowing new tags to be added. I wasn’t able to get fzf to output tags to stdout within the script, so instead I saved them to a temporary file that gets cleared on each run.

We first bind the enter key to the execute action, which checks whether {}, the currently highlighted tag, is non-empty. If it is, printf appends {} to the tags file. Otherwise nothing matched the input, so printf appends {q}, the query exactly as typed (a new tag), to the tags file instead.

We can attach additional actions to the same key: prefixing an action with + appends it to the existing binding, so enter:+clear-query clears the input after every entry.
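
To make the branch inside execute(...) easier to follow, here is a rough plain-shell equivalent. The function name record_tag and its three arguments are made up for illustration; the first two stand in for fzf's {} and {q} placeholders, and this is a sketch of the logic rather than something fzf itself runs.

```sh
#!/usr/bin/env sh

# record_tag SELECTION QUERY FILE   (hypothetical helper, illustration only)
# SELECTION plays the role of fzf's {}  (the highlighted line, empty if no match)
# QUERY     plays the role of fzf's {q} (whatever was typed)
record_tag() {
    if [ -n "$1" ]; then
        # An existing tag is highlighted: store it.
        printf '%s,' "$1" >> "$3"
    else
        # Nothing matched: store the typed text as a new tag.
        printf '%s,' "$2" >> "$3"
    fi
}
```

Calling record_tag twice, once with a selection and once with only a query, leaves a comma-terminated list in the file, which is the shape the script hands to buku -a.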

Sed and regex

sed 's/^\s.*[0-9].*\. \(.*\) (.*$/\1/'

As for the sed command and what it’s doing, I’ll give a quick rundown. If you don’t know regular expressions, learn them; they’re amazing.

I use the buku command to list all tags (stdout, no prompt), which outputs tags in the following format:

[index] [tag] [count]
  1. tag1 (10)
  2. tag2 (2)
  3. tag3 (1)

We need to extract only the tag itself:

's/^\s.*[0-9].*\. \(.*\) (.*$/\1/'

Substitute, starting at the beginning of the line with a whitespace character followed by anything - s/^\s.*
A digit, then anything after it - [0-9].*
A literal [dot][space], which marks the end of the index - \.[space]
Remember this (the tag) up until [space]( - \(.*\) (
Everything until the end - .*$
Finally, \1 replaces everything with the tag which we remembered - /\1/
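
To see the substitution in action without touching a real database, the sample listing can be piped through sed directly. The sketch below uses a slightly stricter, POSIX-only restatement of the expression ([[:space:]] and [0-9]* instead of \s.* and [0-9].*); that variant and the function name extract_tags are my assumptions, not the script's exact code.

```sh
#!/usr/bin/env sh

# extract_tags: read "  N. tag (count)" lines on stdin, print only the tag.
# Hypothetical helper name; the regex is a stricter variant of the one above.
extract_tags() {
    sed 's/^[[:space:]]*[0-9]*\. \(.*\) (.*$/\1/'
}

# Feed it the sample listing from above.
printf '  1. tag1 (10)\n  2. tag2 (2)\n  3. tag3 (1)\n' | extract_tags
```

Because the capture group is greedy, it stops at the last [space]( on the line, so a tag containing spaces still comes through whole.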

Improvements

We could query the buku database directly with sqlite, which is faster than going through buku itself. But that’s for another time.
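
As a rough idea of what that could look like, here is a sketch. It assumes the database lives in the default Linux location, that the table is named bookmarks, and that its tags column holds one comma-delimited string per bookmark; check your own setup before relying on any of that. BUKU_DB is a hypothetical override variable, and the function names are made up.

```sh
#!/usr/bin/env sh

# Assumed default database path; BUKU_DB is a hypothetical override.
db=${BUKU_DB:-"${XDG_DATA_HOME:-$HOME/.local/share}/buku/bookmarks.db"}

# split_tags: turn comma-delimited tag strings (one per line on stdin)
# into a sorted list of unique tags, one per line.
split_tags() {
    tr ',' '\n' | sed '/^$/d' | sort -u
}

# list_tags_sqlite: read the tags column directly (assumed schema) and split it.
list_tags_sqlite() {
    sqlite3 "$db" 'SELECT tags FROM bookmarks;' | split_tags
}

# Usage: list_tags_sqlite | fzf
```

The output is already one tag per line, so no sed cleanup is needed before handing it to fzf.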

Patching Buku to remove auto tagging

Buku is a single-file Python script of roughly 4k lines. A while ago auto tagging was added, which fetches tags from the webpage; these are often just spammed keywords that are useless. I started by searching for the tags keyword and found something interesting in the add_rec function: a boolean option, tags_fetch, that was set to True. I changed it to False, ran the script and, of course, the option did nothing (the maintainer said they won’t add an option to disable auto tagging; maybe it’s in the works??). Within the function I found more “tags” keywords, one being tags_fetched. Specifically this line:

tags = taglist_str((tags_in or '') + DELIM + tags_fetched,

I removed + tags_fetched and Boom! It works! No more auto fetching tags and cluttering up my search results!

This may break some obscure feature, but adding, deleting, searching, and most other things I tried still work.
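
For a quick experiment without a patch file, the same one-line change can be sketched with sed. This assumes the line appears in your copy of buku exactly as quoted above; strip_fetched_tags is a made-up name, and it writes to stdout so the original file stays untouched.

```sh
#!/usr/bin/env sh

# strip_fetched_tags: drop "+ tags_fetched" from the taglist_str call.
# Hypothetical helper; reads the script on stdin, writes the result to stdout.
strip_fetched_tags() {
    sed 's/ + DELIM + tags_fetched,/ + DELIM,/'
}

# Usage: strip_fetched_tags < buku > buku.patched
```

Diffing buku against buku.patched afterwards is a cheap way to confirm that only the intended line changed.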

Here’s the patch, if you want it, against the latest git (as of this post):

diff --git a/buku b/buku
index 2714636..9cf9942 100755
--- a/buku
+++ b/buku
@@ -856,7 +856,7 @@ class BukuDb:
         # Fix up tags, if broken
         tags_exclude = set(taglist((tags_except or '').split(DELIM)))
         tags_fetched = result.tags(keywords=tags_fetch, redirect=tag_redirect, error=tag_error)
-        tags = taglist_str((tags_in or '') + DELIM + tags_fetched,
+        tags = taglist_str((tags_in or '') + DELIM,
                            lambda ss: [s for s in ss if s not in tags_exclude])
         LOGDBG('tags: [%s]', tags)

The latest release (4.9) is only slightly different:

--- ../buku     2024-09-29 13:23:55.095078025 -0500
+++ buku        2024-09-29 13:25:11.738714060 -0500
@@ -765,7 +765,7 @@
         title = (title_in if title_in is not None else result.title)

         # Fix up tags, if broken
-        tags_in = taglist_str((tags_in or '') + DELIM + result.tags(redirect=tag_redirect, error=tag_error))
+        tags_in = taglist_str((tags_in or '') + DELIM)

         # Process description
         desc = (desc if desc is not None else result.desc) or ''

Final thoughts

How about adding this script in a tmux popup? Or maybe just a terminal popup in a window manager at the press of a hotkey? I always have a scratchpad running tmux. A hotkey brings it to the front, launches the bukuadd script and it’s over and out!

This is also a great way of bookmarking, either locally or remotely, on Android with Termux. By sharing the URL to Termux, we can have this script launch, easily enter any tags we’d like and move on. A simple way to take your bookmarks with you everywhere.
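
One way to wire that up is a sketch like the following. As far as I know, Termux runs an executable at ~/bin/termux-url-opener with the shared URL as its first argument. The sketch assumes the tagging script from earlier is on PATH as bukuadd; BUKUADD is a hypothetical override and handle_shared_url a made-up name.

```sh
#!/usr/bin/env sh
# Save as ~/bin/termux-url-opener and make it executable (chmod +x).

# handle_shared_url URL: pass the shared URL to the tagging script.
# BUKUADD is a hypothetical override; the default assumes "bukuadd" is on PATH.
handle_shared_url() {
    "${BUKUADD:-bukuadd}" "$1"
}

# Termux passes the shared URL as the first argument.
if [ "$#" -ge 1 ]; then
    handle_shared_url "$1"
fi
```

Keeping the call inside a small function makes it easy to branch on the URL later, for example to send some links to a different script.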

I’ll be going over Termux more in another post covering a script that will easily add bookmarks, clone git repos, download videos or audio from YouTube, archive forum posts / news articles or anything you can think of with just a few key presses.