Systemd instead of Cron or Simple Queue

My site displays running data from Strava’s API. Like any static site with dynamic data, I need to fetch updates and rebuild periodically. Here’s how that requirement led me to a TIL about systemd features I didn’t know existed.

The Problem

My site shows my running stats, all pulled from Strava’s API. I have a script (let’s call it /usr/local/bin/strava_download) that fetches the data, and since I’m using a Static Site Generator, I need to rebuild the site (let’s call that one /usr/local/bin/build_site) after each update.

The challenge: keeping the data fresh without being wasteful. Rebuild too often and I’m burning CPU for nothing. Rebuild too rarely and my stats are stale.

First Try: Cron

My initial instinct was to reach for cron:

0 * * * * /usr/local/bin/strava_download && /usr/local/bin/build_site

Hourly updates. Works fine, technically acceptable, but it felt crude. Too brittle, too wasteful, and not very elegant.
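
One example of the brittleness: cron has no built-in overlap protection, so if a run ever takes longer than an hour, the next one starts on top of it. Guarding against that means wrapping the job in flock by hand, something like:

0 * * * * /usr/bin/flock -n /tmp/strava-site.lock -c '/usr/local/bin/strava_download && /usr/local/bin/build_site'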

Second Try: Systemd Timers

Turns out systemd can do the same thing, but with better control and monitoring capabilities.

Created a service unit:

sudo nvim /etc/systemd/system/strava-site.service
[Unit]
Description=Run Strava sync and site deploy
After=network-online.target
Wants=network-online.target

[Service]
Type=oneshot
User=fetsh
# Before starting, systemd creates /run/strava-site with the given mode, owned by User=. Removed when the service stops. We use this for the lock file.
RuntimeDirectory=strava-site
RuntimeDirectoryMode=0755

# Single, non-blocking lock for the entire sequence. flock -n exits immediately with failure if the lock is already held (no waiting). This prevents overlap if the job is triggered again while it is still running.
ExecStart=/usr/bin/flock -n /run/strava-site/lock \
  /bin/bash -c '/usr/local/bin/strava_download && /usr/local/bin/build_site'

Nice=10
IOSchedulingClass=best-effort
IOSchedulingPriority=7
PrivateTmp=true
ProtectSystem=full
NoNewPrivileges=true

[Install]
WantedBy=multi-user.target
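
With the unit in place, it can be exercised once by hand before any scheduling is wired up:

sudo systemctl daemon-reload
sudo systemctl start strava-site.service
systemctl status strava-site.service
journalctl -u strava-site.service --no-pager -n 20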

And a timer unit:

sudo nvim /etc/systemd/system/strava-site.timer

For wall-clock scheduling:

[Unit]
Description=Hourly trigger for our site

[Timer]
OnCalendar=hourly
Persistent=true
AccuracySec=5m
RandomizedDelaySec=60

[Install]
WantedBy=timers.target
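
Calendar expressions can be sanity-checked with systemd-analyze, which prints the normalized form and the next elapse time (hourly normalizes to *-*-* *:00:00, i.e. on the hour, matching the cron line above):

systemd-analyze calendar hourly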

Or for relative scheduling (roughly one hour after the previous run):

[Unit]
Description=Hourly trigger for our site

[Timer]
OnBootSec=10min
OnUnitActiveSec=1h
# Persistent= is omitted here: it only applies to OnCalendar= timers
Unit=strava-site.service

[Install]
WantedBy=timers.target

Activate it:

sudo systemctl daemon-reload
sudo systemctl enable --now strava-site.timer
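
Once enabled, systemctl shows when the timer last fired and when the next run is due:

systemctl list-timers strava-site.timer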

This was better than cron, but still fundamentally flawed. I was rebuilding my site 24 times a day when I typically only run once. Plus, Strava explicitly recommends using webhooks instead of polling their API.

Third Try: Systemd as a Message Queue

Here’s where it gets interesting. Strava webhooks expect an immediate response, with actual processing happening asynchronously. Normally, this means setting up some queueing system, which for me usually means Sidekiq (with Redis). But I didn’t want to set up all that infrastructure for such a simple task. What if systemd could be the queue?

I wrote a small Roda application to receive webhooks and save each update as a JSON file in /var/lib/strava/webhook_flags. Then I let systemd handle the rest.
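
The Roda app itself isn’t the interesting part: conceptually, each incoming webhook is just written out as a flag file whose name matches the glob the path unit watches. In shell terms it amounts to something like this (the JSON body is a made-up stand-in; Strava sends more fields):

mkdir -p /var/lib/strava/webhook_flags
printf '%s\n' '{"aspect_type":"update","object_id":123456}' \
  > "/var/lib/strava/webhook_flags/$(date +%s%N)-update.json"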

The Path Unit (Watcher)

sudo nvim /etc/systemd/system/strava-webhook-update.path
[Unit]
Description=Trigger Strava webhook drain when new event files exist

[Path]
# Activates the service whenever a matching JSON file exists; if files remain after the service finishes, it triggers again
PathExistsGlob=/var/lib/strava/webhook_flags/*-update.json
Unit=strava-webhook-update.service

[Install]
WantedBy=multi-user.target
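
Assuming /var/lib/strava/webhook_flags already exists, the wiring can be tested by dropping in a dummy flag and watching the service fire (this kicks off the real pipeline, so it doubles as an end-to-end test):

echo '{}' | sudo tee /var/lib/strava/webhook_flags/manual-test-update.json
journalctl -u strava-webhook-update.service -f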

The Service Unit (Processor)

This handles the actual processing:

sudo nvim /etc/systemd/system/strava-webhook-update.service
[Unit]
Description=Start Strava site pipeline on update flag

[Service]
Type=oneshot

ExecStartPre=/usr/bin/install -d -m 0755 -o fetsh -g strava-data /var/lib/strava/webhook_flags_done
# Start the main job (protected by flock in the existing unit)
ExecStart=/bin/systemctl start strava-site.service
# After successful start, move processed flags to archive
ExecStartPost=/bin/bash -c 'set -euo pipefail; shopt -s nullglob; for f in /var/lib/strava/webhook_flags/*-update.json; do mv "$f" /var/lib/strava/webhook_flags_done/; done'

The current setup moves webhook files to the “done” folder regardless of whether strava-site.service succeeds. For my use case, this is acceptable.

Activation and Cleanup

Enable the webhook system:

sudo systemctl daemon-reload
sudo systemctl enable --now strava-webhook-update.path

Kill the old timer:

sudo systemctl stop strava-site.timer
sudo systemctl disable strava-site.timer
sudo systemctl daemon-reload

Room for Improvement

Right now my strava-webhook-update.service just calls strava-site.service because that service already existed. Technically, it isn’t really a queue; it’s a trigger-plus-worker setup. The .path unit only wakes up a service when a new file appears, but doesn’t guarantee reliable delivery, ordering, or acknowledgment after success.

To make it behave like a real queue, these services need to be combined into a single one that takes a lock, drains all pending files, processes them, and only after successful completion moves them to a “done” directory. That way, failures leave unprocessed messages in place for the next run: proper at-least-once delivery.
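
A minimal sketch of what that combined drain step could look like, assuming it lives in something like /usr/local/bin/strava_drain and is called from the service’s ExecStart (the script name and layout are mine; the paths match the units above):

#!/usr/bin/env bash
# Hypothetical drain script: process every pending flag, archive only on success.
# The unit would still wrap this in flock -n, just like strava-site.service does.
set -euo pipefail
shopt -s nullglob

flags=(/var/lib/strava/webhook_flags/*-update.json)
(( ${#flags[@]} )) || exit 0   # nothing queued, nothing to do

/usr/local/bin/strava_download
/usr/local/bin/build_site

# Reached only if both steps succeeded: acknowledge the events we saw at the start.
mkdir -p /var/lib/strava/webhook_flags_done
mv -- "${flags[@]}" /var/lib/strava/webhook_flags_done/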

Also, PathExistsGlob might not be the best directive for the path unit. Maybe DirectoryNotEmpty would be more appropriate.
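
That change would be confined to the [Path] section, something like:

[Path]
# Triggers whenever the directory contains anything, regardless of file name
DirectoryNotEmpty=/var/lib/strava/webhook_flags
Unit=strava-webhook-update.service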

What I’ve Learned

  1. Systemd timers can do everything my cron job did, with better integration and monitoring
  2. Path units can watch for filesystem changes, effectively turning directories into message queues
  3. Flock provides simple but effective concurrency control

I don’t think this is something for high-volume or critical systems, of course, but for a personal site updating running stats, systemd is more than enough: it’s reliable, observable (journalctl, systemctl status), and doesn’t require extra daemons or dependencies.