@dsl400
Created February 10, 2026 20:37
# pg_cron worker running every second
## The Forbidden Fruit
---
```sql
SELECT cron.schedule('worker', '1 seconds', 'CALL background_worker()');
```
You type this into your editor and Uncle Bob appears behind you like a disappointed ghost. "Every second?"
Blasphemy. A query every second, 86,400 times a day, hammering your database into the ground. Except... it doesn't. Here's why.
The typical setup feels like hiring someone to check an empty mailbox every second. They walk over, open it, find nothing, write down "nothing", walk back, and do it again. Every tick: a new connection (or a background worker wake-up), a query against `cron.job_run_details`, and a write to the job log. Multiply that by 86,400 seconds in a day and you've got a diligent employee who's done a full shift of absolutely nothing useful.
The first instinct is usually to use a trigger and skip the polling entirely. Reacting to modifications sounds way more efficient than polling every second, but it comes with some serious caveats.
```sql
CREATE OR REPLACE FUNCTION on_insert() RETURNS trigger AS $$
BEGIN
  -- do work here
  RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER trg AFTER INSERT ON job_queue
  FOR EACH ROW EXECUTE FUNCTION on_insert();
```
Clean. Elegant. And potentially catastrophic.
A trigger runs **inside the caller's transaction**. Whatever your trigger does, the original query has to wait for it. If your trigger takes 50ms, every query now takes 50ms longer. If your trigger calls an external service, touches another table, or, god forbid, does a `FOR UPDATE` on a hot row, you've just welded your write path to your processing path.
It gets worse. If the trigger throws an error, **the original insert rolls back**. Your application wanted to write a row to a queue. Instead it got a 500 because some downstream processing logic had a bug. The data never made it to the table.
And it gets worse still. Under load, trigger overhead compounds. Ten concurrent inserts, each waiting on trigger execution, each holding row locks longer than necessary. What was a simple append-only queue is now a contention point. I've seen well-meaning `AFTER INSERT` triggers turn a table that handled 10,000 inserts/second into one that struggled with 500.
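If you suspect a trigger is creating this kind of contention, a quick look at lock waits can confirm it. This is a sketch using the standard `pg_stat_activity` view; column names are stock PostgreSQL (9.6+):

```sql
-- Sessions currently stuck waiting on a lock, and what they're running.
-- Under trigger-induced contention you'd see inserts piling up here.
SELECT pid, wait_event_type, wait_event, state, query
FROM pg_stat_activity
WHERE wait_event_type = 'Lock';
```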
So triggers are actually worse than the mindless mailbox worker. Now you have to debug weird locks and work around scenarios you never thought of. But there's a middle ground. A pattern that's not perfect, but surprisingly effective for what it is.
---
## What we actually want
Let's say you're building a todo app. Users mark tasks as done, and you need some background work to happen, send a notification, update a summary, sync to another service, whatever. You want it to feel snappy. Within a second of the user clicking "done", the background work should kick in.
You also want to keep things simple. No Redis. No separate worker service. No extra infrastructure to deploy and monitor. Just PostgreSQL doing its thing.
Going back to our mailbox worker, the naive pg_cron approach looks like this:
```sql
SELECT cron.schedule('worker', '1 seconds', $$
  UPDATE todos SET notified = true
  WHERE id IN (
    SELECT id FROM todos
    WHERE done = true AND notified = false
    ORDER BY id LIMIT 100
    FOR UPDATE SKIP LOCKED
  )
$$);
```
(Note: the `'N seconds'` syntax requires **pg_cron v1.5.0+**, released February 2023; the valid range is 1–59 seconds.)
This works. But it's the same restless employee from before. Every second, pg_cron sends them to check. You can make their trip shorter with `cron.use_background_workers = on` (reuse a background worker instead of hailing a cab each time). You can tell them to stop filing reports with `cron.log_run = off`. But they still make the trip. Every. Single. Second. Even at 3 AM when nobody is using your app.
What you really want is an employee who sits next to the mailbox, glances at it periodically, and only stands up when there's something inside.
---
## The pattern: teach the worker to nap
Instead of pg_cron poking the worker every second, you let the worker manage itself. pg_cron launches a procedure. That procedure checks for work. If there's nothing to do, it takes a nap with `pg_sleep()`. When something shows up, it handles it and clocks out. pg_cron notices the shift is over and sends the worker back in.
```sql
CREATE OR REPLACE PROCEDURE background_worker() AS $$
DECLARE
  _deadline timestamptz := clock_timestamp() + interval '5 minutes';
BEGIN
  LOOP
    -- Shift's over. Clock out and let pg_cron send us back in.
    EXIT WHEN clock_timestamp() >= _deadline;

    -- Glance at the mailbox
    IF NOT EXISTS (SELECT 1 FROM todos WHERE done = true AND notified = false) THEN
      -- Nothing there. Take a nap.
      PERFORM pg_sleep(1);
      CONTINUE;
    END IF;

    -- Something's here. Do the work.
    UPDATE todos SET notified = true
    WHERE id IN (
      SELECT id FROM todos
      WHERE done = true AND notified = false
      ORDER BY id LIMIT 100
      FOR UPDATE SKIP LOCKED
    );

    -- Done. Clock out.
    EXIT;
  END LOOP;
END;
$$ LANGUAGE plpgsql;
```
Schedule it:
```sql
SELECT cron.schedule('bg-worker', '1 seconds', 'CALL background_worker()');
```
That's it. Your worker now naps when idle and wakes up when there's something to do.
---
## Why this actually works
Here's the trick. pg_cron has a simple rule: it won't start a second copy of a job if the first one is still running. From the [documentation](https://github.com/citusdata/pg_cron):
> *pg_cron runs at most one instance of a job at a time. If a second run is supposed to start before the first one finishes, then the second run is queued and started as soon as the first run completes.*
Your procedure is always "running" as far as pg_cron is concerned. It's either napping or working. pg_cron sees it as busy and doesn't send another worker. No duplicate connections. No wasted resources. The 1-second schedule is just a safety net. If the procedure ever crashes, pg_cron restarts it within a second.
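You can watch this happening from another session. The job name `bg-worker` matches the schedule above; `cron.job` is pg_cron's metadata table and `pg_stat_activity` is stock PostgreSQL:

```sql
-- The registered job: one row, one schedule, one command.
SELECT jobid, jobname, schedule, command, active
FROM cron.job
WHERE jobname = 'bg-worker';

-- While the procedure naps, its session is still "busy" here:
-- pg_sleep keeps the backend active inside the CALL, which is
-- exactly why pg_cron refuses to start a second copy.
SELECT pid, state, xact_start, query
FROM pg_stat_activity
WHERE query LIKE 'CALL background_worker%';
```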
Think of it this way:
**Nobody's using your app:** The worker is napping next to the mailbox. Every second it opens one eye, sees nothing, goes back to sleep. One sleeping process. Practically zero cost.
**A user marks a todo as done:** On the next wake-up (within 1 second), the worker sees the new row, processes it, and clocks out. pg_cron sends it right back in on the next tick, and it starts napping again.
**Something goes wrong:** The procedure crashes. pg_cron doesn't care why, it just sends a new worker on the next tick. Automatic recovery, no babysitting required.
---
## Making it work well
A few things to get right so this pattern runs smoothly.
**Give the worker a fast way to check the mailbox.** That `EXISTS` check needs to be quick. Add a partial index so PostgreSQL only has to look at rows that actually need processing:
```sql
CREATE INDEX idx_todos_pending ON todos (id) WHERE done = true AND notified = false;
```
When there's nothing to do, this index is empty. Checking it takes microseconds.
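You can verify the check is actually using the index. Exact plan output varies by PostgreSQL version and table statistics, but `idx_todos_pending` should appear in the plan:

```sql
-- Confirm the pending-work check hits the partial index
-- rather than scanning the whole todos table.
EXPLAIN SELECT 1 FROM todos
WHERE done = true AND notified = false
LIMIT 1;
```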
**Tell pg_cron to be quiet.** Two settings in `postgresql.conf`:
```ini
cron.use_background_workers = on
cron.log_run = off
```
The first one tells pg_cron to reuse a background worker instead of opening a new database connection every time. The second one stops it from writing a log entry for every single execution. Without this, you'd end up with millions of "I ran and nothing happened" rows in `cron.job_run_details`.
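On managed services where you can't edit `postgresql.conf` directly, the same settings can often be applied with `ALTER SYSTEM` (superuser or equivalent privileges required; this is a sketch, and note the caveat in the comment):

```sql
ALTER SYSTEM SET cron.log_run = off;
-- cron.use_background_workers changes how pg_cron launches jobs,
-- so expect it to need a server restart rather than just a reload.
ALTER SYSTEM SET cron.use_background_workers = on;
SELECT pg_reload_conf();
```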
**Use COMMIT if you need other queries to see the changes right away.** By default, everything the procedure does lives inside one big transaction. If you want processed rows to be visible immediately (say, your API is polling for the notification status), add `COMMIT` calls:
```sql
CREATE OR REPLACE PROCEDURE background_worker() AS $$
DECLARE
  _deadline timestamptz := clock_timestamp() + interval '5 minutes';
BEGIN
  LOOP
    EXIT WHEN clock_timestamp() >= _deadline;

    IF NOT EXISTS (SELECT 1 FROM todos WHERE done = true AND notified = false) THEN
      COMMIT;
      PERFORM pg_sleep(1);
      CONTINUE;
    END IF;

    UPDATE todos SET notified = true
    WHERE id IN (
      SELECT id FROM todos
      WHERE done = true AND notified = false
      ORDER BY id LIMIT 100
      FOR UPDATE SKIP LOCKED
    );

    COMMIT;
    EXIT;
  END LOOP;
END;
$$ LANGUAGE plpgsql;
```
The `COMMIT` between iterations isn't just about visibility. It also releases the procedure's transaction snapshot, which lets PostgreSQL's autovacuum clean up dead tuples in the tables you're querying. Without it, a long-running procedure can hold onto an old snapshot and silently block vacuum from doing its job, causing table bloat over time.
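If you want to check whether anything is holding a snapshot open, the standard `pg_stat_activity` view shows it; the one-minute threshold here is an arbitrary illustration:

```sql
-- Sessions that have kept a transaction (and therefore a snapshot)
-- open for more than a minute. A napping worker without COMMITs
-- would show up here with an old xact_start and a backend_xmin
-- that vacuum can't advance past.
SELECT pid, now() - xact_start AS xact_age, backend_xmin, query
FROM pg_stat_activity
WHERE xact_start < now() - interval '1 minute'
ORDER BY xact_start;
```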
**Adjust the nap length.** `pg_sleep(1)` means the worker checks once per second. That's fine for most apps. If you need faster reactions, use `pg_sleep(0.5)` or even `pg_sleep(0.1)`. The cost is roughly the same either way: a sleeping process uses almost nothing, and peeking at an empty partial index takes microseconds, no matter how often you do it.
---
## When this isn't the right fit
This pattern is great for simple background work, but it has limits.
If you need near-instant reactions (sub-second), use LISTEN/NOTIFY with an external worker process. The napping worker will always have a small delay.
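As a sketch of that alternative (the channel name `todo_done` and the payload are illustrative): a trigger sends a notification, and a separate worker process issues `LISTEN` and reacts as soon as the message arrives. Unlike the processing trigger from earlier, `pg_notify` just enqueues a message, so the write path stays cheap:

```sql
CREATE OR REPLACE FUNCTION notify_todo_done() RETURNS trigger AS $$
BEGIN
  -- Only announce the id; the external worker does the actual work.
  PERFORM pg_notify('todo_done', NEW.id::text);
  RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER trg_notify AFTER UPDATE OF done ON todos
  FOR EACH ROW WHEN (NEW.done AND NOT NEW.notified)
  EXECUTE FUNCTION notify_todo_done();

-- In the external worker's session:
-- LISTEN todo_done;  -- then process notifications as they arrive
```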
If you need retries, dead letter queues, or priority lanes, you want a proper job queue like pgmq, Graphile Worker, or River. This pattern doesn't track failures or reorder work.
If you need cross-database coordination, this pattern won't help. pg_cron runs in a single database, so your worker can only touch tables in that database.
If your managed PostgreSQL provider doesn't support pg_cron or limits background workers, this won't work at all.
But for the common case, "I need to do something in the background, within a second, without adding infrastructure", this is hard to beat.
---
## A few pg_cron tips while we're here
pg_cron gets dismissed a lot. "It's a toy." "Use a real queue." And sure, if you're building a distributed pipeline with fan-out and observability dashboards, pg_cron is not your tool.
But most of us aren't building that. Most of us need to process some rows, on a schedule, without adding another service that can break at 3 AM.
A few things that help:
**Keep your logic in a procedure, not in the schedule string.** This matters more than you'd think. Debugging SQL that's buried inside a `cron.schedule()` call is miserable. Put your logic in a named procedure, schedule the procedure call. You'll thank yourself in six months.
```sql
-- Hard to read, test, or modify
SELECT cron.schedule('worker', '10 seconds', $$
  WITH batch AS (
    SELECT id FROM todos WHERE done = true AND notified = false
    ORDER BY id LIMIT 50
    FOR UPDATE SKIP LOCKED
  )
  UPDATE todos SET notified = true WHERE id IN (SELECT id FROM batch)
$$);

-- Much better
SELECT cron.schedule('worker', '10 seconds', 'CALL process_todos()');
```
**Clean up the log table if you leave logging on.** If you don't set `cron.log_run = off`, pg_cron writes a row to `cron.job_run_details` for every execution. With the self-sleeping pattern, quiet days are fine since the worker holds the slot and you only get a few log entries per hour (once per deadline cycle). But on a busy day, the worker keeps finding work, clocking out, and getting sent right back in on the next tick. That can mean up to 86,400 log entries per day.
To schedule a cleanup:
```sql
SELECT cron.schedule('cleanup-logs', '0 3 * * *', $$
  DELETE FROM cron.job_run_details
  WHERE end_time < now() - interval '3 days'
$$);
```
**The one-at-a-time rule is a feature, not a limitation.** pg_cron won't start a second instance of a job while the first is still running. That's exactly what makes the self-sleeping pattern work. Your worker holds the slot, pg_cron respects it, and you get single-worker processing without having to build any locking yourself.
**Know the boundaries.** pg_cron runs in the `postgres` database by default (you can change this with `cron.database_name`). Schedules are in UTC. The `'N seconds'` syntax supports 1–59 seconds (and requires pg_cron v1.5.0+). No job dependencies, no fan-out. If those are dealbreakers, look elsewhere. But for "run this SQL on a schedule, inside my database, with zero external dependencies" pg_cron is exactly the right tool.
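If your worker's tables live in a different database than pg_cron, versions 1.4+ ship `cron.schedule_in_database`, which runs the job there instead. The database name `myapp_db` here is illustrative:

```sql
-- Same self-sleeping pattern, but the CALL executes in myapp_db
-- rather than in the database pg_cron is installed in.
SELECT cron.schedule_in_database(
  'bg-worker', '1 seconds', 'CALL background_worker()', 'myapp_db'
);
```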
**What if you don't have pg_cron v1.5.0?** On older versions, the shortest schedule is once per minute. You can still use the same pattern, and while the procedure holds its slot it still checks every second internally. What changes is the restart gap: after the worker clocks out (or crashes), pg_cron won't send it back in for up to a minute, so the worst-case delay becomes 60 seconds instead of 1. For many background tasks that's perfectly fine. If you need consistently sub-minute reactions, upgrade to pg_cron v1.5.0+ or use LISTEN/NOTIFY with an external worker.
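On pre-1.5.0 versions, only the schedule string changes; the procedure itself stays identical:

```sql
-- Runs every minute. The procedure still naps internally with
-- pg_sleep(1), so reaction time while it holds the slot is
-- unchanged; only the gap after it clocks out grows to a minute.
SELECT cron.schedule('bg-worker', '* * * * *', 'CALL background_worker()');
```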
---
## The full setup
Here's everything in one place. Copy, adapt the table and the processing logic to your use case, and you're done.
```sql
-- Your table (whatever it looks like)
CREATE TABLE todos (
  id bigserial PRIMARY KEY,
  title text NOT NULL,
  done boolean DEFAULT false,
  notified boolean DEFAULT false,
  created_at timestamptz DEFAULT now()
);

-- Fast check for pending work
CREATE INDEX idx_todos_pending ON todos (id) WHERE done = true AND notified = false;

-- The worker
CREATE OR REPLACE PROCEDURE background_worker() AS $$
DECLARE
  _deadline timestamptz := clock_timestamp() + interval '5 minutes';
BEGIN
  LOOP
    EXIT WHEN clock_timestamp() >= _deadline;

    IF NOT EXISTS (SELECT 1 FROM todos WHERE done = true AND notified = false) THEN
      COMMIT;
      PERFORM pg_sleep(1);
      CONTINUE;
    END IF;

    UPDATE todos SET notified = true
    WHERE id IN (
      SELECT id FROM todos
      WHERE done = true AND notified = false
      ORDER BY id LIMIT 100
      FOR UPDATE SKIP LOCKED
    );

    COMMIT;
    EXIT;
  END LOOP;
END;
$$ LANGUAGE plpgsql;

-- The schedule
SELECT cron.schedule('bg-worker', '1 seconds', 'CALL background_worker()');
```
```ini
# postgresql.conf
cron.use_background_workers = on
cron.log_run = off
```
One table. One procedure. One pg_cron entry. Two config lines. A worker that naps when there's nothing to do and gets to work the moment something shows up.