A week ago I stood up a self-hosted Jenkins on a NUC and called it done. The two _smoke pipelines were green, the Docker cloud was spawning ephemeral agents, and pushing to JeakylBlog main deployed to S3. That’s “demo works”.

What I have learned since is that “demo works” leaves a long tail of small surprises. Most of these aren’t bugs in the strictest sense; they’re documented behaviour that you only meet when you push your second commit, or change a tiny thing that turns out to ripple. Each cost me somewhere between twenty minutes and an hour to find. None are well-Googled.

Writing them down in case any of them save someone else an evening.

1. env.X in a Jenkins shared library is not the controller’s process env

I had been setting JEAKYLBLOG_S3_BUCKET=blog.jeakyl.com in the Jenkins controller’s docker-compose environment: block and reading it from a shared-library function as env.JEAKYLBLOG_S3_BUCKET. The first project build failed with s3Bucket required. The env var was visibly there inside the Jenkins controller container; docker compose exec jj-jenkins env | grep JEAKYL confirmed it. The shared library still saw null.

The reason: in Jenkins, env.X inside a shared library reads the build’s environment, not the controller’s process environment. The build env is constructed from the agent’s inherited env, the pipeline’s environment {} block, withEnv([...]) { ... } steps, and (if configured) Jenkins’s global node-properties environment variables. Compose env vars set on the controller container do not flow into that pipeline by default. The shared library code happens to run inside the controller’s JVM during evaluation, which is why the conflation is so easy to make.

Two fixes, depending on what you want to do:

// Read the controller container's process env directly:
String s3Bucket = cfg.s3Bucket ?: System.getenv('JEAKYLBLOG_S3_BUCKET')

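System.getenv() in shared-library Groovy reads the JVM’s process env, which is the controller container’s env. It works, and it’s the lightest fix, with one caveat: the call runs unrestricted only in a trusted (global) library; a sandboxed library or Jenkinsfile would need it approved under script security. The more idiomatic JCasC route is to expose the value to all builds: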

jenkins:
  globalNodeProperties:
    - envVars:
        env:
          - key: "JEAKYLBLOG_S3_BUCKET"
            value: "blog.jeakyl.com"

That makes env.JEAKYLBLOG_S3_BUCKET work from any pipeline, agent, or shared library, because it’s now genuinely in the build env. I went with System.getenv() because the values are not sensitive enough to warrant exposing them to every pipeline, and one more JCasC schema block is one more thing to break on a Jenkins upgrade. Pick whichever fits your security model.

The takeaway is that “the env var is in the controller container” and “the env var is in the build” are two different statements. They feel like one because the same JVM is doing the evaluating; they aren’t.
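The same distinction can be reproduced outside Jenkins. The sketch below is only an analogy, not Jenkins code: the current shell stands in for the controller process, and a child started with env -i stands in for a build whose environment is constructed rather than inherited. The function names are made up for illustration.

```shell
# Analogy only: this shell plays the role of the controller process.
export JEAKYLBLOG_S3_BUCKET=blog.jeakyl.com

# A child whose environment is built from scratch (like a build env)
# does not see the variable, even though the parent has it.
build_env_view() {
  env -i PATH="$PATH" sh -c 'echo "${JEAKYLBLOG_S3_BUCKET:-unset}"'
}

# A child that simply inherits the parent's environment does see it.
process_env_view() {
  sh -c 'echo "${JEAKYLBLOG_S3_BUCKET:-unset}"'
}

# build_env_view     prints: unset
# process_env_view   prints: blog.jeakyl.com
```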

2. Jenkins’s CSRF crumb is per-session, not per-token

I was scripting a _smoke build trigger over the Jenkins API. The pattern that’s all over Stack Overflow goes like this:

CRUMB=$(curl --user admin:pwd $J/crumbIssuer/api/json | jq -r '"\(.crumbRequestField):\(.crumb)"')
curl -X POST -H "$CRUMB" --user admin:pwd "$J/job/_smoke/build"

That returns 403. The crumb token is real, the auth is real, but the POST refuses with Found invalid crumb in the controller log.

The reason: modern Jenkins ties the crumb to a session cookie, and the two curl invocations create two unrelated sessions. The crumb the first call returned is bound to a session you’ve already discarded by the time the second call runs.

Fix is a shared cookie jar across the GET and the POST:

JAR=$(mktemp)
curl --cookie-jar "$JAR" --cookie "$JAR" --user admin:pwd "$J/crumbIssuer/api/json" > /tmp/crumb.json
F=$(jq -r '.crumbRequestField' /tmp/crumb.json)
V=$(jq -r '.crumb'              /tmp/crumb.json)
curl --cookie-jar "$JAR" --cookie "$JAR" --user admin:pwd \
     -H "$F: $V" -X POST "$J/job/_smoke/build"

The cleaner long-term path is an API token instead of a password. Requests authenticated with an API token skip the crumb check entirely: the token is an explicit credential sent with each request, not an ambient browser session, and ambient sessions are what CSRF protection exists to guard. Until you’ve done a manual login to generate one, the cookie-jar dance is what gets you through the first hour.

The Jenkins docs do say this. They say it in a subordinate clause halfway down a page about CLI authentication, where nobody looking for “why does my crumb fail” will find it.
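For the API-token route, the trigger collapses to a single call with no crumb and no cookie jar. A minimal sketch, assuming the token has already been generated in the Jenkins web UI: JENKINS_USER and JENKINS_API_TOKEN are hypothetical variable names, J is the base URL as in the snippets above, and the CURL override exists only so the command can be dry-run with echo.

```shell
# Trigger a job using an API token as the basic-auth password.
# Assumed/hypothetical names: J, JENKINS_USER, JENKINS_API_TOKEN, CURL.
trigger_build() {
  job="$1"
  "${CURL:-curl}" --fail --silent --show-error -X POST \
    --user "${JENKINS_USER}:${JENKINS_API_TOKEN}" \
    "${J}/job/${job}/build"
}

# Usage:
#   J=https://ci.jeakyl.com JENKINS_USER=admin JENKINS_API_TOKEN=... \
#     trigger_build _smoke
# Dry run (prints the arguments instead of calling Jenkins):
#   CURL=echo trigger_build _smoke
```

Keeping the token in an environment variable rather than on the command line at least keeps it out of shell history; where it should really live depends on your setup.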

3. Gitea’s webhook allow-list checks the resolved IP, not the URL host

When I pointed a Gitea webhook at https://ci.jeakyl.com/gitea-webhook/post, Gitea refused to deliver, with this error:

deny 'ci.jeakyl.com(172.18.0.2:443)'

ci.jeakyl.com is a public DNS name, served by my own NPM. The error message names the host, so the obvious read is “Gitea doesn’t trust ci.jeakyl.com for some reason”. That isn’t quite what’s happening.

webhook.ALLOWED_HOST_LIST defaults to external, which means RFC 1918 IPs are blocked. The check happens after DNS resolution. Inside the Gitea container, ci.jeakyl.com resolves (via my extra_hosts map, for reasons in the previous post) to 172.18.0.2, which is private, which is denied. The error names the resolved IP next to the URL hostname, but it’s the IP that fails the test.

The fix is to whitelist your own subdomains explicitly:

environment:
  - GITEA__webhook__ALLOWED_HOST_LIST=*.jeakyl.com

Glob patterns work, and limiting to your own hostnames is the right scope for a personal instance. Avoid private (which allows all of RFC 1918) on a multi-tenant Gitea, since that opens an SSRF surface for any user with admin on a repo. For a single-user box it’s fine, but tighten if the threat model ever grows.

The thing to know is that the default external rule is evaluated against the resolved IP, not the URL string; only an explicit hostname pattern such as *.jeakyl.com matches on the name. If you’re seeing your own public hostname denied with a private IP after it, check what the host resolves to inside the Gitea container, not the public DNS view.
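To make that check concrete, here is a rough sketch of the RFC 1918 test that the denial above comes down to. It is not Gitea’s code, only an illustration of the ranges involved; the function name is made up, and it does no validation beyond prefix matching on IPv4 strings.

```shell
# Return 0 if the IPv4 address falls in an RFC 1918 private range:
# 10.0.0.0/8, 172.16.0.0/12 or 192.168.0.0/16.
is_rfc1918() {
  case "$1" in
    10.*) return 0 ;;
    192.168.*) return 0 ;;
    172.1[6-9].*|172.2[0-9].*|172.3[01].*) return 0 ;;
    *) return 1 ;;
  esac
}

# Usage: find out what the Gitea container itself resolves, then classify it.
# (The compose service name "gitea" is an assumption.)
#   docker compose exec gitea getent hosts ci.jeakyl.com
#   is_rfc1918 172.18.0.2 && echo "private: denied under 'external'"
```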

4. Jekyll renders post dates in the build host’s timezone

I write blog posts with explicit timezones in the front matter:

date: 2026-05-03 10:00:00 +1200

Locally, the rendered home page showed “May 3” as expected. Once Jenkins started building and deploying for me, some of my posts shifted by a day on the live site. The affected posts were exactly those whose timestamp, converted to UTC, fell on a different calendar date from the local date named in the front matter.

The reason: Jekyll’s date filter renders in the build host’s local timezone. My laptop is Pacific/Auckland; Jenkins builds in a Debian container that defaults to Etc/UTC. 2026-05-03 10:00:00 +1200 is 2026-05-02 22:00:00 UTC, and on UTC the filter prints “May 2”.

There is a one-line fix:

# _config.yml
timezone: Pacific/Auckland

That tells Jekyll to set its process timezone before rendering, so dates render in the zone you authored in regardless of where the build runs. The fix is two minutes; finding the diagnosis took an hour because the symptom is “dates wrong, but only some of them, and only on the live site”.

If your post-date front matter has explicit +nnnn zones and your local timezone differs from your build host, you have this bug right now and you don’t know it. Add the timezone: line.
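The shift is easy to reproduce outside Jekyll. The sketch below is not what Jekyll runs; it only redoes the same arithmetic with the date command, and it assumes GNU date (the -d flag) with timezone data installed. The function name is made up.

```shell
# Print the calendar date of a timestamp as seen from a given timezone.
# Assumes GNU date, as found on Debian-based images.
render_date() {
  tz="$1"
  stamp="$2"
  TZ="$tz" date -d "$stamp" '+%Y-%m-%d'
}

# render_date Etc/UTC          '2026-05-03 10:00:00 +1200'   # 2026-05-02
# render_date Pacific/Auckland '2026-05-03 10:00:00 +1200'   # 2026-05-03
```

Running the first line inside the build container is a quick way to confirm which zone the build host is actually using.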

5. JCasC tutorials go stale fast, and the boot log buries the schema error

Five separate JCasC schema mismatches caused boot loops over the first week. None are subtle once you find them; all were copy-pasted from tutorials that worked on a previous Jenkins or plugin version, and have since been removed or renamed.

In rough order I hit them:

  • crumbIssuer.standard.excludeClientIPFromCrumb was removed. The fix is standard: {} rather than trying to set anything under it.
  • The matrix-auth plugin’s permissions: ["USER:Overall/Administer:admin", ...] form is deprecated; the new form is entries: with explicit user: and group: blocks.
  • job-dsl’s triggers { cron('@daily') } is deprecated. The warning says triggers is deprecated but doesn’t suggest a replacement. I dropped the triggers and run smoke pipelines manually, which is what they were for anyway.
  • job-dsl’s logRotator block now requires four properties together. If you set daysToKeepStr you must also set artifactDaysToKeepStr, same for the numToKeepStr pair. Dropping the entire properties { buildDiscarder { ... } } block was easier than retro-fitting.
  • The Gitea plugin’s GiteaServer schema renamed name to displayName. The error message helpfully prints the valid attributes:
Invalid configuration elements for type: class GiteaServer : name.
Available attributes : aliasUrl, credentialsId, displayName, manageHooks, serverUrl

That last point is the most useful Jenkins boot-log line in this whole list, because it tells you the full real schema. Whenever you see UnknownAttributesException, scroll up looking for Available attributes; it’s the schema source of truth, and almost always faster than searching the plugin’s GitHub.

The boot-log experience itself is the bigger problem. Jenkins’s BootFailure line appears six screens after the schema error, and the error itself is one line buried in three pages of Jetty/Spring/CASC initialisation noise. Once Jenkins starts crash-looping under restart: unless-stopped, you may not see the original error in docker logs --tail; you have to run docker logs jj-jenkins 2>&1 | grep -B1 -A4 SEVERE to find the substantive one among the duplicates of the duplicates. The 2>&1 matters: Jenkins logs to stderr, and docker logs replays stderr on stderr, so without the redirect grep never sees those lines.
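When the container is crash-looping, the same schema error repeats once per restart. A small filter that stops at the first occurrence keeps the output to the two lines that matter. This is a sketch that assumes the log format quoted above, with the Available attributes line directly after the Invalid configuration elements line; the function name is made up.

```shell
# Print only the first schema error and the line that follows it,
# ignoring the repeats produced by later restarts.
first_schema_error() {
  grep -m 1 -A 1 'Invalid configuration elements'
}

# Usage (2>&1 because the Jenkins log arrives on stderr):
#   docker logs jj-jenkins 2>&1 | first_schema_error
```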

The strategy that worked, and would have saved me a Saturday evening, is to strip JCasC to a minimum-viable baseline whenever boot fails, commit that baseline, and add each capability back as its own commit, restarting between each. Five working commits is much better than one almost-working YAML you cannot bisect.

Summary, for the next person

If you’re past the demo and into actual builds, the five things I would commit to muscle memory:

  1. env.X in a shared library is not your controller’s process env. Use System.getenv(), or put the value in globalNodeProperties via JCasC.
  2. Jenkins CSRF crumbs are session-bound. Share a cookie jar across the crumb GET and the API POST, or skip to API tokens.
  3. Gitea webhook denials name the URL host but check the resolved IP. Whitelist your own subdomains explicitly via GITEA__webhook__ALLOWED_HOST_LIST.
  4. Pin timezone: in _config.yml if your post dates have explicit zones. Otherwise your live site shifts according to the build host.
  5. JCasC has more schema drift than you expect. Add config in small commits, and grep the boot log for Available attributes whenever you see UnknownAttributesException.

Each took up to an hour the first time. None will take more than five minutes the second.