Linux Systemd Service Management Explained

What Systemd Actually Is

Systemd is the init system and service manager that ships with virtually every major Linux distribution today. It replaced the old SysV init and Upstart systems, and it does a lot more than just start processes. When your kernel hands off control after boot, systemd is PID 1, the root of the userspace process tree. Every other userspace process on that machine is a child of systemd, directly or indirectly.

The core abstraction in systemd is the unit. A unit is a configuration file that describes a resource systemd knows how to manage. There are several unit types: service, socket, target, timer, mount, automount, path, swap, slice, and scope. When people talk about "systemd services," they usually mean .service units, but understanding the full type system is what separates engineers who can configure systemd from people who just copy unit files from Stack Overflow.

Anatomy of a Service Unit File

Unit files live in a few key locations. The system-provided units from your distribution sit in /usr/lib/systemd/system/; you don't touch those. Custom or site-specific units go in /etc/systemd/system/, which takes precedence. If you install a package and need to override just one or two directives, you use a drop-in: create a directory at /etc/systemd/system/nginx.service.d/ and drop a custom.conf file in there. Systemd merges it with the base unit automatically.

Every service unit has three main sections: [Unit], [Service], and [Install]. The [Unit] section handles metadata and dependency declarations. The [Service] section is where actual execution configuration lives. The [Install] section tells systemd which targets should pull this unit in when enabled. Here's a real-world example, a custom Python application service running on sw-infrarunbook-01:

[Unit]
Description=InfraRunBook API Service
Documentation=https://solvethenetwork.com/docs/api
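# After= only orders; Wants= is what actually pulls network-online.target in
Wants=network-online.target
After=network-online.target postgresql.service
Requires=postgresql.service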

[Service]
Type=exec
User=infrarunbook-admin
Group=infrarunbook-admin
WorkingDirectory=/opt/infrarunbook/api
EnvironmentFile=/etc/infrarunbook/api.env
ExecStart=/opt/infrarunbook/venv/bin/python -m uvicorn main:app --host 192.168.10.15 --port 8000
ExecReload=/bin/kill -HUP $MAINPID
Restart=on-failure
RestartSec=5s
TimeoutStopSec=30s
StandardOutput=journal
StandardError=journal

# Hardening
ProtectSystem=strict
PrivateTmp=true
NoNewPrivileges=true
CapabilityBoundingSet=

[Install]
WantedBy=multi-user.target

Walk through this carefully. After=network-online.target postgresql.service means that when those units are being started alongside this one, systemd waits for them to finish starting before it launches this service. But After alone is purely ordering; it doesn't create a dependency, so on its own it never pulls either unit in. The Requires=postgresql.service line does that. If postgres fails to start, systemd refuses to start this service too. In my experience, people constantly confuse After and Requires. They're orthogonal. You almost always want both.

Service Types and Why They Matter

The Type= directive tells systemd how to track whether your service has finished starting up. Get this wrong and you'll have ordering bugs that only appear under load or after reboots, which are the worst kind.

Type=simple (the historical default) tells systemd that the exec'd process is the main service process, and it's considered started as soon as it's forked. Simple is fine for processes that don't daemonize and stay in the foreground, but it gives systemd no signal about whether the application is actually ready to serve traffic.

Type=exec is like simple but systemd waits until the exec() call succeeds before considering the service started. Use this over simple for most modern applications. It catches failures like "the binary doesn't exist" before declaring success, and it's the correct type for anything that doesn't implement sd_notify.

Type=forking is for old-school daemons that fork into the background and exit the parent process. You almost never want this for anything you're writing today. If you're forced to use it, set PIDFile= as well so systemd can track the actual daemon process.

Type=notify is the gold standard when the application supports it. The service sends an sd_notify() message to tell systemd "I'm ready." This is how PostgreSQL (when built with systemd support) and many other mature daemons signal readiness. Systemd won't activate dependent services that are ordered after this one until it receives that notification. If you're writing a long-running application from scratch, building in sd_notify support is worth the effort.
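For an application without a native binding, the readiness message is small enough to send by hand. What follows is a minimal Python sketch, assuming the documented sd_notify convention: systemd exports a NOTIFY_SOCKET environment variable naming a Unix datagram socket, and the service writes READY=1 to it. The function name notify_ready is made up for illustration, and the unit still needs Type=notify for the message to mean anything.

```python
import os
import socket


def notify_ready(state: str = "READY=1") -> bool:
    """Send a state string to systemd's notification socket, if one was provided."""
    path = os.environ.get("NOTIFY_SOCKET")
    if not path:
        # Not launched by systemd with a notify socket: nothing to do.
        return False
    if path.startswith("@"):
        # A leading "@" denotes a Linux abstract-namespace socket.
        path = "\0" + path[1:]
    with socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM) as sock:
        sock.sendto(state.encode(), path)
    return True
```

The call belongs at the point where the application can genuinely serve traffic, for example after the listening socket is bound and any database connection pool is warm. It returns False when run outside systemd, so the same code works in a development shell.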

Type=oneshot is for tasks that run and exit. Think initialization scripts, database migrations, or anything that runs once per boot. Pair it with RemainAfterExit=yes if you want the unit to show as "active" after the process exits, which is useful for units that other services declare dependencies on.

Targets Replace Runlevels

Old SysV init had runlevels: numbered states like 0 (halt), 3 (multi-user text), 5 (graphical). Systemd replaces these with targets, which are units of type .target that act as synchronization points. The mapping exists for compatibility: runlevel 3 maps to multi-user.target, and runlevel 5 maps to graphical.target.

Targets are more flexible than runlevels. You can define your own. You can have a service pulled in by multiple targets. The key mechanism is WantedBy= in the [Install] section. When you run systemctl enable on a unit with WantedBy=multi-user.target, it creates a symlink inside /etc/systemd/system/multi-user.target.wants/. That symlink is what causes the service to start on boot. If you've ever enabled a service and found it doesn't autostart after a reboot, checking whether that symlink actually got created is your first step.

# Check what a target pulls in
systemctl show multi-user.target -p Wants

# Check the symlinks on disk
ls -la /etc/systemd/system/multi-user.target.wants/

# Switch the default boot target
systemctl set-default multi-user.target

Dependency Graph and Ordering

The dependency model is where systemd gets genuinely powerful — and where it gets complicated. There are several relationship directives you need in your toolkit.

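Requires= is a hard dependency. Starting this unit also starts the required unit, and if the required unit fails to start (with After= set as well), this unit is not started. If the required unit is explicitly stopped or restarted, this unit is stopped or restarted too. Use this when your service literally cannot function without another unit being active.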

Wants= is a soft dependency. Systemd will try to start the wanted unit, but failure won't cascade. This is what most inter-service dependencies should use when you want best-effort activation without hard coupling.

BindsTo= is stronger than Requires. If the bound unit becomes inactive for any reason, including exiting on its own or a device being unplugged rather than only an explicit stop, this unit is stopped as well. Useful for services tied to a specific network interface or device.

PartOf= creates one-way synchronization. If the parent unit is stopped or restarted, this unit follows. The reverse is not true. You see this in templated services where each instance is part of a group.

Conflicts= enforces mutual exclusion. If one unit is running, the other can't be; starting one stops the other. This is how shutdown.target conflicts with services that shouldn't be running during system shutdown.

The ordering directives, Before= and After=, are purely about sequencing within a parallel startup. Systemd boots units in parallel by default. Dependencies don't imply ordering; they say "this must be active." Ordering says "start this before that." I've seen production incidents where someone added Requires= without After= and ended up with a race condition: the service started before its dependency was fully initialized because systemd had already parallelized them.
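That failure mode is easy to check for mechanically. The Python sketch below scans the [Unit] section of a unit file and reports any Requires= entry that has no matching After=. The function name is hypothetical and the parser is deliberately naive: it ignores drop-ins, line continuations, and systemd's rule that an empty assignment resets a list.

```python
def requires_without_after(unit_text: str) -> list[str]:
    """Return units listed in Requires= that are missing from After=."""
    requires, after = set(), set()
    section = None
    for raw in unit_text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", ";")):
            continue  # blank line or comment
        if line.startswith("[") and line.endswith("]"):
            section = line[1:-1]
            continue
        if section != "Unit" or "=" not in line:
            continue
        key, _, value = line.partition("=")
        names = value.split()  # directives take space-separated unit names
        if key.strip() == "Requires":
            requires.update(names)
        elif key.strip() == "After":
            after.update(names)
    return sorted(requires - after)
```

An empty result for a unit does not prove the ordering is right, only that each hard dependency is at least ordered before the service. It is the kind of check that fits in a pre-commit hook for a repository of unit files.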

Managing Services Day-to-Day

The daily interface for systemd is systemctl. The basics are well known, but a handful of commands are consistently underused in the field.

# Start, stop, restart, reload
systemctl start infrarunbook-api.service
systemctl stop infrarunbook-api.service
systemctl restart infrarunbook-api.service
systemctl reload infrarunbook-api.service

# Enable/disable autostart on boot
systemctl enable infrarunbook-api.service
systemctl disable infrarunbook-api.service

# Enable and start in one shot
systemctl enable --now infrarunbook-api.service

# Detailed status
systemctl status infrarunbook-api.service

# Check state independently
systemctl is-active infrarunbook-api.service
systemctl is-enabled infrarunbook-api.service

# Show all unit properties
systemctl show infrarunbook-api.service

# Reload unit definitions after editing files
systemctl daemon-reload

# List all loaded service units
systemctl list-units --type=service

# Show only failed units
systemctl --failed

# Show full dependency tree
systemctl list-dependencies infrarunbook-api.service

The command people forget most often is daemon-reload. Any time you edit a unit file, whether it's a new file or a modification of an existing one, you must run systemctl daemon-reload before systemd will see your changes. The old unit definition stays cached in memory otherwise. This bites people in automation scripts where a unit file is written and the service is immediately restarted without reloading the daemon first. The restart succeeds, but it's running the old definition.
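In a deployment script the safe ordering is write, reload, restart. A minimal Python sketch of that sequence, assuming the script runs with enough privilege to write the unit file and call systemctl; deploy_unit and its run parameter are illustrative names, with run exposed only so the command runner can be swapped out in tests:

```python
import subprocess
from pathlib import Path


def deploy_unit(unit_path, content, unit_name, run=subprocess.run):
    """Write a unit file, then reload systemd before restarting the service."""
    Path(unit_path).write_text(content)
    # Without this step systemd keeps using the definition it has cached.
    run(["systemctl", "daemon-reload"], check=True)
    run(["systemctl", "restart", unit_name], check=True)
```

Passing check=True makes a failed reload raise an exception instead of falling through to a restart of the stale definition.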

Logging with journald

Systemd ships with journald, a structured logging daemon that captures stdout and stderr from every service, plus kernel messages, audit events, and boot-time output. The query interface is journalctl:

# Follow logs for a service in real time
journalctl -u infrarunbook-api.service -f

# Show logs since current boot
journalctl -u infrarunbook-api.service -b

# Show logs from the previous boot (invaluable after a crash)
journalctl -u infrarunbook-api.service -b -1

# Filter by time range
journalctl -u infrarunbook-api.service \
  --since "2026-04-08 09:00:00" \
  --until "2026-04-08 10:00:00"

# Show only errors and above
journalctl -u infrarunbook-api.service -p err

# Output as JSON for scripted processing
journalctl -u infrarunbook-api.service -o json-pretty | head -50

# Check total journal disk usage
journalctl --disk-usage

# Vacuum old entries
journalctl --vacuum-time=30d

During incident response, journalctl -b -1 -p err is one of the first commands I run. It gives you all error-level and above messages from the previous boot. If a server rebooted overnight without explanation, that's your starting point. Pair it with journalctl -b 0 to see what happened at the start of the current boot and whether recovery succeeded cleanly.
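For scripted triage, the JSON output is easier to work with than the default text. The sketch below is a small Python filter, assuming input from journalctl -o json, which emits one JSON object per line with fields such as PRIORITY, MESSAGE, and _SYSTEMD_UNIT. The function name and the "?" placeholder for entries without a unit are illustrative choices.

```python
import json


def error_messages(lines, max_priority=3):
    """Yield (unit, message) for journal entries at or above the given severity.

    Syslog priorities run from 0 (emerg) to 7 (debug), so "err and above"
    means a numeric PRIORITY of 3 or lower.
    """
    for line in lines:
        line = line.strip()
        if not line:
            continue
        entry = json.loads(line)
        priority = entry.get("PRIORITY")
        message = entry.get("MESSAGE")
        # journald exports non-UTF-8 payloads as byte arrays; skip those here.
        if priority is None or not isinstance(message, str):
            continue
        if int(priority) <= max_priority:
            yield entry.get("_SYSTEMD_UNIT", "?"), message
```

One way to use it is to pipe journalctl -b -1 -o json into a script that iterates over error_messages(sys.stdin) and prints or counts the results per unit.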

Journal persistence is controlled by /etc/systemd/journald.conf. On many distributions the default stores the journal in memory under /run/log/journal/, which means it's gone on reboot. To persist it, either create /var/log/journal/ (systemd detects this automatically) or set Storage=persistent in journald.conf and restart systemd-journald. For any production host, persistent journals are non-negotiable.

Resource Control with Slices and cgroups

Every service unit runs inside a cgroup. Systemd organizes these into a hierarchy using slices. By default, system services run under system.slice, user sessions under user.slice, and virtual machines under machine.slice. You can create custom slices to apply resource policies across groups of related services.

Resource limits apply directly in the [Service] section and map straight to cgroup v2 controls:

[Service]
# Limit CPU usage to 50% of one core
CPUQuota=50%

# Limit memory hard ceiling to 512MB
MemoryMax=512M

# Set a soft memory limit triggering reclaim early
MemoryHigh=400M

# Reduce IO priority relative to other services
IOWeight=50

# Cap total tasks (threads + processes)
TasksMax=128

You can verify what systemd is enforcing by inspecting /sys/fs/cgroup/system.slice/infrarunbook-api.service/ directly. In multi-tenant environments or on servers running mixed workloads, this is how you prevent a runaway service from consuming the entire host. I've watched a poorly written log aggregator OOM-kill an entire server because nobody had set MemoryMax. Thirty seconds of configuration would have contained it.
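Checking those cgroup files can be scripted. A minimal Python sketch, assuming a cgroup v2 host where single-value limit files such as memory.max and pids.max hold either an integer or the literal string max; the helper name read_cgroup_limit is made up for this example:

```python
from pathlib import Path


def read_cgroup_limit(cgroup_dir, control_file):
    """Return a cgroup v2 limit as an int, or None when it is unlimited ("max")."""
    raw = (Path(cgroup_dir) / control_file).read_text().strip()
    return None if raw == "max" else int(raw)
```

Pointed at /sys/fs/cgroup/system.slice/infrarunbook-api.service, reading memory.max should return 536870912 when MemoryMax=512M is in effect, and pids.max should return 128 for TasksMax=128. A None result on a production service is the signal that nobody set a ceiling.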

Security Hardening in Unit Files

One of the most underused features in systemd is the security sandboxing available in [Service]. You can significantly reduce the blast radius of a compromised service with a handful of directives, with no SELinux policy or AppArmor profile required.

[Service]
# Mount the entire file system hierarchy read-only, except /dev, /proc, and /sys
ProtectSystem=strict

# Give the service its own private /tmp namespace
PrivateTmp=true

# Prevent acquiring new privileges via setuid or capabilities
NoNewPrivileges=true

# Strip all Linux capabilities
CapabilityBoundingSet=

# Restrict to a safe syscall whitelist
SystemCallFilter=@system-service

# Make /home, /root, and /run/user inaccessible and empty for the service
ProtectHome=true

# Private /dev with only basic pseudo-devices
PrivateDevices=true

# Restrict IP traffic, inbound and outbound, to loopback and the internal range
IPAddressAllow=127.0.0.0/8 192.168.10.0/24
IPAddressDeny=any

Run systemd-analyze security infrarunbook-api.service to get a scored breakdown of how well a unit is sandboxed. Each unsandboxed directive is listed with an exposure score. It's a quick triage tool for finding which services are running with unnecessary access and deciding where to prioritize hardening work. Don't try to harden everything at once; start with internet-facing or privileged services.

Timers as a Cron Replacement

Systemd timers are a proper replacement for cron jobs. A timer unit (.timer) activates a corresponding service unit at defined intervals. The advantage over cron is that the job runs as a full systemd service: it logs to journald, you inspect it with systemctl, it respects resource controls, and you get proper failure tracking and alerting hooks.

# /etc/systemd/system/infrarunbook-backup.timer
[Unit]
Description=Daily backup timer for InfraRunBook

[Timer]
OnCalendar=daily
Persistent=true
RandomizedDelaySec=300

[Install]
WantedBy=timers.target

# /etc/systemd/system/infrarunbook-backup.service
[Unit]
Description=InfraRunBook Data Backup
After=network-online.target

[Service]
Type=oneshot
User=infrarunbook-admin
ExecStart=/opt/infrarunbook/scripts/backup.sh
StandardOutput=journal

Persistent=true is important here. If the system is powered off when the timer would have fired, the job runs as soon as the host comes back up. Plain cron can't do this without help from something like anacron. RandomizedDelaySec=300 spreads the start time by up to five minutes, which is useful when multiple hosts on the 192.168.10.0/24 segment all hit the same backup target at midnight.

List active timers and their next fire times with systemctl list-timers. It shows last trigger, next trigger, and the unit name. A quick sanity check for any host running scheduled jobs.

Common Misconceptions

The most persistent one I encounter is that systemctl restart and systemctl reload are interchangeable. They're not. Restart stops the process and starts a new one: new PID, brief outage, all connections dropped. Reload sends a signal (typically SIGHUP) to tell the process to re-read its configuration without stopping. Not all services support reload. If ExecReload= isn't defined in the unit file, systemctl reload fails. Always check whether reload is implemented before building automation around it.

Another one: assuming that because a service is "enabled" it's running. Enabled means it'll start at boot. It says nothing about the current runtime state. A service can be enabled and stopped, or disabled and currently running if it was started manually. Always check is-active and is-enabled independently when writing scripts that need to make decisions based on both.
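A script can keep the two questions separate by asking them separately. The Python sketch below relies on systemctl is-active and systemctl is-enabled returning exit status 0 for a positive answer; unit_state and the injectable run parameter are illustrative names, with run present so the example can be exercised without a live systemd.

```python
import subprocess


def _exit_code(args):
    """Run a command quietly and return only its exit status."""
    return subprocess.run(
        args, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
    ).returncode


def unit_state(unit, run=_exit_code):
    """Report runtime state and boot-time state as two independent booleans."""
    return {
        "active": run(["systemctl", "is-active", "--quiet", unit]) == 0,
        "enabled": run(["systemctl", "is-enabled", "--quiet", unit]) == 0,
    }
```

Note that is-enabled also exits 0 for some states other than plain "enabled", such as static units, so a script that cares about the distinction should read the printed state instead of the exit code.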

The third misconception I see regularly involves KillMode=. The default is control-group, which means when systemd stops a service it kills the entire cgroup: every process the service spawned, including workers and children. This is almost always the right behavior. Developers sometimes complain that their child workers are "being killed unexpectedly" on service restart. That's not unexpected; that's correct behavior. If you have a legitimate need for child processes to outlive a service restart, you need KillMode=process and a clear architectural reason to justify it. Reach for it rarely.

Finally, socket activation surprises a lot of people when they first encounter it. Systemd can listen on a socket on behalf of a service and only start the actual daemon when a connection arrives. This is what .socket units do. It's how systemd achieves fast apparent boot times: sockets are ready immediately, and the daemon starts on first use. If you're building a service that needs to be reachable early in the boot sequence but rarely receives traffic, socket activation is worth adding. The service unit doesn't need any special configuration, but the daemon itself has to accept the inherited socket: systemd passes the pre-opened sockets as file descriptors starting at 3 and sets the LISTEN_FDS and LISTEN_PID environment variables so the process knows how many it received.
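On the application side, picking up an activated socket takes only a few lines. This Python sketch follows the sd_listen_fds convention, under which inherited descriptors start at 3 and LISTEN_PID must match the receiving process; the function names are hypothetical, and a real service would fall back to binding its own socket when nothing was passed.

```python
import os
import socket

SD_LISTEN_FDS_START = 3  # first inherited descriptor under the sd_listen_fds convention


def inherited_fds(environ=os.environ, pid=None):
    """Return the file descriptor numbers systemd passed to this process."""
    pid = os.getpid() if pid is None else pid
    # LISTEN_PID guards against a child process misreading its parent's sockets.
    if environ.get("LISTEN_PID") != str(pid):
        return []
    count = int(environ.get("LISTEN_FDS", "0"))
    return list(range(SD_LISTEN_FDS_START, SD_LISTEN_FDS_START + count))


def inherited_sockets():
    """Wrap each inherited descriptor in a socket object ready for accept()."""
    return [socket.socket(fileno=fd) for fd in inherited_fds()]
```

The split keeps the environment parsing separate from the descriptor handling, which makes the parsing easy to test without systemd actually handing over a socket.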

Operational note: After any unit file change on sw-infrarunbook-01 or any host you manage, make systemctl daemon-reload part of your muscle memory. Skipping it is the single most common source of "why isn't my config change taking effect" confusion in systemd environments.
