InfraRunBook

    Linux File System and Mount Points Explained

    Linux
    Published: Apr 7, 2026
    Updated: Apr 7, 2026

    A deep-dive into how Linux file systems and mount points work under the hood — from the VFS layer and inodes to /etc/fstab, bind mounts, and production hardening strategies every infrastructure engineer should know.


    What the Linux File System Actually Is

    When most engineers say "file system" they mean the format on disk — ext4, XFS, Btrfs. But Linux uses the term in a broader sense, and if you conflate the two meanings you'll confuse yourself badly the first time you encounter something like proc or tmpfs. A Linux file system is any hierarchical namespace that exposes data through the standard file operations: open, read, write, stat, readdir. The storage backend is irrelevant. Disk, RAM, kernel data structures, network — it doesn't matter. If you can mount it and navigate it with a path, it's a file system as far as Linux is concerned.
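    This uniformity is easy to observe from the shell: GNU coreutils stat can report the file system type behind any path through the same statfs(2) call, whatever the backing store. A minimal sketch, assuming /proc is mounted as usual:

```shell
# stat -f reports superblock-level info via statfs(2); %T is the fs type name
stat -f -c 'proc is backed by: %T' /proc    # a kernel-data file system, no disk behind it
stat -f -c 'root is backed by: %T' /        # typically ext4, xfs, btrfs, or overlay
# Same call both times; VFS routed each one to a different driver.
```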

    The piece that makes this possible is the Virtual File System, or VFS. VFS is a kernel abstraction layer that sits between system calls and the concrete file system drivers. When your process calls open("/var/log/syslog", O_RDONLY), the kernel doesn't know or care whether /var/log sits on an ext4 partition, an NFS share, or an overlay mount. VFS translates the call into driver-specific operations and returns a file descriptor. This is why "everything is a file" isn't just a philosophy — it's a kernel engineering decision with real consequences for how you build and manage systems.

    Under VFS, four key data structures do the heavy lifting. The superblock represents a mounted file system instance and stores global metadata: block size, inode count, flags, and a pointer to the file system operations table. The inode stores per-file metadata — permissions, ownership, timestamps, and block pointers — but notably not the file name. The dentry (directory entry) maps a name to an inode and is cached aggressively in the dentry cache for performance. Finally, the file object represents an open file descriptor in a process and tracks the current position within the file. Understanding that names live in dentries and not inodes is the key to understanding hard links, which I'll come back to later.
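    The name/inode split is directly observable: creating a second dentry for the same inode — a hard link — gives you two names that report one inode number. A quick sketch in a throwaway directory:

```shell
# Names live in dentries; data and metadata live in the inode.
cd "$(mktemp -d)"
echo 'payload' > original
ln original second-name            # a new dentry pointing at the same inode
stat -c 'name=%n inode=%i links=%h' original second-name
# Both names report the same inode number and a link count of 2.
```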

    How Mount Points Work

    A mount point is simply a directory in the existing tree where a new file system is grafted. When you run mount, the kernel calls the mount(2) syscall, which allocates a new superblock for the target file system, creates a mount structure, and attaches it to the mount tree at the specified path. From that moment on, any path resolution that reaches that directory gets handed off to the new file system's driver rather than continuing through the parent. The directory itself — the mount point — isn't deleted or modified. It's hidden behind the newly attached tree. Unmount it and the original directory reappears, contents intact.

    Linux maintains a per-namespace mount tree. In the early days this was a single global tree, but since Linux 2.4.19 every process can have its own mount namespace, which is fundamental to how containers work. When you run unshare --mount or create a new mount namespace via clone(2) with CLONE_NEWNS, the child gets a copy of the parent's mount tree. Changes made inside the namespace — new mounts, unmounts — don't affect the parent unless you're using shared propagation, which I'll get to in a moment.
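    The namespace a process belongs to is visible under /proc, which makes inheritance easy to verify without privileges. A small sketch: a child forked without CLONE_NEWNS reports the same mount-namespace ID as its parent, whereas unshare --mount would give it a fresh one.

```shell
# Each mount namespace has an ID the kernel exposes as a symlink target
parent_ns=$(readlink /proc/self/ns/mnt)         # e.g. mnt:[4026531840]
child_ns=$(sh -c 'readlink /proc/self/ns/mnt')  # plain fork: namespace is inherited
echo "parent: $parent_ns"
echo "child:  $child_ns"
[ "$parent_ns" = "$child_ns" ] && echo "same mount namespace"
```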

    Mount propagation is one of those topics that reads like simple documentation until you break production with it. There are four propagation types. Shared mounts propagate events bidirectionally between peer groups — mount something inside a shared mount, and it appears in all peers. Slave mounts receive events from a master but don't send them back. Private mounts have no propagation at all. Unbindable mounts are private and additionally can't be bind-mounted. In my experience, the default on most modern distributions is shared propagation for the root file system, which means that if you mount something inside a container's namespace without thinking about this, it can leak back to the host. Always check with findmnt -o TARGET,PROPAGATION before assuming isolation.

    # Show mount tree with propagation flags
    findmnt --tree -o TARGET,SOURCE,FSTYPE,OPTIONS,PROPAGATION
    
    # Sample output excerpt
    TARGET                SOURCE      FSTYPE      OPTIONS                  PROPAGATION
    /                     /dev/sda1   ext4        rw,relatime              shared
    ├─/sys                sysfs       sysfs       rw,nosuid,nodev,noexec   shared
    ├─/proc               proc        proc        rw,nosuid,nodev,noexec   shared
    ├─/dev                devtmpfs    devtmpfs    rw,nosuid                shared
    │ ├─/dev/pts          devpts      devpts      rw,nosuid,noexec         shared
    │ └─/dev/shm          tmpfs       tmpfs       rw,nosuid,nodev          shared
    └─/data               /dev/sdb1   xfs         rw,relatime              shared

    The Filesystem Hierarchy Standard and Why It's Laid Out That Way

    The FHS isn't arbitrary. Every major directory split exists because of a real operational concern — either separability for independent mounting, performance characteristics, or administrative convenience.

    /usr was historically a separate partition because it held read-only user programs that could be shared over NFS across workstations. /var holds variable data — logs, spool files, package databases — and belongs on its own partition so that a runaway log file can't fill the root and take down the system. /tmp is volatile by design. /boot often needs to live on a partition that the BIOS or UEFI firmware can access before the kernel is running, which can constrain the file system type and encryption options.

    In production I've seen the root file system fill to 100% more times than I can count, and it's almost always /var/log that's responsible. A properly partitioned server carves /var off the root so that log growth never threatens system stability. Similarly, on database servers I'll put /var/lib/postgresql or /var/lib/mysql on a dedicated XFS partition with mount options tuned for write throughput. Keeping the database data path on its own mount point means you can snapshot it cleanly, remount it read-only for a consistent backup, or replace the underlying block device without touching anything else.
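    As a sketch, a dedicated database mount in /etc/fstab might look like the following — the UUID is a placeholder and the option set is illustrative, not a tuned recommendation:

```
# Hypothetical dedicated XFS partition for PostgreSQL data
# (the UUID is a placeholder — take yours from blkid)
UUID=0e1d2c3b-4a59-6877-8695-a4b3c2d1e0f9  /var/lib/postgresql  xfs  defaults,noatime,nodev,nosuid  0 0
```

    The fsck pass is 0 because XFS replays its own journal at mount time rather than relying on a boot-time fsck.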

    /etc/fstab — The Configuration File You Must Not Ignore

    Every persistent mount is defined in /etc/fstab. Each line has six fields: device, mount point, file system type, mount options, dump frequency, and fsck pass order. The device field accepts block device paths, but you should almost never use bare /dev/sdX paths in production. Device names are not stable across reboots — a disk that enumerates as /dev/sdb today might come up as /dev/sdc tomorrow if another disk is added or the enumeration order changes. Use UUIDs or labels instead.

    # Get UUID and label info for all block devices
    blkid
    
    # /dev/sda1: UUID="a1b2c3d4-e5f6-7890-abcd-ef1234567890" TYPE="ext4" LABEL="root"
    # /dev/sdb1: UUID="f9e8d7c6-b5a4-3210-fedc-ba9876543210" TYPE="xfs" LABEL="data"
    
    # Correct fstab entries using UUID
    UUID=a1b2c3d4-e5f6-7890-abcd-ef1234567890  /       ext4  defaults,relatime          0 1
    UUID=f9e8d7c6-b5a4-3210-fedc-ba9876543210  /data   xfs   defaults,noatime           0 0
    tmpfs                                       /tmp    tmpfs mode=1777,nosuid,nodev     0 0

    The last two fields trip up junior engineers constantly. The dump field (fifth column) should be 0 for virtually everything — it controls the legacy dump utility, which nobody actually uses. The fsck pass field (sixth column) controls whether and when fsck checks the file system at boot: 0 means skip, 1 means check first (root only), 2 means check after root. If you have XFS or Btrfs file systems, set this to 0 — those file systems use their own recovery mechanisms, and running fsck on them is either a no-op or actively harmful.
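    Because each entry is just six whitespace-separated fields, a plain shell one-liner can pull an entry apart — handy when auditing fstab in a script. A sketch using an illustrative UUID:

```shell
# Split a sample fstab entry into its six fields (the UUID is illustrative)
entry='UUID=f9e8d7c6-b5a4-3210-fedc-ba9876543210 /data xfs defaults,noatime 0 0'
set -- $entry   # rely on word splitting: one positional parameter per field
printf 'device=%s\nmountpoint=%s\nfstype=%s\noptions=%s\ndump=%s\npassno=%s\n' \
    "$1" "$2" "$3" "$4" "$5" "$6"
```

    For a deeper check, recent util-linux releases ship findmnt --verify, which lints /etc/fstab and flags entries whose devices don't resolve.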

    Mount Options That Actually Matter in Production

    The options field is where you harden your mounts and tune performance. Don't leave everything as defaults and call it done. Let me walk through the options I actually configure on production systems like sw-infrarunbook-01.

    noexec prevents execution of binaries directly from the mount point. I put this on /tmp, /var/tmp, and any partition that doesn't need to run executables. It won't stop a determined attacker, but it meaningfully raises the cost of exploiting a write vulnerability to execute a payload. nosuid ignores the setuid and setgid bits on files in that mount, which prevents privilege escalation through setuid binaries dropped onto world-writable mounts. nodev prevents device file interpretation, which matters on any mount that untrusted users can write to.
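    Auditing where these options are (and aren't) applied only takes /proc/mounts — field four holds the option string. A sketch that lists every mount currently missing noexec:

```shell
# /proc/mounts fields: device, mountpoint, fstype, options, dump, passno
# Print every mount whose option string lacks noexec
awk '$4 !~ /(^|,)noexec(,|$)/ { print $2 }' /proc/mounts
```

    Expect / and most data mounts to show up; the point is to spot world-writable paths like /tmp appearing in the list.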

    relatime versus noatime is a performance decision. With strict atime semantics, every file read updates the access time (atime) on the inode, which turns every read into a write and can thrash your storage on read-heavy workloads. noatime disables atime updates entirely. relatime — the kernel's default since Linux 2.6.30 — only updates atime when it's older than mtime or ctime, which satisfies most applications that check atime while eliminating the worst-case write amplification. On busy log servers and database hosts, switching from strict atime to relatime or noatime can make a measurable difference in I/O utilization.

    # Hardened /tmp in fstab
    tmpfs  /tmp     tmpfs  rw,nosuid,nodev,noexec,relatime,size=2G,mode=1777  0 0
    tmpfs  /var/tmp tmpfs  rw,nosuid,nodev,noexec,relatime,size=1G,mode=1777  0 0
    
    # Verify options are applied after mount
    grep '/tmp' /proc/mounts

    Bind Mounts and Their Practical Uses

    A bind mount takes an existing directory (or file) in the tree and makes it appear at a second location simultaneously. Both paths point to the same underlying data — there's no copy. The kernel simply attaches the source's dentry tree at the target location. Bind mounts are powerful because they let you reshape the namespace without moving data on disk.

    The most common production use case I reach for is exposing a specific subdirectory into a chroot or container. If a service runs in a chroot at /srv/chroot/dns and it needs access to /etc/resolv.conf, you don't copy the file — you bind-mount it in. When the host updates /etc/resolv.conf, the chroot sees the change immediately because there's only one inode.

    # Bind-mount a single file into a chroot
    mount --bind /etc/resolv.conf /srv/chroot/dns/etc/resolv.conf
    
    # Make it read-only inside the chroot
    mount --bind /etc/resolv.conf /srv/chroot/dns/etc/resolv.conf
    mount -o remount,ro,bind /srv/chroot/dns/etc/resolv.conf
    
    # Bind-mount in /etc/fstab
    /etc/resolv.conf  /srv/chroot/dns/etc/resolv.conf  none  bind,ro  0 0

    Another case where bind mounts shine is testing. On sw-infrarunbook-01, when I need to test a new configuration directory structure without modifying the live path, I'll bind-mount the test directory over the live one for the duration of the test and unmount when done. No file moves, no risk of leaving behind a symlink.

    Special File Systems: proc, sysfs, devtmpfs, and tmpfs

    These aren't storage file systems — they're kernel interfaces that happen to speak the VFS protocol. proc exposes process information and kernel state. /proc/mounts is what the kernel actually has mounted, which makes it the authoritative source — more reliable than parsing /etc/fstab, which is just a wishlist. /proc/meminfo, /proc/cpuinfo, /proc/net/ — all dynamically generated from kernel data structures on every read. No disk I/O happens.
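    The generated-on-read behavior is visible with plain stat: most proc files report a size of zero because nothing is stored, yet reading them returns live data. A minimal check, assuming /proc is mounted:

```shell
# procfs files have no stored data, so stat reports 0 bytes...
stat -c '%s bytes according to stat: %n' /proc/meminfo
# ...but a read makes the kernel generate the content on the fly
head -n 1 /proc/meminfo
```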

    sysfs, mounted at /sys, exports kernel object hierarchies — devices, drivers, buses, power management. It's the kernel's preferred modern interface for device configuration. When you write to /sys/block/sda/queue/scheduler to change the I/O scheduler for a disk, you're triggering a kernel function through what looks like a file write. devtmpfs manages the device nodes under /dev dynamically, creating and removing nodes as devices appear and disappear.

    tmpfs is a real file system backed by virtual memory — RAM and swap. It performs extremely well because there's no disk, but its contents disappear on unmount. I use it for /tmp, /run, and, on systems with enough RAM, for scratch space in data pipelines where intermediate results don't need to survive a reboot. Always set a size limit on tmpfs mounts. The default limit is half of physical RAM, and I've seen servers run out of memory because an application hammered /tmp with temporary files and nothing enforced a ceiling.
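    Checking which tmpfs mounts actually carry a ceiling again only takes /proc/mounts — a size= option in field four means a limit was set. A sketch:

```shell
# List tmpfs mounts and flag those without an explicit size= cap
awk '$3 == "tmpfs" {
    capped = ($4 ~ /(^|,)size=/) ? "capped" : "UNCAPPED (defaults to half of RAM)"
    print $2, "->", capped
}' /proc/mounts
```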

    How Inodes Connect to All of This

    ext4 allocates a fixed inode table at format time; XFS allocates inodes dynamically. Running out of inodes is a different failure mode from running out of disk space, and it can be just as fatal: you'll see "No space left on device" even with gigabytes free on disk. This happens most commonly when an application creates enormous numbers of small files — mail spools, PHP session directories, package manager caches.

    # Check inode usage per mount point
    df -i
    
    # Filesystem      Inodes   IUsed   IFree IUse% Mounted on
    # /dev/sda1      3932160  312000 3620160    8% /
    # /dev/sdb1      6553600 6553590      10  100% /data
    
    # Find directories with massive inode consumption
    find /data -xdev -printf '%h\n' | sort | uniq -c | sort -rn | head -20

    Hard links are a direct consequence of inode architecture. A hard link is a dentry that points to an existing inode — a second name for the same underlying data. The inode has a link count field that tracks how many dentries reference it. The data is only freed when the link count drops to zero and no process has the file open. Symbolic links, by contrast, are their own inodes containing a path string. They can cross file system boundaries because they're just path pointers; hard links cannot, because dentries are only meaningful within a single superblock's namespace.
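    The link-count semantics are easy to exercise: remove the original name and the data survives through the remaining hard link, while a symlink to the deleted name dangles. A sketch in a scratch directory:

```shell
cd "$(mktemp -d)"
echo 'still here' > file
ln file hardlink                   # second dentry, same inode, link count 2
ln -s file symlink                 # separate inode holding the path string "file"
rm file                            # drops one dentry; link count falls to 1
cat hardlink                       # data intact: the inode still has a reference
cat symlink 2>/dev/null || echo 'symlink dangles: its target path is gone'
```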

    Common Misconceptions

    The biggest one I hear: "mounting replaces the directory." It doesn't. The directory still exists on the underlying file system — it's just hidden while the mount is active. Unmount the file system and the directory, including any files you might have accidentally left there before mounting, comes back. I've seen engineers create files in what they thought was a mounted directory, only to realize later they were writing to the hidden layer underneath. Always verify with findmnt or mount | grep target before writing to a path you expect to be mounted.

    Second misconception: "NFS mounts work like local mounts." They do at the VFS interface level, but the semantics differ in ways that will ruin your day. NFS has weaker consistency guarantees, atime behavior can differ based on server settings, file locking in NFSv3 runs through separate daemons (lockd and statd, registered via rpcbind), and a network partition will cause your mount to hang indefinitely by default unless you use the soft and timeo options — which introduce their own tradeoffs around silent data corruption on write failures. NFS deserves its own article, but know that mounting it and treating it as ext4 is a mistake.

    Third: "/etc/mtab is authoritative." On modern systems, /etc/mtab is a symlink to /proc/self/mounts, which is the kernel's own view of your mount table. This is actually correct behavior. On older systems where /etc/mtab was a real file, it could drift out of sync with actual mounts if a mount operation failed to update it cleanly — a particularly nasty failure mode after a crash. If you're troubleshooting mounts, always read /proc/mounts directly and treat it as ground truth.

    # The authoritative mount table
    cat /proc/mounts
    
    # Or with more human-readable output
    findmnt
    
    # Find what's mounted on a specific path
    findmnt /data
    
    # TARGET  SOURCE     FSTYPE  OPTIONS
    # /data   /dev/sdb1  xfs     rw,relatime,attr2,inode64,logbufs=8,noquota

    In my experience, engineers who invest the time to understand VFS, mount propagation, and inode semantics stop treating storage as a black box and start making better decisions — about partition layout, about mount options, about how containers interact with the host file system. It's foundational knowledge that pays dividends every time you're debugging a full disk, a permission problem, or an unexpected mount behavior in a containerized environment. Get comfortable with findmnt, understand what's in /proc/mounts, and never use bare device paths in fstab again.

    Frequently Asked Questions

    What is the difference between a file system and a mount point in Linux?

    A file system is a structured format for organizing and storing data — examples include ext4, XFS, and tmpfs. A mount point is a directory in the existing directory tree where a file system is attached. The file system provides the data; the mount point provides the location in the hierarchy where that data becomes accessible.

    Why should I use UUIDs instead of device names like /dev/sdb1 in /etc/fstab?

    Device names like /dev/sdb1 are assigned by the kernel at boot based on enumeration order, which is not guaranteed to be stable. If you add a disk or change hardware, a device that was /dev/sdb1 may become /dev/sdc1 at next boot. UUIDs are embedded in the file system superblock and remain constant regardless of device order, making them the reliable choice for persistent mount configuration.

    What is a bind mount and when would you use one?

    A bind mount makes an existing file or directory appear at a second location in the file system tree without copying data. Both paths reference the same underlying inode. Common uses include exposing host paths inside chroots or containers, making a specific directory accessible at a more convenient path, and remounting a directory read-only at an alternative location for access control purposes.

    Can you run out of inodes even with free disk space?

    Yes. ext4 allocates a fixed inode table at format time. If your workload creates millions of small files — mail spools, session files, package caches — the inode table can exhaust before disk space does. You'll receive a 'No space left on device' error even with gigabytes available. Use 'df -i' to check inode utilization per mount point. XFS avoids this by allocating inodes dynamically.

    What is the VFS layer in Linux and why does it matter?

    VFS (Virtual File System) is a kernel abstraction layer that provides a uniform interface between system calls like open(), read(), and write() and the concrete file system drivers beneath them. It's why you can interact with ext4 partitions, NFS shares, proc, tmpfs, and device files through the same POSIX API. Understanding VFS helps you reason about why everything from /proc to a network share behaves like a file in Linux.
