@oetiker
Last active February 6, 2026 10:13
lxcfs mount namespace helper for resource-aware applications

lxcfs-ns - Resource-Aware Process Isolation

A setuid helper that creates mount namespaces for lxcfs, enabling applications like Chrome and Firefox to see their actual cgroup resource limits instead of the full host resources.

The Problem

On shared multi-user systems (like ThinLinc terminal servers), you typically use systemd resource controls to limit per-user CPU and memory. However, applications read /proc/meminfo and /proc/cpuinfo directly, seeing the full host resources rather than their allocated limits.

This causes problems:

  • Chrome/Chromium sizes its renderer processes and caches based on "available" memory
  • Firefox does the same for content processes
  • Java applications auto-configure heap sizes based on visible RAM
  • Users see misleading values in htop, free, etc.

For a user limited to 8GB on a 256GB host, Chrome sizes itself against the full 256GB and tries to use far more memory than the cgroup allows, leading to OOM kills and poor performance.
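
To see this gap on a given system, the following sketch finds the tightest memory.max between a cgroup and the root of the tree, so it can be compared with the MemTotal line of /proc/meminfo. The function name effective_memory_max is hypothetical, and the sketch assumes a cgroup v2 (unified) hierarchy. It only illustrates the mismatch and is not how lxcfs itself is implemented.

```shell
#!/bin/bash
# Hypothetical helper, not part of lxcfs or lxcfs-ns.
# Print the smallest memory.max (in bytes) found between a cgroup and the
# root of a cgroup v2 tree, or "max" if no level sets a limit.
#   $1: cgroup v2 mount point (normally /sys/fs/cgroup)
#   $2: cgroup path relative to it (e.g. /user.slice/user-1001.slice)
effective_memory_max() {
    local root="$1" path="$2" best=max value
    while :; do
        if [ -r "$root$path/memory.max" ]; then
            value=$(cat "$root$path/memory.max")
            if [ "$value" != max ]; then
                if [ "$best" = max ] || [ "$value" -lt "$best" ]; then
                    best=$value
                fi
            fi
        fi
        [ -n "$path" ] || break   # the root of the tree was just checked
        path=${path%/*}           # move up one level
    done
    echo "$best"
}

# Example use for the current shell (cgroup v2 only):
#   effective_memory_max /sys/fs/cgroup "$(sed -n 's/^0:://p' /proc/self/cgroup)"
#   grep MemTotal /proc/meminfo
```

The walk up the tree matters because systemd applies the limit to the user slice, while the process itself usually sits in a session scope or service further down.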

The Solution

lxcfs is a FUSE filesystem that provides cgroup-aware versions of /proc files. lxcfs-ns creates a mount namespace where these virtual files replace the real /proc entries, making applications see accurate resource limits.

Prerequisites

1. Install lxcfs

# Debian/Ubuntu
sudo apt install lxcfs

# RHEL/Rocky/Alma 8+
sudo dnf install epel-release
sudo dnf install lxcfs

# Verify it's running
systemctl status lxcfs
ls /var/lib/lxcfs/proc/

2. Configure systemd User Resource Limits

Create a drop-in for user slices. Example for limiting all users:

# /etc/systemd/system/user-.slice.d/50-resource-limits.conf
[Slice]
MemoryMax=8G
MemoryHigh=7G
CPUQuota=400%
TasksMax=500

Or per-user limits:

# /etc/systemd/system/user-1001.slice.d/50-limits.conf
[Slice]
MemoryMax=16G
CPUQuota=800%

Apply changes:

sudo systemctl daemon-reload
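
When checking results later, it helps to know what the configured values amount to. The sketch below converts a systemd size such as MemoryMax=8G to bytes (systemd size suffixes are base 1024) and a CPUQuota percentage to a number of CPUs (100% is one full CPU). The function names size_to_bytes and quota_to_cpus are hypothetical helpers and are not part of systemd or lxcfs.

```shell
#!/bin/bash
# Hypothetical helpers for sanity-checking values from a [Slice] drop-in.

# Convert a systemd size such as 8G or 512M to bytes (suffixes are base 1024).
size_to_bytes() {
    local value="$1" number unit
    number=${value%[KMGT]}      # strip one trailing unit letter, if any
    unit=${value#"$number"}     # whatever was stripped
    case "$unit" in
        "") echo "$number" ;;
        K)  echo $(( number * 1024 )) ;;
        M)  echo $(( number * 1024 * 1024 )) ;;
        G)  echo $(( number * 1024 * 1024 * 1024 )) ;;
        T)  echo $(( number * 1024 * 1024 * 1024 * 1024 )) ;;
    esac
}

# Convert a CPUQuota percentage such as 400% to CPUs (100% = one CPU).
quota_to_cpus() {
    local percent="${1%\%}"
    printf '%d.%02d\n' $(( percent / 100 )) $(( percent % 100 ))
}
```

With the drop-in above, size_to_bytes 8G gives 8589934592 bytes, which is 8388608 kB in the units /proc/meminfo uses, and quota_to_cpus 400% gives 4.00.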

Building and Installing lxcfs-ns

# Compile
gcc -o lxcfs-ns lxcfs-ns.c

# Install with setuid
sudo install -o root -g root -m 4755 lxcfs-ns /usr/local/bin/

Usage

Basic Usage

# Run any command with accurate resource visibility
lxcfs-ns free -h
lxcfs-ns htop
lxcfs-ns cat /proc/meminfo

# Start a shell with accurate limits
lxcfs-ns bash

Running Browsers

# Chrome/Chromium
lxcfs-ns google-chrome
lxcfs-ns chromium-browser

# Firefox
lxcfs-ns firefox

ThinLinc Integration

ThinLinc is a remote desktop solution for multi-user Linux servers. Integrating lxcfs-ns ensures each user's desktop session sees accurate resource limits.

Method 1: Per-User ~/.thinlinc/xstartup (Simplest)

ThinLinc checks for ~/.thinlinc/xstartup before running the system default. Simply create this file:

mkdir -p ~/.thinlinc
cat > ~/.thinlinc/xstartup << 'EOF'
#!/bin/bash
# ~/.thinlinc/xstartup - session with memory-aware /proc via lxcfs
#
# All processes in this session will see cgroup memory limits
# instead of full system RAM in /proc/meminfo
#
# To disable: rename or delete this file
exec /usr/local/bin/lxcfs-ns /opt/thinlinc/etc/xstartup.default
EOF
chmod +x ~/.thinlinc/xstartup

Log out and back in - your entire ThinLinc session now sees accurate resource limits.

Method 2: System-Wide via xstartup.d

For all users, add a script to /opt/thinlinc/etc/xstartup.d/:

#!/bin/bash
# /opt/thinlinc/etc/xstartup.d/05-lxcfs-ns.sh
# Re-exec under lxcfs-ns if not already in namespace
if [ -z "$LXCFS_NS_ACTIVE" ]; then
    export LXCFS_NS_ACTIVE=1
    exec /usr/local/bin/lxcfs-ns "$0" "$@"
fi

Method 3: Modify the Default Session Command

In /opt/thinlinc/etc/conf.d/vsmagent.hconf, you can configure a custom session wrapper. Create the wrapper script first, at /usr/local/bin/thinlinc-session-wrapper:

#!/bin/bash
exec /usr/local/bin/lxcfs-ns /etc/X11/Xsession

Make it executable:

sudo chmod +x /usr/local/bin/thinlinc-session-wrapper

Then point the session command in vsmagent.hconf at the wrapper.

Method 4: Desktop File Wrappers for Specific Applications

If you only want browsers to see limits, create wrapper desktop files:

# /usr/local/share/applications/chrome-limited.desktop
[Desktop Entry]
Name=Google Chrome (Resource Limited)
Exec=/usr/local/bin/lxcfs-ns /usr/bin/google-chrome-stable %U
Type=Application
Icon=google-chrome
Categories=Network;WebBrowser;

Method 5: Shell Profile Integration

Add to /etc/profile.d/lxcfs-ns.sh:

# Automatically wrap interactive shells in lxcfs namespace
if [ -z "$LXCFS_NS_ACTIVE" ] && [ -x /usr/local/bin/lxcfs-ns ] && [ -d /var/lib/lxcfs/proc ]; then
    export LXCFS_NS_ACTIVE=1
    exec /usr/local/bin/lxcfs-ns "$SHELL" -l
fi

Verification

After setup, verify the namespace is working:

# Outside namespace (shows full host RAM)
cat /proc/meminfo | grep MemTotal

# Inside namespace (shows cgroup limit)
lxcfs-ns cat /proc/meminfo | grep MemTotal

# Check that the lxcfs bind mounts are in place
lxcfs-ns grep lxcfs /proc/self/mountinfo
# Should list /proc/meminfo, /proc/cpuinfo, etc. as mount points

Complete Example Setup

# 1. Install lxcfs
sudo apt install lxcfs
sudo systemctl enable --now lxcfs

# 2. Set up per-user memory limits (8GB per user)
sudo mkdir -p /etc/systemd/system/user-.slice.d
cat << 'EOF' | sudo tee /etc/systemd/system/user-.slice.d/50-resource-limits.conf
[Slice]
MemoryMax=8G
MemoryHigh=7G
CPUQuota=400%
EOF
sudo systemctl daemon-reload

# 3. Build and install lxcfs-ns
gcc -o lxcfs-ns lxcfs-ns.c
sudo install -o root -g root -m 4755 lxcfs-ns /usr/local/bin/

# 4. Test
lxcfs-ns free -h
# Should show ~8GB total instead of host RAM

How It Works

  1. lxcfs runs as a FUSE filesystem, providing virtual /proc files at /var/lib/lxcfs/proc/ that read cgroup limits
  2. lxcfs-ns is a setuid binary that:
    • Creates a new mount namespace (requires root, hence setuid)
    • Bind-mounts lxcfs files over real /proc entries
    • Drops privileges back to the calling user
    • Executes the requested command
  3. Processes in this namespace see cgroup-accurate values in /proc/meminfo, /proc/cpuinfo, etc.
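
To preview which bind mounts step 2 would perform on a particular host, the following dry-run sketch mirrors the helper's mount loop: it walks the same list of /proc files and skips any that lxcfs does not provide. The function name plan_lxcfs_mounts is hypothetical, and the sketch only prints commands. It does not create a namespace, mount anything, or drop privileges.

```shell
#!/bin/bash
# Hypothetical dry run, not part of lxcfs-ns: print the bind mounts that
# would be attempted for a given lxcfs proc directory.
#   $1: directory holding the lxcfs proc files (normally /var/lib/lxcfs/proc)
plan_lxcfs_mounts() {
    local source_dir="$1" name
    for name in meminfo cpuinfo stat uptime loadavg diskstats swaps; do
        # Same rule as the helper: a missing source is skipped, not an error.
        if [ -e "$source_dir/$name" ]; then
            echo "mount --bind $source_dir/$name /proc/$name"
        fi
    done
}

# Example use:
#   plan_lxcfs_mounts /var/lib/lxcfs/proc
```

If the function prints nothing, lxcfs-ns would find no usable sources and exit with an error instead of running the command.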

Security Notes

  • The binary is setuid root but drops privileges as soon as the namespace and bind mounts are set up, before it executes the requested command
  • No user namespace is created, so setuid binaries (sudo, etc.) continue to work
  • Podman/Docker work normally as they create their own namespaces
  • Mount propagation is set to "slave" so host mount changes propagate in, but namespace changes don't leak out

Troubleshooting

lxcfs not available: Ensure lxcfs is installed and running:

sudo systemctl status lxcfs
ls /var/lib/lxcfs/proc/

Permission denied: Ensure lxcfs-ns has setuid bit:

ls -la /usr/local/bin/lxcfs-ns
# Should show: -rwsr-xr-x root root

Values still show host resources: Check you're actually in the namespace:

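grep lxcfs /proc/self/mountinfo
# Inside the namespace, this lists /proc/meminfo, /proc/cpuinfo, etc. as mount points.
# Outside, only the lxcfs mount itself (under /var/lib/lxcfs) appears.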
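
For a check that can be scripted, the sketch below parses a mountinfo file and prints only the /proc entries that are covered by an lxcfs mount. The function name lxcfs_overlays is hypothetical, and the sketch assumes lxcfs mounts report the filesystem type fuse.lxcfs. Adjust the match if your system shows a different type.

```shell
#!/bin/bash
# Hypothetical helper: list /proc entries overlaid by an lxcfs mount.
#   $1: mountinfo file to inspect (defaults to /proc/self/mountinfo)
# Assumes lxcfs mounts report the filesystem type "fuse.lxcfs".
lxcfs_overlays() {
    awk '{
        # Field 5 is the mount point. The optional fields end at a lone "-",
        # and the filesystem type is the field right after it.
        for (i = 7; i <= NF; i++) if ($i == "-") break
        if ($(i + 1) == "fuse.lxcfs" && $5 ~ /^\/proc\//) print $5
    }' "${1:-/proc/self/mountinfo}"
}

# Example use:
#   lxcfs_overlays                      # current process
#   lxcfs-ns cat /proc/self/mountinfo > /tmp/inside && lxcfs_overlays /tmp/inside
```

Empty output means the current process is not in the namespace, which usually indicates that the command was started without the lxcfs-ns wrapper.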

License

Public Domain / CC0

/*
 * lxcfs-ns.c - Setuid helper to create mount namespace for lxcfs
 *
 * This version does NOT create a user namespace, so setuid/setgid binaries
 * (sudo, unix_chkpwd, etc.) continue to work normally.
 *
 * The setuid helper creates the mount namespace as root, performs the
 * bind mounts, then drops privileges before exec'ing the requested command.
 *
 * Podman/Docker should still work because they create their OWN user
 * namespaces when running containers.
 *
 * Compile: gcc -o lxcfs-ns lxcfs-ns.c
 * Install: sudo chown root:root lxcfs-ns && sudo chmod 4755 lxcfs-ns
 *
 * Usage: lxcfs-ns <command> [args...]
 */
#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/mount.h>
#include <sys/stat.h>
#include <sys/wait.h>
#include <errno.h>
#include <fcntl.h>
#include <grp.h>
#include <pwd.h>

/* Each entry: lxcfs source file, /proc target it is bind-mounted over */
static const char *lxcfs_mounts[][2] = {
    {"/var/lib/lxcfs/proc/meminfo", "/proc/meminfo"},
    {"/var/lib/lxcfs/proc/cpuinfo", "/proc/cpuinfo"},
    {"/var/lib/lxcfs/proc/stat", "/proc/stat"},
    {"/var/lib/lxcfs/proc/uptime", "/proc/uptime"},
    {"/var/lib/lxcfs/proc/loadavg", "/proc/loadavg"},
    {"/var/lib/lxcfs/proc/diskstats", "/proc/diskstats"},
    {"/var/lib/lxcfs/proc/swaps", "/proc/swaps"},
    {NULL, NULL}
};

int main(int argc, char *argv[]) {
    /* Remember the calling user before doing anything privileged */
    uid_t real_uid = getuid();
    gid_t real_gid = getgid();

    if (argc < 2) {
        fprintf(stderr, "Usage: %s <command> [args...]\n", argv[0]);
        return 1;
    }

    /* Check lxcfs is available */
    struct stat st;
    if (stat("/var/lib/lxcfs/proc/meminfo", &st) != 0) {
        fprintf(stderr, "lxcfs not available at /var/lib/lxcfs\n");
        return 1;
    }

    /* Create mount namespace (we're running as root via setuid) */
    if (unshare(CLONE_NEWNS) != 0) {
        perror("unshare(CLONE_NEWNS)");
        return 1;
    }

    /* Make all mounts slave so our changes don't propagate out,
     * but parent mounts can still propagate in (needed for podman) */
    if (mount(NULL, "/", NULL, MS_REC | MS_SLAVE, NULL) != 0) {
        perror("mount(/, MS_SLAVE)");
        return 1;
    }

    /* Perform bind mounts - we have root privileges */
    int mount_count = 0;
    for (int i = 0; lxcfs_mounts[i][0] != NULL; i++) {
        /* Skip sources that this lxcfs version does not provide */
        if (stat(lxcfs_mounts[i][0], &st) != 0)
            continue;
        if (mount(lxcfs_mounts[i][0], lxcfs_mounts[i][1], NULL, MS_BIND, NULL) == 0) {
            mount_count++;
        } else {
            fprintf(stderr, "warning: mount(%s -> %s): %s\n",
                    lxcfs_mounts[i][0], lxcfs_mounts[i][1], strerror(errno));
        }
    }
    if (mount_count == 0) {
        fprintf(stderr, "error: no lxcfs mounts succeeded\n");
        return 1;
    }

    /* Drop privileges back to the real user.
     * Restore the user's supplementary groups (kvm, docker, etc.) first,
     * while we still have the privilege to call initgroups(). */
    struct passwd *pw = getpwuid(real_uid);
    if (pw != NULL && initgroups(pw->pw_name, real_gid) != 0) {
        perror("initgroups");
        return 1;
    }
    if (setgid(real_gid) != 0) {
        perror("setgid");
        return 1;
    }
    if (setuid(real_uid) != 0) {
        perror("setuid");
        return 1;
    }

    /* Verify we dropped privileges: the effective IDs must match the
     * caller's, and a non-root caller must not be able to regain root.
     * (Comparing against 0 alone would wrongly reject a root caller.) */
    if (geteuid() != real_uid || getegid() != real_gid ||
        (real_uid != 0 && setuid(0) == 0)) {
        fprintf(stderr, "error: failed to drop privileges\n");
        return 1;
    }

    /* Execute the requested command as the unprivileged user */
    execvp(argv[1], &argv[1]);
    perror("execvp");
    return 127;
}