
🐧

Linux Distributions

The major distros — history, philosophy, and what makes each unique.

Linus Torvalds — Creator of the Linux kernel & Git

Linus Torvalds

Born December 28, 1969 · Helsinki, Finland

Linus Benedict Torvalds is one of the most consequential engineers in the history of computing. At 21 years old, while studying at the University of Helsinki, he wrote the first version of the Linux kernel in his bedroom — and casually announced it on the comp.os.minix Usenet group as "just a hobby, won't be big and professional like gnu."

That hobby now runs the large majority of the world's top web servers, all 500 of the TOP500 supercomputers, the New York Stock Exchange, and over 3 billion Android devices, and it has flown on Mars aboard NASA's Ingenuity helicopter. Linux is the most widely deployed operating system kernel in history and one of the largest collaborative software projects ever created.

In 2005, after the free-of-charge license for the proprietary BitKeeper VCS was withdrawn, Torvalds wrote the first version of Git in roughly two weeks. Git became the world's dominant version control system, used by the vast majority of developers, often through hosting services such as GitHub, GitLab, and Bitbucket.

Known for his blunt communication style, Torvalds famously called bad code "complete and utter garbage," told NVIDIA they were "the single worst company" he'd ever dealt with (on camera), and has maintained the kernel's technical integrity through 30+ years of relentless contribution — sometimes accepting patches at 3am, sometimes rejecting them with colorful language that became internet legend.

In 2018 he took a rare break to work on his communication style. He returned. The kernel kept shipping. The man is, by any measure, an engineering legend.

Key Figures & Moments

Timeline of Linux History

1983

Richard Stallman launches the GNU Project

Stallman announces a mission to build a completely free, Unix-compatible operating system. He writes Emacs, GCC, and GDB, and other GNU contributors add glibc, bash, and dozens of essential tools, but the kernel remains elusive for nearly a decade.

1991 · August 25

The Usenet post that changed computing forever

Linus posts to comp.os.minix: "I'm doing a (free) operating system (just a hobby, won't be big and professional like gnu) for 386(486) AT clones." Linux 0.01 follows in September with 10,239 lines of code and support for a single CPU.

1992

Linux adopts the GPL — freedom locked in forever

Torvalds re-licenses Linux under the GNU General Public License (GPLv2). The GNU userland + Linux kernel finally forms a complete, fully free OS. The two projects that would change the world are united.

1994 · March 14

Linux 1.0 — the first stable release

Three years of rapid development culminate in 1.0 with 176,250 lines of code. It supports TCP/IP networking, runs on x86, and has grown from a one-man project to dozens of contributors. Red Hat Linux ships its first release the same year.

1996

Tux the penguin becomes the Linux mascot

Larry Ewing creates the iconic Tux penguin using GIMP. Torvalds had been bitten by a penguin at a zoo in Canberra and liked the image of a "fat, happy penguin sitting down after eating a lot of fish." Tux becomes one of tech's most enduring logos.

1998

The Halloween Documents — Microsoft is scared

Leaked Microsoft internal memos acknowledge Linux as a serious competitive threat and outline tactics to combat open source. Two years later, in 2000, IBM announces a $1 billion Linux investment, and enterprise Linux takes off.

2003

Linux 2.6 — the modern kernel era begins

Kernel 2.6 delivers the O(1) scheduler, NPTL threading, SELinux, and dramatically improved SMP scaling, with inotify following in 2.6.13. This is the release series that powers Linux's conquest of the data center, embedding itself in everything from routers to supercomputers.

2005 · April

Linus writes Git in two weeks out of spite

After the free BitKeeper license for kernel developers is withdrawn, Torvalds writes an entirely new distributed version control system, with a first working version in ~10 days. Git is designed around his frustrations with existing tools: fast, distributed, content-addressable. It becomes the foundation of modern software development.

2007

Android launches — Linux enters every pocket

Google announces Android, built on the Linux kernel, and the first phones ship in 2008. Within about five years of launch it captures roughly 80% of the global smartphone market. Billions of people interact with Linux daily without knowing it. Torvalds publicly notes he is "not a huge Android fan" despite this.

2011

Linux turns 20 — 15 million lines of code

The kernel celebrates its 20th anniversary. Over 1,000 developers from hundreds of companies contribute to each major release. The development pace: roughly 7 patches merged per hour, 24 hours a day, every day of the year.

2019

Linux captures the first image of a black hole

The Event Horizon Telescope uses Linux-based systems across eight observatories to process petabytes of data, producing humanity's first direct image of a black hole (M87*). Linux underlies much of modern scientific computing — from CERN's LHC to NASA missions.

2022

Rust joins the kernel — the first new language in 30 years

Linux 6.1 introduces Rust as a second supported language for kernel development, alongside C. It's the most significant language addition in the kernel's history, aimed at eliminating entire classes of memory-safety bugs that have plagued system software for decades.

2024

40,000+ contributors · Linux is everywhere

The Linux kernel has received contributions from over 40,000 individual developers across its lifetime. It runs all 500 of the TOP500 supercomputers, 90%+ of cloud infrastructure, the NYSE, every Android phone, and the server powering this very page, and it flew on Mars aboard NASA's Ingenuity helicopter.


⚙️

The Linux Kernel

Monolithic, modular, and relentlessly optimized — the core of every Linux system.

Architecture

Monolithic Design

The kernel runs in a single privileged address space (ring 0 on x86). All subsystems (scheduling, memory, drivers, filesystems) share the same memory and call each other directly as ordinary functions, with no message passing or context switches between components. This is a major reason Linux performs well.

Loadable Kernel Modules

Despite being monolithic, Linux supports dynamically loading and unloading modules (.ko files) at runtime. Device drivers, filesystems, and network protocols can be added without rebooting. Manage them with lsmod, insmod, modprobe, and rmmod.

Preemptive Multitasking

The kernel can preempt even kernel-mode code (with CONFIG_PREEMPT). PREEMPT_RT patches push this further, turning Linux into a real-time OS used in industrial control systems, audio production, and robotics.

SMP & NUMA

Linux scales from a single-core embedded CPU to machines with thousands of cores. NUMA-aware memory allocation, per-CPU data structures, and lock-free algorithms make it competitive on the largest x86, ARM, and RISC-V machines.

Rust in the Kernel

Since Linux 6.1 (2022), Rust is supported as a second language alongside C. Rust's ownership model rules out whole classes of memory-safety bugs in safe code. New drivers and supporting abstractions are gradually being written in Rust.

System Calls

User space communicates with the kernel via ~400 system calls, such as read, write, mmap, clone, execve, and ioctl. The ABI is stable: a statically linked binary from 1995 can still run on Linux 6.x.

Subsystem Map

Layer	Responsibility	Source Dir
Process Scheduler	CFS, RT, and deadline scheduling; CPU affinity; cgroups integration	`kernel/sched/`
Memory Manager	Virtual memory, paging, slab allocator, OOM killer, huge pages	`mm/`
VFS	Unified filesystem interface; supports ext4, xfs, btrfs, tmpfs, and 60+ others	`fs/`
Network Stack	TCP/IP, sockets, netfilter, XDP, eBPF integration	`net/`
Device Drivers	Block, char, PCI, USB, GPU, network, HID	`drivers/`
IPC	Signals, pipes, UNIX sockets, shared memory, futexes, io_uring	`ipc/`
Security	LSM framework: SELinux, AppArmor, seccomp, capabilities	`security/`
Architecture	Platform-specific boot, syscall entry, interrupt handling	`arch/`

Building the Kernel

Configuration

make menuconfig — interactive TUI config
make defconfig — safe defaults for your arch
make localmodconfig — trim to only loaded modules
Config stored in .config; ~10,000 options

Compilation

make -j$(nproc) — parallel build
Produces vmlinux and a compressed boot image (bzImage on x86), installed as vmlinuz
make modules_install — install .ko files
make install — copy to /boot, update grub

Release Cadence

New major release every ~10 weeks
LTS releases maintained 2–6 years (e.g., 5.15, 6.1, 6.6)
~13,000–15,000 commits per release from ~2,000 developers
Greg Kroah-Hartman manages stable / LTS branches

🔩

Kernel Subsystems

The major functional areas inside the Linux kernel and how they interact.

Process Scheduler (CFS)

kernel/sched/

The Completely Fair Scheduler uses a red-black tree ordered by virtual runtime. Each task accumulates vruntime; the task with the lowest vruntime runs next. It targets O(log n) scheduling decisions and adapts to CPU topology and cgroup hierarchies. Since Linux 6.6, CFS has been superseded by the EEVDF scheduler, which builds on the same virtual-runtime accounting.

SCHED_NORMAL — general interactive tasks
SCHED_FIFO / SCHED_RR — real-time policies
SCHED_DEADLINE — EDF for hard real-time
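
The pick-the-lowest-vruntime rule can be sketched as a toy model. This is not kernel code: Python's heapq stands in for the red-black tree, and the Task fields, the fixed time slice, and the task names are assumptions made for illustration.

```python
import heapq
from dataclasses import dataclass, field

NICE_0_WEIGHT = 1024  # load weight of a nice-0 task


@dataclass(order=True)
class Task:
    vruntime: float                       # the only field used for ordering
    name: str = field(compare=False)
    weight: int = field(default=NICE_0_WEIGHT, compare=False)


def simulate(tasks, slice_ms, ticks):
    """Make `ticks` scheduling decisions; return the names in pick order."""
    queue = list(tasks)
    heapq.heapify(queue)                  # min-heap keyed on vruntime
    order = []
    for _ in range(ticks):
        task = heapq.heappop(queue)       # lowest vruntime runs next
        order.append(task.name)
        # Heavier tasks accumulate vruntime more slowly, so they run more often.
        task.vruntime += slice_ms * NICE_0_WEIGHT / task.weight
        heapq.heappush(queue, task)
    return order


picked = simulate([Task(0.0, "editor", weight=2048), Task(0.0, "backup")],
                  slice_ms=4, ticks=6)
print(picked)
```

With twice the weight, the hypothetical "editor" task ends up with twice as many slices as "backup" over the six decisions.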

Memory Management

mm/

Virtual memory is mapped through a 4–5 level page table hierarchy. The slab/slub allocator handles kernel object allocation. The OOM killer terminates processes when physical memory is exhausted.

Huge pages (2 MB / 1 GB) reduce TLB pressure
KSM (Kernel Samepage Merging) deduplicates RAM
ZRAM / ZSWAP — compressed swap in RAM
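
To make the page-table hierarchy concrete, the sketch below splits a virtual address the way 4-level paging on x86-64 does: four 9-bit table indices plus a 12-bit page offset. The function name and the example address are illustrative, and the layout assumes x86-64 with 4 KiB pages.

```python
PAGE_SHIFT = 12        # 4 KiB pages: the low 12 bits are the byte offset
BITS_PER_LEVEL = 9     # each table holds 512 entries
LEVELS = ("pgd", "pud", "pmd", "pte")   # top level first


def split_virtual_address(vaddr):
    """Return the per-level table indices and page offset of a 48-bit address."""
    parts = {"offset": vaddr & ((1 << PAGE_SHIFT) - 1)}
    for depth, level in enumerate(reversed(LEVELS)):   # pte is the lowest level
        shift = PAGE_SHIFT + depth * BITS_PER_LEVEL
        parts[level] = (vaddr >> shift) & ((1 << BITS_PER_LEVEL) - 1)
    return parts


print(split_virtual_address(0x7F5A_1C2B_3D4E))

# A 2 MiB huge page is mapped at the pmd level, so one TLB entry covers
# the range that would otherwise need this many 4 KiB entries.
PTES_PER_HUGE_PAGE = (2 * 1024 * 1024) // (4 * 1024)   # 512
```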

Virtual Filesystem (VFS)

fs/

VFS provides a uniform interface (open/read/write/close) over all filesystems. Every filesystem implements a set of function-pointer tables (super_operations, inode_operations, file_operations). This is why you can cat a file on ext4, btrfs, or NFS with the same syscall.

ext4 — journaling, extents, most common
XFS — high-performance, large files
Btrfs — COW, snapshots, RAID
tmpfs — memory-backed (/tmp, /dev/shm)

Block I/O Layer

block/

All storage requests pass through the block layer, which merges and reorders I/O for efficiency. The multi-queue block layer (blk-mq) enables NVMe drives to fully exploit hundreds of hardware queues in parallel.

I/O schedulers: mq-deadline, kyber, bfq, none
Device mapper: LVM, dm-crypt, dm-multipath
MD RAID: software RAID 0/1/5/6/10

IPC & Synchronization

ipc/ · kernel/futex/

Linux provides multiple IPC mechanisms. Futexes (fast userspace mutexes) are the foundation of pthreads locking. io_uring (5.1+) is the modern async I/O interface; its shared submission and completion rings allow millions of IOPS with very few system calls.

Pipes, FIFOs, UNIX domain sockets
POSIX shared memory, message queues, semaphores
Signals (kill, sigaction, signalfd)
io_uring — shared ring buffers, async, low-latency I/O

eBPF

kernel/bpf/

Extended Berkeley Packet Filter lets you run sandboxed programs inside the kernel without modifying kernel source or loading modules. Used by Cilium (Kubernetes networking), bcc/bpftrace (tracing), XDP (10M+ pps packet processing), and Meta/Google for production profiling.

Hook points: kprobes, tracepoints, sockets, XDP
JIT-compiled to native machine code
Verified by in-kernel verifier before loading

Subsystem Interactions

Call	Subsystems involved
`execve("./app")`	VFS (open binary), MM (mmap segments), Scheduler (create task), Security (LSM check)
`read(fd, buf, n)`	VFS → FS driver → Block layer → I/O scheduler → Driver → DMA
`malloc()`	libc → `brk()` or `mmap()` → MM (page fault → physical frame)
`send(sock, …)`	Socket → Network stack (TCP/IP) → netfilter → NIC driver → DMA
`fork()`	MM (COW page tables), Scheduler (new task), FD table copy, cgroup accounting

💡

Linux Concepts

The fundamental ideas every Linux user and developer needs to understand.

Everything is a File

Linux exposes hardware, processes, sockets, and kernel state as files in the filesystem. /proc/cpuinfo reads CPU info. /dev/sda is a disk. /sys/class/net/eth0/speed is a network interface property. This uniformity is why shell pipelines are so powerful.
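
Because kernel state is exposed as ordinary files, plain file I/O is enough to inspect it. The sketch below parses the "Key: value" text format used by /proc/<pid>/status; the function names are made up for this example, and the final read only runs where /proc exists.

```python
from pathlib import Path


def parse_key_value(text):
    """Parse 'Key:<tab>value' lines, the format used by /proc/<pid>/status."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def read_proc_status(pid="self"):
    """Read a process's status through the ordinary file API (Linux only)."""
    path = Path("/proc") / str(pid) / "status"
    return parse_key_value(path.read_text())


if Path("/proc/self/status").exists():
    status = read_proc_status()
    print(status["Name"], status["Pid"], status.get("VmRSS", "n/a"))
```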

Processes & Threads

Linux uses a single task_struct for both. Threads are processes that share memory (CLONE_VM). fork() creates a copy-on-write clone. exec() replaces the process image. Every task has a PID, PPID, UID, GID, and credential set.
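
A small experiment shows the effect of fork(): parent and child start from the same memory image, but a write in the child is not visible to the parent. The sketch assumes a POSIX system; the helper name and the pipe used to report the child's value are illustrative choices.

```python
import os


def fork_and_modify():
    """Fork, let the child change a variable, and report what each side sees."""
    counter = [100]
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:                       # child: works on its own copy of memory
        os.close(read_fd)
        counter[0] += 1
        os.write(write_fd, str(counter[0]).encode())
        os._exit(0)                    # leave immediately, skipping parent cleanup
    os.close(write_fd)                 # parent keeps only the read end
    child_value = int(os.read(read_fd, 32))
    os.close(read_fd)
    os.waitpid(pid, 0)                 # reap the child so it is not left a zombie
    return counter[0], child_value


parent_value, child_value = fork_and_modify()
print(parent_value, child_value)
```

The parent still sees 100 while the child reports 101, because the child's write landed on its own copy of the page.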

File Descriptors

Every open file, socket, pipe, or device is an integer FD in the process's FD table. 0=stdin, 1=stdout, 2=stderr. FDs are inherited on fork, closed on exec (if O_CLOEXEC), and duped with dup2().
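
The descriptor rules above can be observed from user space. This is a minimal sketch using Python's os module, which creates descriptors with close-on-exec set by default; the function name and the message written are assumptions for the example.

```python
import os


def fd_demo():
    """Show dup2() aliasing and the close-on-exec flag on a pipe."""
    read_fd, write_fd = os.pipe()
    # Python marks new descriptors non-inheritable, i.e. close-on-exec is set.
    cloexec_by_default = not os.get_inheritable(write_fd)

    # dup2() makes a second descriptor number refer to the same open pipe.
    spare_fd = os.dup(write_fd)                     # reserve a free fd number
    os.dup2(write_fd, spare_fd, inheritable=True)   # this copy would survive exec
    survives_exec = os.get_inheritable(spare_fd)

    os.write(spare_fd, b"via dup")     # write through the duplicate...
    data = os.read(read_fd, 16)        # ...and read it back from the same pipe
    for fd in (read_fd, write_fd, spare_fd):
        os.close(fd)
    return cloexec_by_default, survives_exec, data


print(fd_demo())
```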

Permissions & Capabilities

Classic Unix: owner/group/other × read/write/execute. SUID/SGID bits elevate privilege on exec. Linux capabilities split root privilege into ~40 fine-grained rights: CAP_NET_ADMIN, CAP_SYS_PTRACE, CAP_CHOWN, etc.
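
Both permission models are bitmasks underneath. The sketch below decodes a file mode with the standard stat module and decodes a capability mask of the kind shown in the CapEff field of /proc/<pid>/status. Only the three capabilities named above are listed, with bit positions taken from <linux/capability.h>; the helper names and sample values are illustrative.

```python
import stat

# Bit positions from <linux/capability.h> for the capabilities named above.
CAP_BITS = {0: "CAP_CHOWN", 12: "CAP_NET_ADMIN", 19: "CAP_SYS_PTRACE"}


def describe_mode(mode):
    """Render st_mode as an ls-style string and flag the SUID/SGID bits."""
    return stat.filemode(mode), bool(mode & stat.S_ISUID), bool(mode & stat.S_ISGID)


def decode_caps(hex_mask):
    """Decode a CapEff-style hexadecimal capability mask."""
    mask = int(hex_mask, 16)
    return [name for bit, name in sorted(CAP_BITS.items()) if mask & (1 << bit)]


print(describe_mode(0o104755))            # regular file, SUID set, rwxr-xr-x
print(decode_caps("0000000000001001"))    # bits 0 and 12
```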

Signals

Asynchronous notifications sent between processes or from the kernel. SIGKILL (9) cannot be caught or ignored. SIGTERM (15) requests graceful shutdown. SIGSEGV signals invalid memory access. SIGCHLD notifies parents of child state changes.
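
The difference between SIGTERM and SIGKILL is easy to check from a program. This sketch installs a SIGTERM handler, sends the signal to its own process, and then tries to install a handler for SIGKILL, which the kernel refuses. It assumes a POSIX system and that the code runs in the main thread; the handler and helper names are made up.

```python
import os
import signal

received = []


def on_sigterm(signum, frame):
    """Record the signal instead of exiting, as a daemon might before cleanup."""
    received.append(signal.Signals(signum).name)


def sigkill_is_catchable():
    """Try to install a SIGKILL handler; report whether it was accepted."""
    try:
        signal.signal(signal.SIGKILL, on_sigterm)
    except (OSError, ValueError):
        return False
    return True


previous = signal.signal(signal.SIGTERM, on_sigterm)      # main thread only
os.kill(os.getpid(), signal.SIGTERM)                      # send ourselves SIGTERM
signal.signal(signal.SIGTERM, previous or signal.SIG_DFL)  # restore old behaviour
print(received, sigkill_is_catchable())
```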

User Space vs Kernel Space

Processes run in ring 3 (user space) with restricted access. System calls cross into ring 0 (kernel space) via a software interrupt or SYSCALL instruction. This boundary is why batching syscalls (io_uring) and avoiding them (vDSO for gettimeofday) matters at scale.

Namespaces

Linux namespaces partition global resources. A process can have its own view of PIDs, network interfaces, mounts, users, hostnames, and cgroups. Containers are just processes in isolated namespaces — there's no magic, just clone() flags.

cgroups

Control groups limit and account for resource usage: CPU time, RAM, disk I/O, and number of processes; network traffic can be tagged per group for shaping with tc. systemd uses cgroups v2 to track every service. Docker and Kubernetes use them to enforce container resource limits.

The Boot Process

UEFI/BIOS → finds boot device
GRUB (bootloader) → loads kernel + initramfs
Kernel → mounts initramfs, runs init
systemd (PID 1) → activates units in parallel
Targets: sysinit → basic → multi-user → graphical

Important Files to Know

Path	Contents
`/proc/<pid>/maps`	Virtual memory layout of a process
`/proc/<pid>/fd/`	Symlinks to every open file descriptor
`/proc/sys/`	Tunable kernel parameters (sysctl)
`/sys/block/`	Block device attributes and queues
`/etc/passwd`, `/etc/shadow`	User accounts and hashed passwords
`/etc/fstab`	Filesystem mount table
`/var/log/`	System and application logs
`/boot/vmlinuz`	Compressed kernel image

GNU

The GNU Project — the free software ecosystem that became the foundation of Linux.

Key People & Projects

The GNU Project (1983)

Richard Stallman announced GNU on September 27, 1983, with a goal to build a completely free, Unix-compatible operating system. GNU stands for "GNU's Not Unix" — a recursive acronym. By the early 1990s, GNU had a complete userland but no kernel. Linux filled that gap.

Richard Stallman (RMS)

Stallman was an MIT AI Lab hacker who quit to pursue software freedom full-time. He authored GCC, Emacs, the GPL, and the GNU Manifesto. His philosophy — that software must be free to run, study, modify, and redistribute — became the moral backbone of open source.

The Free Software Foundation

Stallman founded the FSF in 1985 to support and fund GNU development. The FSF holds copyright on most GNU software, enforces GPL compliance, and maintains the Free Software Definition. It remains the ideological anchor of the free software movement today.

GPL — General Public License

The GPL is a copyleft license: any derivative work that is distributed must also be licensed under the GPL. This "viral" property ensures that freedom propagates. GPLv2 governs the Linux kernel. GPLv3 (2007) added protections against Tivoization and an explicit patent grant.

GCC — GNU Compiler Collection

GCC started as a C compiler in 1987 and grew into a full compiler suite: C, C++, Fortran, Ada, Go, Rust (via GCC front-end), and more. It was the first serious free compiler and remains a cornerstone of Linux system builds. A typical invocation: gcc -O2 -Wall -o prog prog.c

GNU Coreutils

The 100+ command-line utilities that form the Unix userland: ls, cp, mv, rm, cat, echo, sort, wc, head, tail, cut, tr, and dozens more. Every Linux system runs these. GNU versions add features beyond the POSIX minimum.

GNU Bash

The Bourne Again SHell — GNU's replacement for the original Bourne shell. Bash adds arrays, arithmetic, readline editing, history, job control, and process substitution. It's the default shell on almost every Linux distribution and the lingua franca of system scripting.

GNU Emacs

Stallman's original GNU program — an infinitely extensible text editor built on a Lisp interpreter. Emacs is simultaneously an editor, IDE, email client, file manager, and operating environment. The Emacs vs vi holy war has raged for 40 years without a winner.

glibc — GNU C Library

The C standard library for Linux. Nearly every C program on a mainstream distribution links against glibc for printf, malloc, fopen, POSIX threads, and the system call wrappers. Its ABI stability is legendary: programs compiled against glibc from 2001 still run on modern Linux.

GNU Make

The build automation tool that orchestrates compilation via Makefile rules. The Linux kernel itself is built with GNU Make. Make's dependency tracking ensures only changed files are recompiled. make -j$(nproc) parallelizes across all CPU cores.

GNU Binutils

The toolchain plumbing: ld (linker), as (assembler), objdump, nm, strip, ar, readelf. These tools link object files into executables, inspect binaries, and manage static libraries. Every compiled program on Linux passed through binutils.

GNU/Linux — the Naming Debate

Stallman insists the OS should be called "GNU/Linux" — Linux is only the kernel; the rest is GNU. Torvalds says "Linux" is fine. Most of the world says "Linux." The debate has been running since 1994. Both sides have a point; neither will budge. The OS ships either way.

Core GNU Tools at a Glance

Tool	Purpose	Key command
`gcc`	C/C++ compiler	`gcc -O2 -o prog prog.c`
`bash`	Shell / scripting	`bash -c 'echo hello'`
`make`	Build automation	`make -j$(nproc)`
`ld`	Linker	`ld -o prog main.o lib.a`
`gdb`	Debugger	`gdb ./prog core`
`grep`	Pattern search	`grep -rn 'TODO' src/`
`sed`	Stream editor	`sed 's/foo/bar/g' file`
`awk`	Text processing	`awk '{print $1}' log`
`tar`	Archiving	`tar -czf out.tar.gz dir/`
`emacs`	Editor / Lisp IDE	`emacs -nw file.txt`

🔐

SELinux

Security-Enhanced Linux — mandatory access control baked into the kernel.

What Is SELinux?

SELinux is a Mandatory Access Control (MAC) system implemented as a Linux Security Module (LSM). It was originally developed by the NSA and released in 2000. Every process and file has a security context; policy rules define which contexts can interact. If a rule doesn't explicitly allow an action, it's denied.

DAC vs MAC

Traditional Unix uses Discretionary Access Control (DAC) — the file owner controls permissions. With MAC (SELinux), even root can be blocked if the policy doesn't allow it. A compromised nginx process can't read /etc/shadow even as root — the kernel enforces the policy.

Security Contexts

Every object has a label: user:role:type:level. Example: system_u:system_r:httpd_t:s0 — the Apache process. Files served by Apache might be httpd_sys_content_t. Policy says httpd_t can read httpd_sys_content_t. Wrong label? Access denied.
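
The label format and the default-deny check can be modelled in a few lines. This is a toy sketch, not how the kernel stores policy: the ALLOW set, the function names, and the sample file labels are assumptions, and real allow rules also name an object class such as file.

```python
from typing import NamedTuple


class Context(NamedTuple):
    user: str
    role: str
    type: str
    level: str


def parse_context(label):
    """Split 'user:role:type:level'; the level may itself contain colons."""
    user, role, type_, level = label.split(":", 3)
    return Context(user, role, type_, level)


# Toy allow rules: (source type, target type, permission). Anything absent is denied.
ALLOW = {("httpd_t", "httpd_sys_content_t", "read")}


def is_allowed(source_label, target_label, permission):
    """Default-deny lookup keyed on the type field of each label."""
    source, target = parse_context(source_label), parse_context(target_label)
    return (source.type, target.type, permission) in ALLOW


apache = "system_u:system_r:httpd_t:s0"
print(is_allowed(apache, "system_u:object_r:httpd_sys_content_t:s0", "read"))
print(is_allowed(apache, "system_u:object_r:shadow_t:s0", "read"))
```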

Operating Modes

Enforcing — policy violations are blocked and logged
Permissive — violations are logged only (for debugging)
Disabled — SELinux is off (requires reboot to re-enable)
getenforce shows the current mode; setenforce 0 switches to permissive, setenforce 1 back to enforcing

Essential Commands

ls -Z — show file security context
ps -eZ — show process security context
restorecon -Rv /var/www — relabel files to policy default
chcon -t httpd_sys_content_t file — change context
audit2why < /var/log/audit/audit.log — explain denials
audit2allow — generate policy from denials

Policy Types

targeted — only specific daemons are confined; default on RHEL/Fedora
strict — all processes are confined
mls — Multi-Level Security for classified environments
Policies compiled to binary, loaded into kernel at boot

Booleans

SELinux booleans let you toggle predefined policy behaviors without writing new policy. setsebool -P httpd_can_network_connect on allows Apache to make outbound network connections. getsebool -a lists all booleans and their state.

Real-World Impact

SELinux has contained numerous CVEs in the wild. A buffer overflow in Apache that would normally give root shell instead gets stopped — the attacker has httpd_t context, which can't write to /etc, spawn unrestricted shells, or escalate. The vulnerability exists; the damage is contained.

AppArmor (Alternative)

AppArmor is another LSM used on Ubuntu/Debian. Instead of labels on every file, it uses path-based profiles per application. Easier to write policies, less powerful. Profiles in /etc/apparmor.d/. Both SELinux and AppArmor implement the same LSM hooks.

Troubleshooting SELinux Denials

Step	Command	Purpose
1. Check mode	`getenforce`	Confirm SELinux is Enforcing
2. Find denial	`ausearch -m AVC -ts recent`	Show recent access denials from audit log
3. Explain it	`audit2why < /var/log/audit/audit.log`	Human-readable explanation of the denial
4. Check labels	`ls -Z /path`	Verify the file has the expected context
5. Relabel	`restorecon -Rv /path`	Reset labels to policy defaults
6. Check booleans	`getsebool -a | grep <feature>`	Check if a boolean covers your use case
7. Generate policy	`audit2allow -M mypol < /var/log/audit/audit.log`	Create a custom allow policy module

🖥️

Virtualization

Running full operating systems inside other operating systems — and why Linux dominates the hypervisor world.

KVM — Kernel-based Virtual Machine

Type 1 hypervisor, built into the kernel

KVM turns the Linux kernel itself into a Type 1 hypervisor. Intel VT-x or AMD-V CPU extensions are required. Each VM runs as a regular Linux process, but with hardware-isolated memory and virtualized CPU. AWS EC2 (via Nitro), Google Cloud, and many other public clouds run on KVM.

lsmod | grep kvm — check KVM is loaded
VM processes visible via ps -ef | grep qemu
Memory backed by anonymous mmap in host

QEMU

User-space machine emulator and virtualizer

QEMU emulates complete hardware (CPU, disk, NIC, USB). Paired with KVM, QEMU handles device emulation while KVM handles CPU/memory virtualization at near-native speed. QEMU alone can emulate different CPU architectures — run ARM binaries on x86.

qemu-system-x86_64 -enable-kvm -m 4G -hda disk.img
virtio drivers for near-native NIC/disk performance
SPICE/VNC for VM display

libvirt & virt-manager

Management layer over KVM/QEMU

libvirt provides a unified API for managing VMs, storage, and networks across KVM, Xen, and VMware. virsh is the CLI. virt-manager is the GUI. Both are how most Linux admins interact with VMs on-premise.

virsh list --all — list all VMs
virsh start myvm / virsh shutdown myvm
virsh snapshot-create-as myvm snap1

Xen

Type 1 hypervisor, runs beneath Linux

Xen runs below the OS: Linux boots as "dom0" (privileged driver domain), guest VMs are "domU". AWS originally ran entirely on Xen before migrating to KVM/Nitro. Xen is still used in Qubes OS for strong security isolation.

Hardware Virtualization Features

Intel VT-x / AMD-V — trap-and-emulate privileged instructions
Intel EPT / AMD NPT — hardware nested page tables
Intel VT-d / AMD-Vi — IOMMU for PCIe passthrough
SR-IOV — one NIC presents as many virtual functions
grep -E 'vmx|svm' /proc/cpuinfo to check support

VM Disk Formats

qcow2 — QEMU native; copy-on-write, snapshots, compression
raw — no overhead, maximum I/O performance
VMDK — VMware format (QEMU can read/write)
VHD/VHDX — Microsoft Hyper-V format
qemu-img convert -f raw -O qcow2 disk.raw disk.qcow2

Live Migration

KVM supports live migration: a running VM is transferred from one host to another with downtime typically measured in milliseconds. Memory pages are copied iteratively while the guest keeps running; the final dirty pages are sent during a brief pause. Cloud providers use it to perform hardware maintenance with little or no visible impact on users.
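
The iterative copy can be illustrated with simple arithmetic. The sketch below is a simplified model, not QEMU's algorithm: it assumes constant dirty and copy rates in pages per second, and the function name, parameters, and numbers are invented for the example.

```python
def precopy_rounds(total_pages, dirty_rate, copy_rate, stop_threshold, max_rounds=30):
    """Return pages sent per round; the last entry is sent while the VM is paused."""
    rounds = []
    to_send = total_pages                  # round 1 copies all of guest memory
    for _ in range(max_rounds):
        if to_send <= stop_threshold:
            break
        rounds.append(to_send)
        # While this round was being copied, the running guest dirtied more pages.
        to_send = min(total_pages, to_send * dirty_rate // copy_rate)
    rounds.append(to_send)                 # final stop-and-copy phase
    return rounds


print(precopy_rounds(total_pages=1_000_000, dirty_rate=10_000,
                     copy_rate=100_000, stop_threshold=1_000))
```

Each round shrinks by the ratio of dirty rate to copy rate, so the model only converges when the link copies pages faster than the guest dirties them; otherwise the max_rounds cap ends the loop.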

Type 1 vs Type 2

Type	Examples	Notes
Type 1 (bare metal)	KVM, Xen, ESXi	Runs directly on hardware; production use
Type 2 (hosted)	VirtualBox, VMware Workstation	Runs inside a host OS; developer use

📦

Containerization

Lightweight isolation using namespaces and cgroups — no hypervisor required.

How Containers Work

Containers are not VMs. They're Linux processes with isolated namespaces (so they can't see other processes, network interfaces, or filesystems) and cgroup limits (so they can't starve other workloads). They share the host kernel — no hypervisor, no emulation, near-native performance.

Linux Namespaces

CLONE_NEWPID — isolated PID space (container PID 1)
CLONE_NEWNET — private network stack
CLONE_NEWNS — isolated filesystem mount tree
CLONE_NEWUTS — independent hostname
CLONE_NEWUSER — UID/GID mapping (rootless)
CLONE_NEWIPC — isolated IPC objects
CLONE_NEWCGROUP — isolated cgroup view
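
These flags are single bits that a runtime ORs together before calling clone() or unshare(). The sketch below uses the values from <linux/sched.h>, where the mount namespace flag is spelled CLONE_NEWNS; the helper names are illustrative, and nothing here actually creates a namespace.

```python
# Flag values from <linux/sched.h>; the mount namespace flag is CLONE_NEWNS.
CLONE_FLAGS = {
    "CLONE_NEWNS": 0x00020000,
    "CLONE_NEWCGROUP": 0x02000000,
    "CLONE_NEWUTS": 0x04000000,
    "CLONE_NEWIPC": 0x08000000,
    "CLONE_NEWUSER": 0x10000000,
    "CLONE_NEWPID": 0x20000000,
    "CLONE_NEWNET": 0x40000000,
}


def combine(names):
    """OR the named flags together, as a runtime does before calling clone()."""
    mask = 0
    for name in names:
        mask |= CLONE_FLAGS[name]
    return mask


def decode(mask):
    """List the namespace flags present in a clone()/unshare() bitmask."""
    return sorted(name for name, bit in CLONE_FLAGS.items() if mask & bit)


mask = combine(["CLONE_NEWPID", "CLONE_NEWNET", "CLONE_NEWNS"])
print(hex(mask), decode(mask))
```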

Control Groups (cgroups)

cpu — CPU share and quota (CFS bandwidth control)
memory — RAM limit + OOM killer per container
blkio — disk I/O throttling
net_cls — traffic classification
cgroups v2 unifies the hierarchy; systemd uses it exclusively
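
Under cgroups v2, limits are plain strings written to control files such as cpu.max and memory.max. The sketch below only formats those values and writes nothing; the helper names are assumptions, and the suffix parsing covers just K, M, and G.

```python
CFS_PERIOD_US = 100_000   # default cpu.max period in microseconds

UNITS = {"K": 1024, "M": 1024**2, "G": 1024**3}


def cpu_max(cpus, period_us=CFS_PERIOD_US):
    """Format a CPU limit as the 'quota period' string written to cpu.max."""
    if cpus is None:
        return f"max {period_us}"            # no quota
    return f"{int(cpus * period_us)} {period_us}"


def memory_bytes(limit):
    """Convert a limit such as '512M' to the byte count written to memory.max."""
    suffix = limit[-1].upper()
    if suffix in UNITS:
        return int(limit[:-1]) * UNITS[suffix]
    return int(limit)


print(cpu_max(1.5), memory_bytes("512M"))
```

A limit of 1.5 CPUs becomes a quota of 150000 microseconds of CPU time in every 100000 microsecond period.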

Container Images

OCI image format: a stack of read-only layers (tarballs) plus a JSON config. Each filesystem-changing Dockerfile instruction (RUN, COPY, ADD) adds a layer. At runtime, layers are union-mounted (overlayfs) and a writable layer is placed on top. The image is just a filesystem; the "magic" is the runtime calling clone() and pivot_root().
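
Layer stacking can be modelled with dictionaries. This is a toy illustration rather than overlayfs itself: each layer maps a flat file name to its content, and a name prefixed with .wh. (the OCI whiteout marker) deletes that file from the layers below. The function name and sample files are invented.

```python
WHITEOUT_PREFIX = ".wh."   # OCI marker: this name is deleted in lower layers


def flatten(layers):
    """Merge layers (lowest first) into the file view a container would see."""
    merged = {}
    for layer in layers:
        for path, content in layer.items():
            if path.startswith(WHITEOUT_PREFIX):
                merged.pop(path[len(WHITEOUT_PREFIX):], None)   # hide the file
            else:
                merged[path] = content       # upper layers shadow lower ones
    return merged


base = {"app.conf": "v1", "debug.log": "noise"}
update = {"app.conf": "v2", ".wh.debug.log": ""}
writable = {"data.txt": "new"}
print(flatten([base, update, writable]))
```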

Docker

The container developer experience

docker build -t myapp . — build image from Dockerfile
docker run -d -p 8080:80 myapp — run detached
docker exec -it <id> bash — shell into container
docker compose up -d — multi-container apps
Docker Engine → containerd → runc (OCI runtime)

Container Runtimes

runc — OCI reference runtime; creates containers from the spec
containerd — daemon managing image pull, storage, runc lifecycle
CRI-O — minimal Kubernetes CRI runtime
Podman — Docker-compatible, daemonless, rootless by default
gVisor — sandboxed kernel in user space (Google)

Kubernetes

Kubernetes orchestrates containers across clusters. It schedules pods (groups of containers) onto nodes, manages service discovery, rolling deployments, auto-scaling, and self-healing. The control plane (API server, etcd, scheduler, controller manager) runs separately from worker nodes.

Security Considerations

Drop capabilities: --cap-drop ALL --cap-add NET_BIND_SERVICE
Read-only root fs: --read-only
Rootless containers (Podman, Docker rootless)
seccomp profiles block dangerous syscalls
SELinux/AppArmor profiles per container
Never run --privileged in production

Containers vs VMs

	Containers	VMs
Kernel	Shared with host	Own kernel per VM
Startup time	Milliseconds	Seconds to minutes
Image size	MB	GB
Performance	Near-native	5–10% overhead (KVM)
Isolation	Namespace-level	Hardware-level
Best for	Microservices, CI/CD, dev environments	Multi-tenant security, different OS, legacy apps

🌐

Linux Networking

From kernel socket buffers to iptables to eBPF — how Linux moves packets.

The Network Stack

Linux implements a full TCP/IP stack in the kernel. Data flows: NIC → driver → softirq → netif_receive_skb → protocol handlers (IP → TCP/UDP) → socket buffer → recv() syscall → user space. Each layer can be hooked by netfilter, tc, or eBPF.

Network Interfaces

ip link show — list all interfaces
ip addr add 10.0.0.1/24 dev eth0 — assign IP
ip route show — routing table
ip -s link — RX/TX statistics
Loopback (lo), Ethernet (eth0), WiFi (wlan0), bridge (br0), veth pairs

iptables / nftables

iptables is the classic Linux firewall and NAT tool, operating on netfilter hooks. Tables (filter, nat, mangle, raw) contain chains (INPUT, OUTPUT, FORWARD, PREROUTING, POSTROUTING) of rules processed in order. nftables is the modern replacement.

iptables -L -n -v — list all rules with stats
iptables -A INPUT -p tcp --dport 443 -j ACCEPT
iptables -t nat -A POSTROUTING -o eth0 -j MASQUERADE
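
The "rules processed in order" behaviour is first-match-wins with a fallback chain policy. The sketch below models only that: the Rule fields, the dict-based packet, and the sample chain are assumptions, and non-terminating targets such as LOG are left out.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Rule:
    target: str                      # e.g. "ACCEPT" or "DROP"
    proto: Optional[str] = None      # None matches any protocol
    dport: Optional[int] = None      # None matches any destination port

    def matches(self, packet):
        return ((self.proto is None or self.proto == packet["proto"]) and
                (self.dport is None or self.dport == packet["dport"]))


def evaluate(chain, packet, policy="ACCEPT"):
    """Return the target of the first matching rule, else the chain policy."""
    for rule in chain:
        if rule.matches(packet):
            return rule.target
    return policy


chain = [Rule("ACCEPT", "tcp", 443), Rule("DROP", "tcp")]
print(evaluate(chain, {"proto": "tcp", "dport": 443}))
print(evaluate(chain, {"proto": "tcp", "dport": 22}))
```

Because evaluation stops at the first match, placing the broad DROP rule before the port 443 rule would block HTTPS as well.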

Socket Tuning

net.core.somaxconn — listen backlog size
net.ipv4.tcp_tw_reuse — reuse TIME_WAIT sockets
net.core.rmem_max / wmem_max — max socket buffer
net.ipv4.ip_local_port_range — ephemeral port range
sysctl -w net.core.somaxconn=65535
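
Each sysctl name corresponds to a file under /proc/sys, with the dots replaced by path separators. A minimal sketch of that mapping follows; the function names are made up, and keys whose components contain dots themselves (some interface names) are not handled.

```python
from pathlib import Path


def sysctl_path(name, root="/proc/sys"):
    """Map a dotted sysctl name to the file that backs it."""
    return Path(root).joinpath(*name.split("."))


def read_sysctl(name):
    """Return the current value as text, or None if this system lacks the key."""
    path = sysctl_path(name)
    return path.read_text().strip() if path.exists() else None


print(sysctl_path("net.core.somaxconn"))
```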

DNS & Name Resolution

/etc/resolv.conf — nameserver list
/etc/hosts — static hostname overrides
/etc/nsswitch.conf — resolution order
dig example.com / resolvectl query example.com
systemd-resolved provides a local caching DNS stub

Network Namespaces

Each network namespace has its own interfaces, routes, iptables rules, and sockets. Containers use veth pairs — one end in the container namespace, one in a host bridge (docker0). ip netns exec mynamespace ip addr runs commands inside a namespace.

XDP & eBPF Networking

XDP (eXpress Data Path) attaches eBPF programs at the earliest point in the receive path — before the kernel allocates an sk_buff. This enables packet processing at 100 Gb/s+ line rate. Used by Cloudflare for DDoS mitigation and Facebook for load balancing at scale.

Useful Diagnostics

ss -tulnp — listening sockets and owning processes
tcpdump -i eth0 port 443 — capture traffic
traceroute / mtr — path and latency to host
ethtool eth0 — NIC speed, duplex, offload features
iperf3 -s / iperf3 -c host — bandwidth test

Common Networking Stacks by Use Case

Use Case	Stack
Web server	NIC → kernel TCP → nginx/Apache (epoll) → TLS → HTTP
Container networking	veth pair → bridge → iptables NAT → physical NIC
Kubernetes service	iptables / eBPF (Cilium) load balancing across pod IPs
VPN	WireGuard (kernel module) / OpenVPN (tun device)
DDoS mitigation	XDP eBPF program → drop malicious packets before sk_buff alloc

🔒

SSH Tunnels

Forward ports securely through an encrypted SSH channel — local, remote, and dynamic modes.

Local Tunnel (-L)

Binds a port on your local machine and forwards connections through the SSH server to a destination. The destination sees the traffic as coming from the SSH server.

ssh -L 8080:localhost:80 user@remote — local :8080 → remote :80
ssh -L 5432:db.internal:5432 user@bastion — reach internal DB via bastion
ssh -L 0.0.0.0:8080:target:80 user@remote — bind all interfaces locally
-N — no remote command, tunnel only
-f — go to background after authenticating

Remote Tunnel (-R)

Binds a port on the remote SSH server and forwards connections back to your local machine (or any host reachable from it). Useful for exposing a local service through a public server.

ssh -R 9090:localhost:3000 user@remote — remote :9090 → local :3000
ssh -R 0.0.0.0:80:localhost:8080 user@remote — publicly expose local :8080
Requires GatewayPorts yes in sshd_config to bind non-loopback on remote
Common use: demo a dev server behind NAT/firewall to the public internet

Dynamic / SOCKS Tunnel (-D)

Creates a local SOCKS4/5 proxy. Your SSH client acts as a dynamic port forwarder — any application that supports a SOCKS proxy can route traffic through the SSH server.

ssh -D 1080 user@remote — SOCKS proxy on localhost:1080
Configure browser proxy settings to socks5://127.0.0.1:1080
curl --socks5 127.0.0.1:1080 http://target
All traffic exits from the SSH server's IP — effective for browsing or testing geo-restricted content

Local vs Remote — Key Difference

Local (-L): you open a local port → traffic goes out through the remote.

Remote (-R): the remote opens a port → traffic comes back to you.

Think of it as which side initiates the listening socket.
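
The forwarding argument has the same shape for both flags, [bind_address:]port:host:hostport; only the listening side changes. The sketch below parses that form to make the roles explicit. The Forward type and function names are illustrative, the default bind address is an assumption, and IPv6 literals are not handled.

```python
from typing import NamedTuple


class Forward(NamedTuple):
    bind_address: str    # where the listening socket is opened
    listen_port: int
    dest_host: str       # where connections are delivered
    dest_port: int


def parse_forward(spec, default_bind="localhost"):
    """Parse '[bind_address:]port:host:hostport' (IPv6 literals not handled)."""
    parts = spec.split(":")
    if len(parts) == 3:
        parts.insert(0, default_bind)
    if len(parts) != 4:
        raise ValueError(f"bad forward spec: {spec!r}")
    bind, port, host, hostport = parts
    return Forward(bind, int(port), host, int(hostport))


def describe(flag, spec):
    """Say which side listens for -L or -R."""
    fwd = parse_forward(spec)
    listener = {"-L": "local machine", "-R": "remote server"}[flag]
    return (f"{listener} listens on {fwd.bind_address}:{fwd.listen_port}, "
            f"forwards to {fwd.dest_host}:{fwd.dest_port}")


print(describe("-L", "5432:db.internal:5432"))
print(describe("-R", "9090:localhost:3000"))
```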

Persistent Tunnels with autossh

autossh monitors the tunnel and restarts it if the connection drops. Ideal for long-lived remote tunnels on servers.

autossh -M 0 -N -R 9090:localhost:3000 user@remote
-M 0 — disable the autossh monitoring port, rely on SSH keepalives instead
Add ServerAliveInterval 30 and ServerAliveCountMax 3 to ~/.ssh/config
Run as a systemd service for boot-time persistence

~/.ssh/config Shortcuts

Define tunnels in your SSH config so you don't have to type them every time.

LocalForward 8080 localhost:80
RemoteForward 9090 localhost:3000
DynamicForward 1080
ExitOnForwardFailure yes — fail if port can't be bound
Combine with ControlMaster auto for connection reuse

Jump Hosts (-J)

SSH through one or more intermediate hosts to reach a target that isn't directly reachable. Replaces the older ProxyCommand pattern.

ssh -J bastion user@internal-host
ssh -J hop1,hop2 user@target — chain multiple hops
In config: ProxyJump bastion.example.com
Combine with -L or -R to tunnel through the jump chain

Security Considerations

Tunnels bypass firewalls — only permit on trusted hosts
AllowTcpForwarding no in sshd_config disables all forwarding
PermitOpen host:port restricts which destinations are allowed
Use dedicated low-privilege accounts for tunnel-only access
Combine with ForceCommand /bin/false to disallow shell access on tunnel accounts

Tunnel Quick Reference

Flag	Who Listens	Traffic Direction	Common Use
`-L port:host:hostport`	Local machine	Local → Remote destination	Access internal services from your laptop
`-R port:host:hostport`	Remote server	Remote → Local destination	Expose local dev server to the internet
`-D port`	Local machine (SOCKS)	Dynamic — any destination	Ad-hoc proxy / bypass restrictions
`-J jumphost`	N/A (proxy hop)	Through bastion to target	Reach hosts in private networks