The major distros — history, philosophy, and what makes each unique.
Born December 28, 1969 · Helsinki, Finland
Linus Benedict Torvalds is one of the most consequential engineers in the history of computing. At 21 years old, while studying at the University of Helsinki, he wrote the first version of the Linux kernel in his bedroom — and casually announced it on the comp.os.minix Usenet group as "just a hobby, won't be big and professional like gnu."
That hobby now runs 97% of the world's top web servers, all 500 of the TOP500 supercomputers, the Ingenuity Mars helicopter, the New York Stock Exchange, and over 3 billion Android devices. Linux is the single most widely deployed operating system in history, and one of the largest collaborative software projects ever created.
In 2005, frustrated with the proprietary BitKeeper VCS, Torvalds wrote Git from scratch in roughly two weeks. Git became the world's dominant version control system, now used by virtually every developer on the planet through GitHub, GitLab, and Bitbucket.
Known for his blunt communication style, Torvalds famously called bad code "complete and utter garbage," told NVIDIA they were "the single worst company" he'd ever dealt with (on camera), and has maintained the kernel's technical integrity through 30+ years of relentless contribution — sometimes accepting patches at 3am, sometimes rejecting them with colorful language that became internet legend.
In 2018 he took a rare break to work on his communication style. He returned. The kernel kept shipping. The man is, by any measure, an engineering legend.
Monolithic, modular, and relentlessly optimized — the core of every Linux system.
The kernel runs in a single address space at the highest privilege level (ring 0 on x86). All subsystems (scheduling, memory, drivers, filesystems) share the same memory and call each other directly as ordinary function calls, with no message passing or context switches between them. This is a large part of why Linux is fast.
Despite being monolithic, Linux supports dynamically loading and unloading modules (.ko files) at runtime. Device drivers, filesystems, and network protocols can be added without rebooting. The lsmod, insmod, and rmmod commands list, load, and unload modules.
The kernel can preempt even kernel-mode code (with CONFIG_PREEMPT). PREEMPT_RT patches push this further, turning Linux into a real-time OS used in industrial control systems, audio production, and robotics.
Linux scales from a single-core embedded CPU to a 6000-core supercomputer node. NUMA-aware memory allocation, per-CPU data structures, and lock-free algorithms make it competitive on the largest x86, ARM, and RISC-V machines.
Since Linux 6.1 (2022), the kernel has accepted Rust as a second implementation language alongside C, initially as an experimental feature. Rust's ownership model eliminates whole classes of memory-safety bugs. New drivers and some subsystems are gradually being written in Rust.
User space communicates with the kernel via ~400 system calls, including read, write, mmap, clone, execve, and ioctl. The ABI is stable: a statically linked binary from 1995 can still run on Linux 6.x.
| Layer | Responsibility | Source Dir |
|---|---|---|
| Process Scheduler | CFS, RT, and deadline scheduling; CPU affinity; cgroups integration | kernel/sched/ |
| Memory Manager | Virtual memory, paging, slab allocator, OOM killer, huge pages | mm/ |
| VFS | Unified filesystem interface; supports ext4, xfs, btrfs, tmpfs, and 60+ others | fs/ |
| Network Stack | TCP/IP, sockets, netfilter, XDP, eBPF integration | net/ |
| Device Drivers | Block, char, PCI, USB, GPU, network, HID | drivers/ |
| IPC | Signals, pipes, UNIX sockets, shared memory, futexes, io_uring | ipc/ |
| Security | LSM framework: SELinux, AppArmor, seccomp, capabilities | security/ |
| Architecture | Platform-specific boot, syscall entry, interrupt handling | arch/ |
- make menuconfig: interactive TUI config
- make defconfig: safe defaults for your arch
- make localmodconfig: trim to only loaded modules
- .config: the resulting configuration file; ~10,000 options
- make -j$(nproc): parallel build
- vmlinuz: the compressed kernel image produced by the build
- make modules_install: install .ko files
- make install: copy to /boot, update grub

The major functional areas inside the Linux kernel and how they interact.
The Completely Fair Scheduler uses a red-black tree ordered by virtual runtime. Each task accumulates vruntime; the task with the lowest vruntime runs next. It targets O(log n) scheduling decisions and adapts to CPU topology and cgroup hierarchies.
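The pick-lowest-vruntime rule can be sketched in a few lines of Python. This is a toy model, not kernel code: a binary heap from heapq stands in for the red-black tree, and the function name simulate_cfs, the weights, and the fixed time slice are assumptions made for illustration. It is only meant to show that vruntime advances more slowly for tasks with a higher weight, so those tasks get picked more often.

```python
import heapq

def simulate_cfs(weights, slice_ns, steps):
    """Simulate a CFS-like pick-next loop.

    weights: dict of task name -> weight (higher weight = larger CPU share).
    Returns the task names in the order they were scheduled.
    """
    # (vruntime, name) pairs; the heap stands in for the red-black tree.
    runqueue = [(0.0, name) for name in sorted(weights)]
    heapq.heapify(runqueue)
    order = []
    for _ in range(steps):
        vruntime, name = heapq.heappop(runqueue)  # lowest vruntime runs next
        order.append(name)
        # Assumed scaling: a weight of 1024 accrues vruntime at wall-clock rate.
        vruntime += slice_ns * (1024 / weights[name])
        heapq.heappush(runqueue, (vruntime, name))
    return order

order = simulate_cfs({"A": 1024, "B": 2048}, slice_ns=1_000_000, steps=6)
print(order)  # ['A', 'B', 'B', 'A', 'B', 'B']
```

Task B has twice the weight of A, so over six slices it runs four times to A's two.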
Virtual memory is mapped through a 4–5 level page table hierarchy. The slab/slub allocator handles kernel object allocation. The OOM killer terminates processes when physical memory is exhausted.
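As an illustration of the hierarchy, the sketch below splits a virtual address into page-table indices. It assumes the common x86-64 4-level layout with 4 KiB pages (four 9-bit indices plus a 12-bit page offset); other architectures and 5-level paging use different widths, and the function name is made up for this example.

```python
def split_virtual_address(vaddr):
    """Split a 48-bit virtual address into 4-level page-table indices."""
    return {
        "pgd": (vaddr >> 39) & 0x1FF,  # top-level table index (9 bits)
        "pud": (vaddr >> 30) & 0x1FF,  # second level
        "pmd": (vaddr >> 21) & 0x1FF,  # third level
        "pte": (vaddr >> 12) & 0x1FF,  # last level: selects the 4 KiB page
        "offset": vaddr & 0xFFF,       # byte offset inside that page
    }

# Build an address from known indices and split it again.
example = (1 << 39) | (2 << 30) | (3 << 21) | (4 << 12) | 5
print(split_virtual_address(example))
```

Each 9-bit index selects one of 512 entries in its table, which is why a 4-level walk covers 48 bits of address space.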
VFS provides a uniform interface (open/read/write/close) over all filesystems. Every filesystem implements a set of function-pointer tables (super_operations, inode_operations, file_operations). This is why you can cat a file on ext4, btrfs, or NFS with the same syscall.
All storage requests pass through the block layer, which merges and reorders I/O for efficiency. The multi-queue block layer (blk-mq) enables NVMe drives to fully exploit hundreds of hardware queues in parallel.
Linux provides multiple IPC mechanisms. Futexes (fast userspace mutexes) are the foundation of pthreads locking. io_uring (5.1+) is the modern async I/O interface, enabling millions of IOPS with far fewer syscalls.
Extended Berkeley Packet Filter lets you run sandboxed programs inside the kernel without modifying kernel source or loading modules. Used by Cilium (Kubernetes networking), bcc/bpftrace (tracing), XDP (10M+ pps packet processing), and Meta/Google for production profiling.
| Syscall | Subsystems involved |
|---|---|
| execve("./app") | VFS (open binary), MM (mmap segments), Scheduler (create task), Security (LSM check) |
| read(fd, buf, n) | VFS → FS driver → Block layer → I/O scheduler → Driver → DMA |
| malloc() | libc → brk() or mmap() → MM (page fault → physical frame) |
| send(sock, …) | Socket → Network stack (TCP/IP) → netfilter → NIC driver → DMA |
| fork() | MM (COW page tables), Scheduler (new task), FD table copy, cgroup accounting |
The fundamental ideas every Linux user and developer needs to understand.
Linux exposes hardware, processes, sockets, and kernel state as files in the filesystem. /proc/cpuinfo reads CPU info. /dev/sda is a disk. /sys/class/net/eth0/speed is a network interface property. This uniformity is why shell pipelines are so powerful.
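Because kernel state is exposed as plain text files, ordinary string handling is enough to consume it. The sketch below parses the "Key: value" layout used by files such as /proc/self/status. The function name and the sample text are assumptions for illustration; on a Linux host the same function can be fed the real file contents.

```python
def parse_proc_status(text):
    """Parse 'Key:<tab>value' lines, as found in /proc/<pid>/status."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # skip lines without a colon
            fields[key.strip()] = value.strip()
    return fields

# Hypothetical sample in the same shape as /proc/<pid>/status.
sample = "Name:\tbash\nState:\tS (sleeping)\nPid:\t1234\nPPid:\t1\n"
info = parse_proc_status(sample)
print(info["Name"], info["Pid"])  # bash 1234
```

On Linux, parse_proc_status(open("/proc/self/status").read()) returns the same kind of dictionary for the running interpreter.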
Linux uses a single task_struct for both processes and threads. Threads are tasks that share an address space (CLONE_VM). fork() creates a copy-on-write clone. exec() replaces the process image. Every task has a PID, PPID, UID, GID, and credential set.
Every open file, socket, pipe, or device is an integer FD in the process's FD table. 0=stdin, 1=stdout, 2=stderr. FDs are inherited on fork, closed on exec (if O_CLOEXEC), and duped with dup2().
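FD inheritance across fork can be seen with a pipe. In this minimal sketch (Unix-only; the function name is an assumption), the child inherits both ends of the pipe, writes to one, and the parent reads from the other. Python's os module is used here as a thin wrapper over the underlying pipe, fork, read, write, and waitpid calls.

```python
import os

def child_to_parent(message):
    """Send bytes from a forked child to its parent over a pipe."""
    read_fd, write_fd = os.pipe()  # two new integer file descriptors
    pid = os.fork()
    if pid == 0:
        # Child: both FDs were inherited; only the write end is needed.
        try:
            os.close(read_fd)
            os.write(write_fd, message)
        finally:
            os._exit(0)  # leave immediately, skipping the parent's cleanup
    # Parent: close the write end so read() reports EOF once the child exits.
    os.close(write_fd)
    chunks = []
    while True:
        chunk = os.read(read_fd, 4096)
        if not chunk:
            break
        chunks.append(chunk)
    os.close(read_fd)
    _, status = os.waitpid(pid, 0)  # reap the child
    return b"".join(chunks), os.WEXITSTATUS(status)

data, exit_code = child_to_parent(b"hello from the child")
print(data, exit_code)
```

Closing the unused ends matters: if the parent kept the write end open, its read loop would never see end-of-file.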
Classic Unix: owner/group/other × read/write/execute. SUID/SGID bits elevate privilege on exec. Linux capabilities split root privilege into ~40 fine-grained rights: CAP_NET_ADMIN, CAP_SYS_PTRACE, CAP_CHOWN, etc.
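The owner/group/other bits and the SUID bit are plain flags in a file's mode word. The sketch below decodes them with Python's standard stat module. The helper name describe_mode and the two example modes are chosen for illustration.

```python
import stat

def describe_mode(mode):
    """Decode a st_mode value into a few human-readable facts."""
    return {
        "string": stat.filemode(mode),                  # ls -l style
        "setuid": bool(mode & stat.S_ISUID),            # elevates on exec
        "owner_can_write": bool(mode & stat.S_IWUSR),
        "others_can_write": bool(mode & stat.S_IWOTH),
    }

print(describe_mode(0o100755)["string"])  # -rwxr-xr-x
print(describe_mode(0o104755)["string"])  # -rwsr-xr-x
```

The leading 0o100000 marks a regular file; adding 0o4000 sets SUID, which ls shows as an "s" in the owner's execute slot.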
Signals are asynchronous notifications sent between processes or from the kernel. SIGKILL (9) cannot be caught or ignored. SIGTERM (15) requests graceful shutdown. SIGSEGV signals invalid memory access. SIGCHLD notifies parents of child state changes.
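The difference between a catchable and an uncatchable signal can be checked from user space. This sketch (Unix-only, and it must run in the main thread because Python only allows handler installation there) installs a SIGTERM handler, sends the signal to its own process, and then shows that installing a handler for SIGKILL is refused. The variable and handler names are illustrative.

```python
import os
import signal

received = []

def on_sigterm(signum, frame):
    # A real service would start its graceful shutdown here.
    received.append(signum)

# Install a handler, signal ourselves, then restore the previous handler.
previous = signal.signal(signal.SIGTERM, on_sigterm)
try:
    os.kill(os.getpid(), signal.SIGTERM)
finally:
    signal.signal(signal.SIGTERM, previous)

# SIGKILL cannot be caught: the attempt to install a handler fails.
try:
    signal.signal(signal.SIGKILL, on_sigterm)
    sigkill_catchable = True
except OSError:
    sigkill_catchable = False

print(received, sigkill_catchable)
```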
Processes run in ring 3 (user space) with restricted access. System calls cross into ring 0 (kernel space) via a software interrupt or SYSCALL instruction. This boundary is why batching syscalls (io_uring) and avoiding them (vDSO for gettimeofday) matters at scale.
Linux namespaces partition global resources. A process can have its own view of PIDs, network interfaces, mounts, users, hostnames, and cgroups. Containers are just processes in isolated namespaces — there's no magic, just clone() flags.
Control groups limit and account for resource usage: CPU time, RAM, disk I/O, network bandwidth, and number of processes. systemd uses cgroups v2 to track every service. Docker and Kubernetes use them to enforce container resource limits.
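With cgroups v2, a CPU limit is stored in a file named cpu.max as two fields, a quota and a period in microseconds, where the literal word max means no limit. A minimal sketch of interpreting that file follows. The function name is an assumption, and the strings passed in stand in for the contents of a real cpu.max file.

```python
def parse_cpu_max(text):
    """Return the CPU limit in cores from cpu.max contents, or None if unlimited."""
    quota, period = text.split()
    if quota == "max":
        return None
    return int(quota) / int(period)

print(parse_cpu_max("50000 100000\n"))  # 0.5
print(parse_cpu_max("max 100000\n"))    # None
```

A quota of 50000 against a period of 100000 means the group may use at most half of one CPU's time in each period.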
| Path | Contents |
|---|---|
| /proc/<pid>/maps | Virtual memory layout of a process |
| /proc/<pid>/fd/ | Symlinks to every open file descriptor |
| /proc/sys/ | Tunable kernel parameters (sysctl) |
| /sys/block/ | Block device attributes and queues |
| /etc/passwd, /etc/shadow | User accounts and hashed passwords |
| /etc/fstab | Filesystem mount table |
| /var/log/ | System and application logs |
| /boot/vmlinuz | Compressed kernel image |

The GNU Project — the free software ecosystem that became the foundation of Linux.
Richard Stallman announced GNU on September 27, 1983, with a goal to build a completely free, Unix-compatible operating system. GNU stands for "GNU's Not Unix" — a recursive acronym. By the early 1990s, GNU had a complete userland but no kernel. Linux filled that gap.
Stallman was an MIT AI Lab hacker who quit to pursue software freedom full-time. He authored GCC, Emacs, the GPL, and the GNU Manifesto. His philosophy — that software must be free to run, study, modify, and redistribute — became the moral backbone of open source.
Stallman founded the FSF in 1985 to support and fund GNU development. The FSF holds copyright on most GNU software, enforces GPL compliance, and maintains the Free Software Definition. It remains the ideological anchor of the free software movement today.
The GPL is a copyleft license: anyone who distributes a derivative work must license it under the GPL too. This "viral" property ensures that freedom propagates. GPLv2 governs the Linux kernel. GPLv3 (2007) added protections against Tivoization and software patent claims.
GCC started as a C compiler in 1987 and grew into a full compiler suite: C, C++, Fortran, Ada, Go, Rust (via GCC front-end), and more. It was the first serious free compiler and remains a cornerstone of Linux system builds. A typical invocation is gcc -O2 -Wall -o prog prog.c
GNU Coreutils provides the 100+ command-line utilities that form the Unix userland: ls, cp, mv, rm, cat, echo, sort, wc, head, tail, cut, tr, and dozens more. Nearly every Linux system runs these. GNU versions add features beyond the POSIX minimum.
Bash is the Bourne Again SHell, GNU's replacement for the original Bourne shell. Bash adds arrays, arithmetic, readline editing, history, job control, and process substitution. It's the default shell on almost every Linux distribution and the lingua franca of system scripting.
GNU Emacs was Stallman's original GNU program: an infinitely extensible text editor built on a Lisp interpreter. Emacs is simultaneously an editor, IDE, email client, file manager, and operating environment. The Emacs vs vi holy war has raged for 40 years without a winner.
glibc is the C standard library for Linux. Every C program links against glibc for printf, malloc, fopen, POSIX threads, and the system call wrappers. Its ABI stability is legendary: programs compiled against glibc in 2001 still run on modern Linux.
GNU Make is the build automation tool that orchestrates compilation via Makefile rules. The Linux kernel itself is built with GNU Make. Make's dependency tracking ensures only changed files are recompiled. Running make -j$(nproc) parallelizes the build across all CPU cores.
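The core of that dependency tracking is a timestamp comparison: a target is rebuilt when it is missing or older than any of its prerequisites. The sketch below captures just that rule. The function name and the plain numeric timestamps are assumptions, and real Make adds much more (pattern rules, phony targets, recursive prerequisites).

```python
def needs_rebuild(target_mtime, dependency_mtimes):
    """Return True if the target is missing or older than any dependency."""
    if target_mtime is None:  # target does not exist yet
        return True
    return any(dep > target_mtime for dep in dependency_mtimes)

print(needs_rebuild(100, [90, 95]))   # False: target is newer than its inputs
print(needs_rebuild(100, [90, 120]))  # True: one input changed after the build
```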
GNU Binutils is the toolchain plumbing: ld (linker), as (assembler), objdump, nm, strip, ar, readelf. These tools link object files into executables, inspect binaries, and manage static libraries. Every compiled program on Linux passed through binutils.
Stallman insists the OS should be called "GNU/Linux" — Linux is only the kernel; the rest is GNU. Torvalds says "Linux" is fine. Most of the world says "Linux." The debate has been running since 1994. Both sides have a point; neither will budge. The OS ships either way.
| Tool | Purpose | Key command |
|---|---|---|
| gcc | C/C++ compiler | gcc -O2 -o prog prog.c |
| bash | Shell / scripting | bash -c 'echo hello' |
| make | Build automation | make -j$(nproc) |
| ld | Linker | ld -o prog main.o lib.a |
| gdb | Debugger | gdb ./prog core |
| grep | Pattern search | grep -rn 'TODO' src/ |
| sed | Stream editor | sed 's/foo/bar/g' file |
| awk | Text processing | awk '{print $1}' log |
| tar | Archiving | tar -czf out.tar.gz dir/ |
| emacs | Editor / Lisp IDE | emacs -nw file.txt |
Security-Enhanced Linux — mandatory access control baked into the kernel.
SELinux is a Mandatory Access Control (MAC) system implemented as a Linux Security Module (LSM). It was originally developed by the NSA and released in 2000. Every process and file has a security context; policy rules define which contexts can interact. If a rule doesn't explicitly allow an action, it's denied.
Traditional Unix uses Discretionary Access Control (DAC) — the file owner controls permissions. With MAC (SELinux), even root can be blocked if the policy doesn't allow it. A compromised nginx process can't read /etc/shadow even as root — the kernel enforces the policy.
Every object has a label: user:role:type:level. Example: system_u:system_r:httpd_t:s0 — the Apache process. Files served by Apache might be httpd_sys_content_t. Policy says httpd_t can read httpd_sys_content_t. Wrong label? Access denied.
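The label format and the default-deny lookup can be modelled in a few lines. This is a toy sketch, not how the kernel stores policy: the ALLOW table, the function names, and the permission strings are assumptions, and real rules also name an object class (file, dir, socket, and so on).

```python
from typing import NamedTuple

class Context(NamedTuple):
    user: str
    role: str
    type: str
    level: str

def parse_context(label):
    """Split a user:role:type:level label into its four fields."""
    # The level may itself contain ':' (e.g. s0:c0.c1023), so split 3 times only.
    user, role, type_, level = label.split(":", 3)
    return Context(user, role, type_, level)

# Hypothetical toy policy: (source type, target type) -> allowed permissions.
ALLOW = {("httpd_t", "httpd_sys_content_t"): {"read", "getattr"}}

def is_allowed(source, target, permission):
    """Default deny: only explicitly listed permissions are granted."""
    return permission in ALLOW.get((source.type, target.type), set())

apache = parse_context("system_u:system_r:httpd_t:s0")
page = parse_context("system_u:object_r:httpd_sys_content_t:s0")
shadow = parse_context("system_u:object_r:shadow_t:s0")
print(is_allowed(apache, page, "read"), is_allowed(apache, shadow, "read"))
```

Only the type field takes part in the lookup here, which mirrors how type enforcement does most of the work in typical policies.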
- getenforce shows the current mode; setenforce 0 switches to permissive and setenforce 1 back to enforcing
- ls -Z: show file security context
- ps -eZ: show process security context
- restorecon -Rv /var/www: relabel files to policy default
- chcon -t httpd_sys_content_t file: change context
- audit2why < /var/log/audit/audit.log: explain denials
- audit2allow: generate policy from denials

SELinux booleans let you toggle predefined policy behaviors without writing new policy. setsebool -P httpd_can_network_connect on allows Apache to make outbound network connections. getsebool -a lists all booleans and their state.
SELinux has contained numerous CVEs in the wild. A buffer overflow in Apache that would normally give root shell instead gets stopped — the attacker has httpd_t context, which can't write to /etc, spawn unrestricted shells, or escalate. The vulnerability exists; the damage is contained.
AppArmor is another LSM used on Ubuntu/Debian. Instead of labels on every file, it uses path-based profiles per application. Policies are easier to write but less powerful. Profiles live in /etc/apparmor.d/. Both SELinux and AppArmor are built on the same LSM hook framework.
| Step | Command | Purpose |
|---|---|---|
| 1. Check mode | getenforce | Confirm SELinux is Enforcing |
| 2. Find denial | ausearch -m AVC -ts recent | Show recent access denials from audit log |
| 3. Explain it | audit2why < /var/log/audit/audit.log | Human-readable explanation of the denial |
| 4. Check labels | ls -Z /path | Verify the file has the expected context |
| 5. Relabel | restorecon -Rv /path | Reset labels to policy defaults |
| 6. Toggle boolean | getsebool -a \| grep <feature> | Check if a boolean covers your use case |
| 7. Generate policy | audit2allow -M mypol < /var/log/audit/audit.log | Create a custom allow policy module |
Running full operating systems inside other operating systems — and why Linux dominates the hypervisor world.
KVM turns the Linux kernel itself into a Type 1 hypervisor. Intel VT-x or AMD-V CPU extensions are required. Each VM runs as a regular Linux process, but with hardware-isolated memory and virtualized CPU. AWS EC2, Google Cloud, and much of the internet run on KVM.
- lsmod | grep kvm: check KVM is loaded
- ps -ef | grep qemu

QEMU emulates complete hardware (CPU, disk, NIC, USB). Paired with KVM, QEMU handles device emulation while KVM handles CPU/memory virtualization at near-native speed. QEMU alone can emulate different CPU architectures, for example running ARM binaries on x86.
- qemu-system-x86_64 -enable-kvm -m 4G -hda disk.img

libvirt provides a unified API for managing VMs, storage, and networks across KVM, Xen, and VMware. virsh is the CLI. virt-manager is the GUI. Both are how most Linux admins interact with VMs on-premise.
- virsh list --all: list all VMs
- virsh start myvm / virsh shutdown myvm
- virsh snapshot-create-as myvm snap1

Xen runs below the OS: Linux boots as "dom0" (privileged driver domain), guest VMs are "domU". AWS originally ran entirely on Xen before migrating to KVM/Nitro. Xen is still used in Qubes OS for strong security isolation.
- grep -E 'vmx|svm' /proc/cpuinfo: check for hardware virtualization support
- qemu-img convert -f raw -O qcow2 disk.raw disk.qcow2

KVM supports live migration: a running VM is transferred from one host to another with milliseconds of downtime. Memory pages are copied iteratively; final dirty pages are sent during a brief pause. Cloud providers use it to perform hardware maintenance with little or no user-visible impact.
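The iterative copy can be pictured with a small model. This is an idealised sketch rather than QEMU's actual algorithm: it assumes that a fixed percentage of the pages sent in each round is dirtied again while that round is in flight, and it stops iterating once the remaining set is small enough to send during the pause. All names and numbers are illustrative.

```python
def precopy_rounds(total_pages, dirty_percent, stop_threshold, max_rounds=30):
    """Return the number of pages sent in each round of a pre-copy migration."""
    to_send = total_pages
    rounds = []
    while to_send > stop_threshold and len(rounds) < max_rounds:
        rounds.append(to_send)  # copied while the VM keeps running
        # Pages dirtied again during this round must be resent.
        to_send = to_send * dirty_percent // 100
    rounds.append(to_send)      # final round: sent while the VM is paused
    return rounds

print(precopy_rounds(1000, dirty_percent=10, stop_threshold=5))
# [1000, 100, 10, 1]
```

The last entry is what determines downtime: the fewer pages left for the paused phase, the shorter the interruption. If the guest dirties memory faster than pages can be copied, the rounds stop shrinking, which is why the model includes a max_rounds cap.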
| Type | Examples | Notes |
|---|---|---|
| Type 1 (bare metal) | KVM, Xen, ESXi | Runs directly on hardware; production use |
| Type 2 (hosted) | VirtualBox, VMware Workstation | Runs inside a host OS; developer use |
Lightweight isolation using namespaces and cgroups — no hypervisor required.
Containers are not VMs. They're Linux processes with isolated namespaces (so they can't see other processes, network interfaces, or filesystems) and cgroup limits (so they can't starve other workloads). They share the host kernel — no hypervisor, no emulation, near-native performance.
- CLONE_NEWPID: isolated PID space (container PID 1)
- CLONE_NEWNET: private network stack
- CLONE_NEWNS: isolated filesystem mount tree
- CLONE_NEWUTS: independent hostname
- CLONE_NEWUSER: UID/GID mapping (rootless)
- CLONE_NEWIPC: isolated IPC objects
- CLONE_NEWCGROUP: isolated cgroup view

The OCI image format is a stack of read-only layers (tarballs) plus a JSON config. Each RUN in a Dockerfile adds a layer. At runtime, layers are union-mounted (overlayfs) and a writable layer is placed on top. The image is just a filesystem; the "magic" is the runtime calling clone() and pivot_root() or chroot().
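The layered lookup behind a union mount can be sketched with dictionaries. This is a simplified model, not overlayfs itself: each layer is a dict from path to content, the first layer in the list is the writable top layer, and a sentinel marks a file deleted in an upper layer (overlayfs calls this a whiteout). The names resolve and WHITEOUT are chosen for this example.

```python
WHITEOUT = object()  # marker meaning "deleted in this layer"

def resolve(path, layers):
    """Look up a path in a stack of layers, topmost layer first."""
    for layer in layers:
        if path in layer:
            entry = layer[path]
            return None if entry is WHITEOUT else entry
    return None  # not present in any layer

layers = [
    {"/etc/app.conf": "edited at runtime"},          # writable container layer
    {"/tmp/cache": WHITEOUT},                        # image layer that removed a file
    {"/etc/app.conf": "default", "/tmp/cache": "data", "/bin/sh": "busybox"},
]
print(resolve("/etc/app.conf", layers))  # edited at runtime
print(resolve("/tmp/cache", layers))     # None
print(resolve("/bin/sh", layers))        # busybox
```

Lower layers are never modified, which is what lets many containers share the same image layers on disk.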
- Build image from Dockerfile: docker build -t myapp .
- Run detached: docker run -d -p 8080:80 myapp
- Shell into container: docker exec -it <id> bash
- Multi-container apps: docker compose up -d

Kubernetes orchestrates containers across clusters. It schedules pods (groups of containers) onto nodes, manages service discovery, rolling deployments, auto-scaling, and self-healing. The control plane (API server, etcd, scheduler, controller manager) runs separately from worker nodes.
- --cap-drop ALL --cap-add NET_BIND_SERVICE
- --read-only
- Avoid --privileged in production

| | Containers | VMs |
|---|---|---|
| Kernel | Shared with host | Own kernel per VM |
| Startup time | Milliseconds | Seconds to minutes |
| Image size | MB | GB |
| Performance | Near-native | 5–10% overhead (KVM) |
| Isolation | Namespace-level | Hardware-level |
| Best for | Microservices, CI/CD, dev environments | Multi-tenant security, different OS, legacy apps |
From kernel socket buffers to iptables to eBPF — how Linux moves packets.
Linux implements a full TCP/IP stack in the kernel. Data flows: NIC → driver → softirq → netif_receive_skb → protocol handlers (IP → TCP/UDP) → socket buffer → recv() syscall → user space. Each layer can be hooked by netfilter, tc, or eBPF.
- ip link show: list all interfaces
- ip addr add 10.0.0.1/24 dev eth0: assign IP
- ip route show: routing table
- ip -s link: RX/TX statistics
- Loopback (lo), Ethernet (eth0), WiFi (wlan0), bridge (br0), veth pairs

iptables is the classic Linux firewall and NAT tool, operating on netfilter hooks. Tables (filter, nat, mangle, raw) contain chains (INPUT, OUTPUT, FORWARD, PREROUTING, POSTROUTING) of rules processed in order. nftables is the modern replacement.

- iptables -L -n -v: list all rules with stats
- iptables -A INPUT -p tcp --dport 443 -j ACCEPT
- iptables -t nat -A POSTROUTING -o eth0 -j MASQUERADE

- net.core.somaxconn: listen backlog size
- net.ipv4.tcp_tw_reuse: reuse TIME_WAIT sockets
- net.core.rmem_max / wmem_max: max socket buffer
- net.ipv4.ip_local_port_range: ephemeral port range
- sysctl -w net.core.somaxconn=65535

- /etc/resolv.conf: nameserver list
- /etc/hosts: static hostname overrides
- /etc/nsswitch.conf: resolution order
- dig example.com / resolvectl query example.com

Each network namespace has its own interfaces, routes, iptables rules, and sockets. Containers use veth pairs: one end in the container namespace, one in a host bridge (docker0). ip netns exec mynamespace ip addr runs commands inside a namespace.
XDP (eXpress Data Path) attaches eBPF programs at the earliest point in the receive path — before the kernel allocates an sk_buff. This enables packet processing at 100 Gb/s+ line rate. Used by Cloudflare for DDoS mitigation and Facebook for load balancing at scale.
- ss -tulnp: listening sockets and owning processes
- tcpdump -i eth0 port 443: capture traffic
- traceroute / mtr: path and latency to host
- ethtool eth0: NIC speed, duplex, offload features
- iperf3 -s / iperf3 -c host: bandwidth test

| Use Case | Stack |
|---|---|
| Web server | NIC → kernel TCP → nginx/Apache (epoll) → TLS → HTTP |
| Container networking | veth pair → bridge → iptables NAT → physical NIC |
| Kubernetes service | iptables / eBPF (Cilium) load balancing across pod IPs |
| VPN | WireGuard (kernel module) / OpenVPN (tun device) |
| DDoS mitigation | XDP eBPF program → drop malicious packets before sk_buff alloc |
Forward ports securely through an encrypted SSH channel — local, remote, and dynamic modes.
Local forwarding (-L) binds a port on your local machine and forwards connections through the SSH server to a destination. From the destination's point of view, the traffic originates at the SSH server.

- ssh -L 8080:localhost:80 user@remote: local :8080 → remote :80
- ssh -L 5432:db.internal:5432 user@bastion: reach internal DB via bastion
- ssh -L 0.0.0.0:8080:target:80 user@remote: bind all interfaces locally
- -N: no remote command, tunnel only
- -f: go to background after authenticating

Remote forwarding (-R) binds a port on the remote SSH server and forwards connections back to your local machine (or any host reachable from it). It is useful for exposing a local service through a public server.

- ssh -R 9090:localhost:3000 user@remote: remote :9090 → local :3000
- ssh -R 0.0.0.0:80:localhost:8080 user@remote: publicly expose local :8080
- Set GatewayPorts yes in sshd_config to bind non-loopback addresses on the remote

Dynamic forwarding (-D) creates a local SOCKS4/5 proxy. Your SSH client acts as a dynamic port forwarder, so any application that supports a SOCKS proxy can route traffic through the SSH server.

- ssh -D 1080 user@remote: SOCKS proxy on localhost:1080
- Proxy URL for applications: socks5://127.0.0.1:1080
- curl --socks5 127.0.0.1:1080 http://target

Local (-L): you open a local port → traffic goes out through the remote.
Remote (-R): the remote opens a port → traffic comes back to you.
Think of it as which side initiates the listening socket.
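The argument order of the three modes is easy to mix up, so here is a small helper that assembles the command line. It is only a sketch: the function names are invented for this example, it builds an argument list without running anything, and it uses only the -L, -R, -D, and -N options described in this section.

```python
def forward_args(mode, listen_port, target_host=None, target_port=None):
    """Build the ssh option for a local, remote, or dynamic forward."""
    if mode == "dynamic":
        return ["-D", str(listen_port)]  # SOCKS proxy: no fixed target
    flag = {"local": "-L", "remote": "-R"}[mode]
    # The listening port comes first, then the target host and port.
    return [flag, f"{listen_port}:{target_host}:{target_port}"]

def tunnel_command(login, *forwards):
    """Combine one or more forwards into a tunnel-only ssh command."""
    argv = ["ssh", "-N"]  # -N: no remote command, tunnel only
    for forward in forwards:
        argv += forward
    return argv + [login]

cmd = tunnel_command(
    "user@bastion",
    forward_args("local", 5432, "db.internal", 5432),
)
print(" ".join(cmd))  # ssh -N -L 5432:db.internal:5432 user@bastion
```

For -L the listening port is on your machine, and for -R it is on the server. In both cases the host:port pair after it names where connections end up.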
autossh monitors the tunnel and restarts it if the connection drops. Ideal for long-lived remote tunnels on servers.
- autossh -M 0 -N -R 9090:localhost:3000 user@remote
- -M 0: disable autossh echo port, rely on SSH keepalives instead
- Add ServerAliveInterval 30 and ServerAliveCountMax 3 to ~/.ssh/config

Define tunnels in your SSH config so you don't have to type them every time.

- LocalForward 8080 localhost:80
- RemoteForward 9090 localhost:3000
- DynamicForward 1080
- ExitOnForwardFailure yes: fail if port can't be bound
- ControlMaster auto for connection reuse

ProxyJump (-J) lets you SSH through one or more intermediate hosts to reach a target that isn't directly reachable. It replaces the older ProxyCommand pattern.

- ssh -J bastion user@internal-host
- ssh -J hop1,hop2 user@target: chain multiple hops
- ProxyJump bastion.example.com
- Combine with -L or -R to tunnel through the jump chain

- AllowTcpForwarding no in sshd_config disables all forwarding
- PermitOpen host:port restricts which destinations are allowed
- ForceCommand /bin/false to disallow shell access on tunnel accounts

| Flag | Who Listens | Traffic Direction | Common Use |
|---|---|---|---|
| -L local:host:remote | Local machine | Local → Remote destination | Access internal services from your laptop |
| -R remote:host:local | Remote server | Remote → Local destination | Expose local dev server to the internet |
| -D port | Local machine (SOCKS) | Dynamic: any destination | Ad-hoc proxy / bypass restrictions |
| -J jumphost | N/A (proxy hop) | Through bastion to target | Reach hosts in private networks |