**Bare metal** refers to running workloads directly on physical server hardware, with no hypervisor, no virtual machine layer, and no container runtime between the application and the underlying CPUs, GPUs, memory, and storage. The operating system runs directly on the hardware, and the application has exclusive, unshared access to all resources.

---

### First Principle: Every abstraction layer costs performance. Bare metal has zero abstraction tax.

[[VMs]] add a hypervisor. [[Docker Containers|Containers]] add a runtime. [[MIGs]] add GPU partitioning logic. Each layer provides flexibility and isolation at the cost of some overhead, typically 2–15% depending on the workload. Bare metal eliminates all of it, delivering the full theoretical performance of the hardware.

---

### Key Considerations

- **Maximum Performance**: Bare metal provides the lowest latency and highest throughput for any given hardware configuration. This matters most for latency-sensitive workloads (real-time inference, high-frequency trading) and throughput-bound workloads (large-scale AI training).
- **No Multi-Tenancy**: A bare metal server serves one tenant at a time. This is the tradeoff: you get full performance but lose the ability to share hardware across users. See [[multi-tenancy]].
- **Operational Complexity**: Without virtualisation, provisioning and reprovisioning bare metal is slower (minutes to hours vs seconds for containers). Tools like MAAS, Ironic, and PXE boot help automate this.
- **GPU Access**: Bare metal provides direct GPU access without virtualisation overhead, which is critical for training workloads where every percentage point of GPU utilisation translates to hours saved on multi-day training runs.
- **Security Model**: The tenant controls the full stack from the kernel up. This avoids hypervisor escape risks but requires the tenant to manage their own OS security.
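The abstraction tax above can be made concrete with a small back-of-the-envelope calculation. This is an illustrative sketch, not a measurement: the 5% overhead and the 72-hour run length are assumed example values chosen from within the 2–15% range cited above.

```python
def overhead_hours(bare_metal_hours: float, overhead_fraction: float) -> float:
    """Extra wall-clock hours a virtualised run takes versus bare metal,
    assuming the overhead uniformly reduces effective throughput."""
    virtualised_hours = bare_metal_hours / (1.0 - overhead_fraction)
    return virtualised_hours - bare_metal_hours

# Assumed example: a 72-hour bare-metal training job with a 5% virtualisation tax.
extra = overhead_hours(72.0, 0.05)
print(f"Extra time vs bare metal: {extra:.1f} hours")
```

At a 5% tax, the same job takes nearly four extra hours; at the top of the cited range (15%), the penalty grows to roughly half a day on a three-day run.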
---

### Actionable Insights

In [[Modular Data Center Design Principles|modular data center]] deployments, bare metal is the right choice for dedicated AI training clusters where a single customer or workload owns the hardware full-time. For inference serving and shared environments, layer [[Docker Containers|containers]] or [[MIGs|MIG partitions]] on top of bare metal to improve utilisation. The [[Amax Hardware Set up|hardware configuration]] (e.g., AMAX 2U nodes with AMD processors) should be spec'd for bare metal first; virtualisation and containerisation layers can always be added, but bare metal performance is the ceiling.

---

### Where Bare Metal Sits

```
[[VLSI]] (transistors on silicon)
  → Bare Metal (physical server, direct OS)   ← you are here
    → [[VMs]] (hardware virtualisation)
      → [[Docker Containers]] (OS-level virtualisation)
        → [[MIGs]] (GPU partitioning)
```

Moving up trades performance for flexibility and sharing.

[[VMs]] | [[Docker Containers]] | [[MIGs]] | [[Amax Hardware Set up]] | [[multi-tenancy]] | [[Clustering]]
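The utilisation argument for layering MIG partitions on bare metal can be sketched numerically. This is an illustrative assumption-laden example: the figure of 7 slices per GPU reflects the maximum MIG instance count on NVIDIA A100/H100-class hardware, and the 20-service workload count is invented for the example.

```python
def gpus_needed(workloads: int, slices_per_gpu: int = 1) -> int:
    """GPUs required to host `workloads` single-slice inference services,
    using ceiling division so partial GPUs still count as whole ones."""
    return -(-workloads // slices_per_gpu)

services = 20  # assumed number of small inference services
print("Dedicated GPUs (no partitioning):", gpus_needed(services, 1))
print("MIG-partitioned GPUs (7 slices): ", gpus_needed(services, 7))
```

Under these assumptions, partitioning cuts the fleet from 20 GPUs to 3, which is the utilisation gain that motivates layering MIG on top of bare metal for inference while keeping training nodes dedicated.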