In the world of system design, "The Cloud" is often synonymous with scalability. But for Netflix, "The Cloud" is only half the story. While AWS handles the heavy lifting of user metadata, recommendation engines, and UI processing, it doesn't serve a single byte of video.
When you hit Play, you are no longer communicating with a data center in Virginia or Ireland. Instead, you are talking to a custom-built, high-performance box sitting inside your own ISP's rack. This is Open Connect, Netflix’s globally distributed, purpose-built Content Delivery Network (CDN).
In this edition of System Design Deconstructed, we look at the architecture of the Open Connect Appliance (OCA), the "directed cache" philosophy, and how Netflix achieves nearly 800 Gbps of throughput from a single server.
1. The Strategic Shift: Why Build a Custom CDN?
Before 2012, Netflix relied on third-party CDNs like Akamai and Limelight. However, as Netflix’s traffic began to represent a double-digit percentage of global internet volume, the standard "demand-driven" CDN model reached its limit.
Traditional CDNs are general-purpose. They cache content based on popularity after it’s requested. If a user in Lagos requests a video that isn't in the local cache, the CDN fetches it from an origin server, causing latency.
Netflix realized they had an advantage third-party CDNs didn't: Predictability.
They know exactly what content is in their library.
They know the viewing habits of every region.
They control the client-side video player.
By building their own CDN, they shifted from a reactive model to a proactive one.
2. The Hardware: Open Connect Appliances (OCAs)
Netflix doesn't use generic rack servers. They design their own hardware, Open Connect Appliances (OCAs), and provide them free of charge to ISPs. These boxes come in two primary flavors:
The Storage OCA (The Library)
A 2U beast designed for massive capacity. These use high-density HDDs (up to several petabytes) to store the "Long Tail": the massive library of content that isn't trending but still needs to be available locally.
The Flash OCA (The Speedster)
A 1U system designed for maximum throughput. These use NVMe SSDs to serve the "Top 10" and trending shows. This is where the 100 Gbps and now 800 Gbps magic happens.
Hardware Specs (Approximate):
CPU: High-core count Intel or AMD (single socket preferred to reduce NUMA latency).
RAM: 256GB+ (primarily for OS kernel buffers and metadata).
Network: Up to 2x100GbE or even 400GbE interfaces in newer iterations.
Storage: All-flash (SSD) arrays for high-demand PoPs.
3. The Software Stack: Why FreeBSD?
The software running on an OCA is a masterclass in performance optimization. Netflix famously uses FreeBSD (specifically the -CURRENT branch) rather than Linux.
Why FreeBSD?
The choice comes down to the efficiency of the Network Stack. Netflix engineers work directly with the FreeBSD community to optimize how data moves from the disk to the network card.
Zero-Copy Architecture (sendfile()): In a typical OS, sending a file over a socket involves copying data from disk into kernel space, then into user space (the web server), then back into kernel space for the network card. Netflix uses the sendfile() system call, which tells the kernel to move data directly from the disk cache/buffer to the network interface, bypassing user space entirely.
Kernel-level TLS (kTLS): Encrypting massive amounts of traffic is CPU-intensive. By moving TLS encryption into the kernel, the OCA can encrypt data as it passes through the sendfile() pipeline. This eliminates the context-switching overhead between the application (Nginx) and the kernel.
Nginx: The web server of choice. It acts as the orchestration layer, handling HTTP requests from clients and handing off the heavy lifting to the FreeBSD kernel.
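To make the zero-copy idea concrete, here is a minimal sketch using Python's os.sendfile(), which wraps the same sendfile(2) system call. Netflix's actual stack is Nginx on FreeBSD, not Python; this only illustrates the data path (file descriptor to socket with no user-space copy), and the file contents and socket pair are purely for the demo.

```python
import os
import socket
import tempfile

def serve_file_zero_copy(path: str, sock: socket.socket) -> int:
    """Send a file over a socket without copying it through user space.

    os.sendfile() wraps the kernel's sendfile(2): bytes move from the
    page cache straight to the socket buffer, skipping the read()/write()
    round trip through the application.
    """
    with open(path, "rb") as f:
        size = os.fstat(f.fileno()).st_size
        sent = 0
        while sent < size:
            sent += os.sendfile(sock.fileno(), f.fileno(), sent, size - sent)
    return sent

# Demo: push a small temp file through a local socket pair.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"segment-bytes" * 1000)
    path = tmp.name

a, b = socket.socketpair()
n = serve_file_zero_copy(path, a)
a.close()  # EOF for the reader

received = b""
while chunk := b.recv(65536):
    received += chunk
b.close()
os.unlink(path)
print(n, len(received))
```

On an OCA, the receiving end is a client across the ISP's network rather than a local socket pair, and with kTLS the kernel encrypts each segment in this same pipeline before it reaches the NIC.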
4. The Directed Cache Philosophy
Most CDNs use "Least Recently Used" (LRU) eviction policies. If the cache is full, they delete the oldest item. Open Connect uses Directed Caching.
Every night, during "Fill Windows" (when local ISP traffic is lowest), the Open Connect control plane (running on AWS) calculates what content needs to be where.
If Stranger Things Season 5 drops tomorrow, the control plane pushes those bits to every OCA worldwide before anyone asks for them.
If a show loses popularity in Japan but spikes in Brazil, the OCAs in São Paulo are updated while the ones in Tokyo make room for something else.
This proactive replication ensures a Cache Hit Ratio of >95%, significantly higher than general-purpose CDNs.
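The contrast with LRU can be sketched as a cache that never evicts on demand at all: it simply replaces its contents with whatever the nightly fill plan dictates. The class and method names below are illustrative, not Netflix's, and the capacity and titles are toy values.

```python
class DirectedCache:
    """Toy model of an OCA's content store under directed caching.

    Instead of evicting on demand (LRU-style), the control plane sends
    a nightly placement plan; the appliance drops anything not in the
    plan and pre-fetches everything that is.
    """

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.store: set[str] = set()

    def apply_fill_plan(self, plan: list[str]) -> None:
        # Keep only the titles the control plane wants here,
        # best-ranked first, up to local capacity.
        self.store = set(plan[: self.capacity])

    def serve(self, title: str) -> str:
        # A hit is served locally; a miss falls back to a peer or origin.
        return "HIT" if title in self.store else "MISS -> fetch from peer/origin"

# Nightly fill window: the control plane predicts tomorrow's demand.
oca_sao_paulo = DirectedCache(capacity=2)
oca_sao_paulo.apply_fill_plan(["stranger-things-s5", "brazil-trending-show"])

print(oca_sao_paulo.serve("stranger-things-s5"))  # a hit before anyone asked
print(oca_sao_paulo.serve("old-drama-series"))
```

The key difference from LRU is that the miss path is rare by construction: because the plan is computed from predicted demand, most requests hit content that was pushed hours before the first viewer asked for it.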
5. Traffic Steering: How "Play" Finds the Box
When a user clicks "Play," the Netflix app doesn't just "guess" which server to use. It involves a sophisticated steering dance:
BGP (Border Gateway Protocol): OCAs establish BGP sessions with the ISP’s routers. They "announce" the IP ranges they are capable of serving.
The Control Plane: The OCA reports its health, load, and content list to the Cache Control Service in AWS.
The Steering Service: When you click play, your client sends a request to AWS. The Steering Service looks at your IP, finds the OCAs that have advertised your prefix via BGP, checks which ones are least loaded and closest to you, and returns a manifest of URLs pointing directly to those specific OCAs.
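The steering steps above can be sketched as a simple ranking function: match the client's IP against the prefixes each OCA advertised via BGP, drop unhealthy or overloaded boxes, and rank the rest by load. The OCA records, thresholds, and URLs here are hypothetical stand-ins for the real control-plane state.

```python
import ipaddress

# Hypothetical snapshot of OCA state as reported to the control plane in AWS.
OCAS = [
    {"id": "oca-isp-1", "prefixes": ["203.0.113.0/24"], "load": 0.35, "healthy": True},
    {"id": "oca-isp-2", "prefixes": ["203.0.113.0/24"], "load": 0.80, "healthy": True},
    {"id": "oca-ix-1",  "prefixes": ["198.51.100.0/24"], "load": 0.10, "healthy": True},
]

def steer(client_ip: str, k: int = 2) -> list[str]:
    """Return a ranked manifest of OCA URLs for this client.

    Candidates must have advertised a BGP prefix covering the client's IP,
    be healthy, and have headroom; the least-loaded boxes win.
    """
    ip = ipaddress.ip_address(client_ip)
    candidates = [
        o for o in OCAS
        if o["healthy"] and o["load"] < 0.95
        and any(ip in ipaddress.ip_network(p) for p in o["prefixes"])
    ]
    candidates.sort(key=lambda o: o["load"])
    return [f"https://{o['id']}.oca.example.net/" for o in candidates[:k]]

print(steer("203.0.113.42"))
```

The real service also weighs network distance, content availability on each box, and fill state, but the shape is the same: the decision is made centrally in AWS, while every subsequent video byte flows directly from the chosen OCA.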
6. Pushing the Limits: 800 Gbps and NUMA
As of 2022/2023, Netflix has been testing and deploying servers capable of 800 Gbps of throughput. Achieving this requires overcoming NUMA (Non-Uniform Memory Access) bottlenecks.
On multi-socket motherboards, a CPU might try to access RAM or an SSD attached to the other CPU. This causes latency spikes. Netflix optimizes for NUMA Locality: ensuring that the CPU core handling a network interrupt is the same one that "owns" the memory and PCIe lanes for the NIC and the SSD being accessed.
By pinning threads and memory to specific NUMA domains, they ensure the data path is as short and "local" as possible. Furthermore, by using NIC kTLS offload, the encryption happens on the network card itself, freeing up the CPU for even higher traffic volumes.
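Thread pinning of this kind can be sketched with the scheduler-affinity syscalls. This is a Linux-flavored illustration (Netflix does the equivalent in the FreeBSD kernel and Nginx), and the assumption that "node 0 owns the NIC" is purely hypothetical.

```python
import os

def pin_to_cpus(cpus: set[int]) -> set[int]:
    """Pin the current process to the given CPU set and read it back.

    In an OCA-style setup, the CPUs chosen would be the cores local to
    the NUMA domain that owns the NIC's and SSD's PCIe lanes, so that
    interrupts, memory, and DMA all stay on one socket.
    """
    os.sched_setaffinity(0, cpus)      # 0 = the current process
    return os.sched_getaffinity(0)

# Assume (hypothetically) that NUMA node 0 owns the NIC:
# keep the hot data path on CPU 0.
print(pin_to_cpus({0}))
```

Pinning alone is not enough at 800 Gbps; memory allocations and interrupt routing must follow the same locality policy, which is why Netflix's changes reach into the FreeBSD kernel rather than stopping at process affinity.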
7. Sustainability and Efficiency
Netflix has recently shifted focus toward Power Efficiency. In 2023, they reached a milestone of a 100 Gbps server consuming only 100W of power using NVIDIA BlueField-3 SmartNICs. This isn't just about speed anymore; it's about the "efficiency of the bit."
Summary: Lessons for System Designers
Netflix Open Connect teaches us three vital lessons in large-scale architecture:
Efficiency over Generality: If you have a specific workload (video streaming), a general-purpose tool (standard CDN) will never be as efficient as a custom-built one.
Push Logic to the Edge: By moving content to the ISP’s basement, Netflix eliminates the "Middle Mile" of the internet, reducing costs for both themselves and the ISPs.
OS Matters: When you are operating at the limits of hardware, the choice of Operating System and your ability to tweak the kernel (as they do with FreeBSD) becomes a competitive advantage.
References & Further Reading
For those who want to dive deeper into the actual engineering logs and conference talks from the Netflix team, these are the essential resources:
Netflix Tech Blog: Driving Content Delivery Efficiency - A deep dive into how Netflix classifies cache misses to optimize their delivery logic.
Netflix Tech Blog: Serving Video at 800Gb/s - The definitive technical breakdown of the hardware and software optimizations required for near-terabit speeds.
EuroBSDCon: Serving Netflix Video at 400Gb/s on FreeBSD - Slides and paper by Drew Gallatin detailing NUMA optimizations and kTLS.
Open Connect Overview (Official PDF) - The high-level architectural overview provided to ISP partners.
Netflix Tech Blog: Content Popularity for Open Connect - How Netflix uses data science to decide which bits go to which boxes.