AWS Lambda Under the Hood

Lambda Architecture

Lambda is a serverless compute service that lets users run code on demand without managing servers.
Lambda supports synchronous and asynchronous invocation models.
Lambda's tenets include availability, efficiency, scale, security, and performance.

The invoke request routing layer connects Lambda's microservices and provides availability, scale, and access to execution environments.
The worker manager reuses previously created sandboxes to reduce initialization latency.
The assignment service replaced the worker manager and keeps sandbox states in reliable, distributed, durable storage.
A newly added node can rebuild its state from the log, which significantly increases system availability and makes the system fully tolerant of single-host failures and Availability Zone events.
The distributed, consistent sandbox state is maintained per region, and a leader-follower architecture allows quick failovers.
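
The log-based recovery described above can be illustrated with a minimal sketch. It assumes an append-only journal of sandbox state transitions that any node can replay to rebuild its in-memory view. The names `SandboxLog` and `AssignmentNode` and the set of states are hypothetical and are not taken from the talk.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

# Hypothetical sandbox states; the talk does not enumerate them.
STATES = {"initializing", "idle", "busy", "dead"}


@dataclass
class SandboxLog:
    """Append-only journal of (sandbox_id, new_state) transitions."""
    entries: List[Tuple[str, str]] = field(default_factory=list)

    def append(self, sandbox_id: str, state: str) -> int:
        if state not in STATES:
            raise ValueError(f"unknown state: {state}")
        self.entries.append((sandbox_id, state))
        return len(self.entries)  # log length after this entry


@dataclass
class AssignmentNode:
    """In-memory view of sandbox states, derived purely from the log."""
    states: Dict[str, str] = field(default_factory=dict)
    applied: int = 0  # number of log entries already applied

    def catch_up(self, log: SandboxLog) -> None:
        # Replay only the entries this node has not seen yet.
        for sandbox_id, state in log.entries[self.applied:]:
            if state == "dead":
                self.states.pop(sandbox_id, None)
            else:
                self.states[sandbox_id] = state
        self.applied = len(log.entries)

    def find_idle(self) -> List[str]:
        # Idle sandboxes are the candidates for reuse on the next invoke.
        return sorted(s for s, st in self.states.items() if st == "idle")
```

In this sketch a replacement node starts empty, calls `catch_up` once, and ends up with the same view as a node that has been following the log all along, which is the property that makes a single host failure recoverable.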

The compute fabric owns all the infrastructure required to run code, including the worker fleets, the capacity manager, and placement, along with a data science team that supports smart decision-making.
The new service was written in Rust, which increased the efficiency and performance of every host, raised processing volume, and reduced overhead latency.

Data isolation is crucial to prevent interference between different functions running on the same worker.
Virtual machine isolation provides sufficient guarantees to run arbitrary code in a multi-tenant compute system.
Firecracker is a fast virtualization technology specifically designed for serverless compute needs, allowing multiplexing of thousands of functions from different customers on the same worker with consistent performance.
Firecracker provides strong isolation boundaries, is very fast with little system overhead, and decorrelates demand from resources, which gives better control of worker fleet heat.
A custom indirection layer enforces strict copy-on-read to eliminate shared memory and prevent security threats in a multi-tenant execution environment.
A callback interface was introduced so that code can restore its uniqueness after multiple VMs are resumed from the same snapshot.
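
The uniqueness problem can be shown with a toy model. If a random number generator's state is captured in a snapshot, every VM resumed from that snapshot produces the same "random" values until something reseeds it. The sketch below assumes a hypothetical `Sandbox` class with an `after_restore` callback list. It is an illustration of the idea, not the actual Lambda or Firecracker interface.

```python
import os
import random
from typing import Callable, List, Tuple


class Sandbox:
    """Toy stand-in for a microVM whose RNG state is captured in a snapshot."""

    def __init__(self) -> None:
        self.rng = random.Random(os.urandom(16))
        # Callbacks that guest code registers to run right after a resume.
        self.after_restore: List[Callable[[], None]] = [self.reseed]

    def reseed(self) -> None:
        # Pull fresh entropy so clones of one snapshot stop sharing RNG state.
        self.rng.seed(os.urandom(16))

    def snapshot(self) -> Tuple:
        return self.rng.getstate()

    @classmethod
    def resume(cls, snap: Tuple, run_callbacks: bool = True) -> "Sandbox":
        vm = cls()
        vm.rng.setstate(snap)  # every clone starts with identical state
        if run_callbacks:
            for callback in vm.after_restore:
                callback()
        return vm

    def request_id(self) -> str:
        return f"{self.rng.getrandbits(64):016x}"
```

With `run_callbacks=False`, two sandboxes resumed from the same snapshot return the same `request_id()`. With the callbacks enabled, each clone reseeds itself first and the values diverge.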

Snapshotting is used to reduce the cost of creating new execution environments by resuming VMs from snapshots instead of initializing them from scratch.
On-demand chunk loading reduces snapshot distribution time and improves performance.
Convergent encryption makes it possible to deduplicate common chunks across container images and increases cache locality.
Inefficient memory access is addressed by recording page access patterns and using them to optimize snapshot loading.
VM snapshots are available to users for Java functions through Lambda SnapStart.
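
The combination of convergent encryption, deduplication, and on-demand loading can be sketched as follows. In convergent encryption the key is derived from the chunk's own content, so identical plaintext chunks encrypt to identical ciphertext and can be stored once. Everything here is an assumption made for illustration: the 4-byte chunk size, the `ChunkStore` and `LazyImage` names, and in particular the toy SHA-256 keystream, which stands in for a real cipher and is not suitable for production use.

```python
import hashlib
from typing import Dict, List, Tuple

CHUNK_SIZE = 4  # hypothetical and tiny, so the demo data stays readable


def _keystream(key: bytes, length: int) -> bytes:
    """Toy keystream (SHA-256 in counter mode). Illustration only."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


def convergent_encrypt(chunk: bytes) -> Tuple[bytes, bytes]:
    key = hashlib.sha256(chunk).digest()  # key is derived from the content
    cipher = bytes(a ^ b for a, b in zip(chunk, _keystream(key, len(chunk))))
    return key, cipher


def convergent_decrypt(key: bytes, cipher: bytes) -> bytes:
    return bytes(a ^ b for a, b in zip(cipher, _keystream(key, len(cipher))))


class ChunkStore:
    """Content-addressed store: identical ciphertext is kept only once."""

    def __init__(self) -> None:
        self.blobs: Dict[str, bytes] = {}

    def put(self, cipher: bytes) -> str:
        address = hashlib.sha256(cipher).hexdigest()
        self.blobs.setdefault(address, cipher)
        return address

    def get(self, address: str) -> bytes:
        return self.blobs[address]


def upload(image: bytes, store: ChunkStore) -> List[Tuple[str, bytes]]:
    """Split an image into chunks and return a manifest of (address, key)."""
    manifest = []
    for offset in range(0, len(image), CHUNK_SIZE):
        key, cipher = convergent_encrypt(image[offset:offset + CHUNK_SIZE])
        manifest.append((store.put(cipher), key))
    return manifest


class LazyImage:
    """Fetches and decrypts a chunk only the first time it is read."""

    def __init__(self, manifest: List[Tuple[str, bytes]], store: ChunkStore) -> None:
        self.manifest = manifest
        self.store = store
        self.loaded: Dict[int, bytes] = {}

    def read_chunk(self, index: int) -> bytes:
        if index not in self.loaded:
            address, key = self.manifest[index]
            self.loaded[index] = convergent_decrypt(key, self.store.get(address))
        return self.loaded[index]
```

Uploading two images that share chunks stores each shared chunk once, and a `LazyImage` only pulls the chunks a reader actually touches, which is the intuition behind faster snapshot distribution.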

Firecracker uses a distributed cache in multiple availability zones to maintain a coherent cache of the configuration database, making lookups faster.
The speaker is open to discussing how Lambda functions can be built in a company's own data center during a follow-up talk.
The same techniques used in Firecracker could be used to make EBS snapshots faster, but it would require more work due to the complexity of hardware and virtualization layers.
Different services communicate with each other using a mixture of synchronous request-response communication and gRPC and HTTP/2 streams, depending on the requirements of the particular communication.
Firecracker uses metal instances because they meet the requirements of the system, while nested virtualization would be much slower.
During Lambda function updates, the previous function version is used until the snapshot of the updated function is finished, at which point the system switches to the latest version.
The engineering process balances security, efficiency, and latency, with security being the top priority.
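
The function-update behaviour noted above, where the previous version keeps serving until the snapshot of the updated function is finished, can be sketched as a small router. The `FunctionRouter` class and its method names are hypothetical and only illustrate the switching rule.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class FunctionRouter:
    """Routes invokes to a version whose snapshot is known to be ready."""
    active: str  # version currently serving traffic
    snapshot_ready: Dict[str, bool] = field(default_factory=dict)
    pending: Optional[str] = None

    def publish(self, version: str) -> None:
        # A new version is uploaded; its snapshot is still being created.
        self.pending = version
        self.snapshot_ready[version] = False

    def snapshot_finished(self, version: str) -> None:
        self.snapshot_ready[version] = True
        if version == self.pending:
            # Switch traffic only once the new snapshot exists.
            self.active = version
            self.pending = None

    def route(self) -> str:
        return self.active
```

Between `publish("v2")` and `snapshot_finished("v2")`, `route()` keeps returning the previous version, so invokes never wait on snapshot creation.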