Bauplan: Zero-copy, Scale-up FaaS for Data Pipelines
Chaining functions for longer workloads is a key use case for FaaS platforms in data applications. However, modern data pipelines differ significantly from typical serverless use cases (e.g., webhooks and microservices); this makes it difficult to retrofit existing frameworks due to structural constraints. In this paper, we describe these limitations in detail and introduce \textit{bauplan}, a novel FaaS programming model and serverless runtime designed for data practitioners. \textit{bauplan} enables users to declaratively define functional Directed Acyclic Graphs (DAGs) along with their runtime environments, which are then efficiently executed on cloud-based workers. We show that \textit{bauplan} achieves both better performance and a superior developer experience for data workloads by making the trade-off of reducing generality in favor of data-awareness.