Nursery rhyme aside, I’ve been looking avidly at Big Data Lambda Architectures. Nathan Marz introduced the term back in 2012, and it is reminiscent of λ-Calculus. The first time you hear it, it brings back memories of higher-order functions in programming languages (functional or imperative, applications or systems). It is a layered architectural style, similar in nature to Pipes and Filters, since it’s layered…but for Big Data.
The “lambda” portion of the term refers to data immutability at the Batch Layer (similar to pure functions). Quoting Nathan, “The batch layer needs to be able to do two things to do its job: store an immutable, constantly growing master dataset, and compute arbitrary functions on that dataset.”
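To make the quote a bit more concrete, here is a minimal sketch of those two responsibilities in plain Python. Everything in it is my own illustration, not Nathan’s code or any product’s API: `PageView`, `MasterDataset`, and `views_per_url` are hypothetical names, and an in-memory tuple stands in for what would really be distributed storage.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Callable, Iterable, Tuple, TypeVar

T = TypeVar("T")


@dataclass(frozen=True)
class PageView:
    """One immutable fact (hypothetical record type, for illustration only)."""
    url: str
    timestamp: int


class MasterDataset:
    """Append-only store: records are added, never updated or deleted."""

    def __init__(self) -> None:
        self._records: Tuple[PageView, ...] = ()

    def append(self, record: PageView) -> None:
        # Build a new tuple instead of mutating the existing one in place.
        self._records = self._records + (record,)

    def compute(self, fn: Callable[[Iterable[PageView]], T]) -> T:
        # A batch view is simply a function applied to the whole dataset.
        return fn(self._records)


def views_per_url(records: Iterable[PageView]) -> Counter:
    """Example of an 'arbitrary function': count page views per URL."""
    return Counter(r.url for r in records)


master = MasterDataset()
master.append(PageView("/home", 1))
master.append(PageView("/about", 2))
master.append(PageView("/home", 3))

batch_view = master.compute(views_per_url)
```

Because the records are frozen and the store only grows, `views_per_url` behaves like a pure function of the data: recomputing it over the same dataset always yields the same view, which is the resemblance to pure functions mentioned above.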
Lambda Architectures are not new, but I think that Nathan had the great idea of giving them a name. It reminds me of Plato’s Allegory of the Cave: while in the cave, we professionals have most likely seen a lambda architecture in one way or another, participated in developing one, or even created products with similar characteristics, but could not describe it fully. Nathan left the cave, saw the general aspects of the architectural style, and then returned and described the real thing, giving it a name that, whether you like it or not, is here to stay. Hey, it’s catchy and intriguing!
Lately I’ve been thinking that Lambda Architectures come in two flavors: Big Lambda Architectures (𝚲-Architecture), which is what Nathan describes, and Little Lambda Architectures (λ-Architecture). The difference has to do with how much the underlying technologies can scale; for a little lambda, one would choose smaller-scale technologies on purpose instead of “scaling down” Big Data products. Otherwise, what would we call architectures that have the same three layers and functions but don’t scale as much? I can think of analytics products that have evolved from little lambda to big lambda, too, so little lambdas must exist.
Take it with a little grain of data salt 😉