Have you found yourself wondering why there isn’t a better way to work with streaming data? Why isn’t there a framework that is performant and doesn’t require additional infrastructure or a giant compute cluster just to get started, but can still scale from a single machine to whatever craziness you have in mind? Why isn’t there something that takes advantage of the fastest-growing ecosystem of tools for analysis and transformation and lets you work with what you already know and understand?
Wonder no longer. Today we are extremely excited to open source Bytewax, a powerful new way to transform streaming data. Bytewax is a framework that enables Data Scientists and Data Engineers to quickly and effectively build with data streams.
When we started dreaming up a new streaming solution, we identified a combination we love: the fast-growing Python community and its powerful ecosystem of tools, paired with a highly performant and scalable Rust back end, to get the best of both worlds. All of this comes without the overhead of learning another programming language or maintaining another distributed system. It sounds too good to be true, right 😄?
Bytewax is a general-purpose data processing engine that enables fast and scalable stream processing. It is a Python library built on top of Timely Dataflow that is meant to empower software engineers, data engineers and ML engineers to author, debug and maintain stream processing pipelines. Bytewax facilitates real-time data transformations, windowing, complex aggregations, and more.
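To make "windowing" and "aggregations" a bit more concrete, here is a rough sketch in plain Python of a tumbling-window count over a stream of events. This is not Bytewax's API: the function name `tumbling_window_counts`, the `(timestamp, key)` event shape, and the assumption that events arrive in timestamp order are all made up for illustration. It only shows the kind of computation a stream processing engine handles for you.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, Optional, Tuple

# Hypothetical event shape for this sketch: (timestamp in seconds, key).
Event = Tuple[float, str]


def tumbling_window_counts(
    events: Iterable[Event], window_seconds: float
) -> Iterator[Tuple[float, Dict[str, int]]]:
    """Count events per key in fixed, non-overlapping time windows.

    Assumes events arrive in timestamp order (a simplification; real
    streams often need handling for late or out-of-order data).
    Yields (window_start, counts) each time a window closes.
    """
    current_start: Optional[float] = None
    counts: Dict[str, int] = defaultdict(int)

    for timestamp, key in events:
        # Align the timestamp to the start of its window.
        window_start = (timestamp // window_seconds) * window_seconds

        if current_start is None:
            current_start = window_start
        elif window_start != current_start:
            # The event belongs to a new window, so emit the finished one.
            yield current_start, dict(counts)
            counts = defaultdict(int)
            current_start = window_start

        counts[key] += 1

    # Emit the final, still-open window once the input is exhausted.
    if current_start is not None:
        yield current_start, dict(counts)


if __name__ == "__main__":
    sample = [(0.5, "a"), (1.2, "b"), (4.9, "a"), (5.0, "a"), (9.9, "b"), (12.0, "a")]
    for start, window_counts in tumbling_window_counts(sample, window_seconds=5):
        print(start, window_counts)
```

Because the function is a generator, it processes events one at a time and only keeps the current window's counts in memory, which is the basic idea behind processing a stream rather than a batch.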
We saw engineering teams invest a lot of time into infrastructure and systems just to start developing stream processing applications like fraud detection, network monitoring, or bot detection. Bytewax makes applications like these simple to develop and run locally, and the same code that runs locally can scale to multiple workers or processes in production without any code changes.
We made Bytewax because the data world needs a stream processing solution that is approachable, so even engineers who aren’t experts in stream processing can jump in and unlock the power of real-time data streams without spending months figuring out how to make it work.