Reduce your TCO by 80% compared to Apache Flink®

Stream processing
as easy as Py

Build real-time streaming pipelines 5x faster with 80% lower TCO and deliver cutting-edge AI use cases in any environment, from edge to cloud.

Start with > pip install bytewax
and scale with our platform!

Developer start Open Source

Trusted by innovative organizations

Overview

What is Bytewax?

Bytewax is a complete data processing solution. It combines our core open source library with powerful modules and connectors that extend it, plus a capable platform for orchestrating and governing your data processing.

Learn more

Python-Native Stream Processing

Build your pipelines in Python

Write more powerful data streaming pipelines in fewer lines of code! Bytewax is a Python-native stateful stream processor that makes it easy to set up and deploy your dataflows in Python. Tap into Python's vast library ecosystem to perform advanced data transformations far beyond the reach of SQL, while Bytewax abstracts much of the underlying complexity and handles the difficult parts for you.

Get started

from datetime import timedelta
import numpy as np
from bytewax.operators import window as wop
from bytewax.operators.window import TumblingWindow, SystemClockConfig

# Use the system clock and one-second tumbling windows
cc = SystemClockConfig()
wc = TumblingWindow(length=timedelta(seconds=1))

def build_array():
    # Start each window with an empty array
    return np.empty(0)

def insert_value(np_array, value):
    # Prepend each incoming value to the window's array
    return np.insert(np_array, 0, value)

# `stream` is a keyed stream defined earlier in the dataflow
windowed_stream = wop.fold_window("window", stream, cc, wc, build_array, insert_value)
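
To make the idea behind the snippet above concrete, here is a minimal pure-Python sketch of what a tumbling-window fold computes. It is not the Bytewax API or its implementation: the function `tumbling_fold`, its parameters, and the sample events are hypothetical, and it windows on explicit event timestamps instead of the system clock.

```python
from collections import defaultdict
from datetime import datetime, timedelta, timezone


def tumbling_fold(events, length, align_to, builder, folder):
    """Fold (timestamp, value) pairs into fixed-length, non-overlapping windows.

    Hypothetical helper for illustration only; not part of Bytewax.
    """
    windows = defaultdict(builder)
    for ts, value in events:
        # Integer index of the window this timestamp falls into.
        idx = (ts - align_to) // length
        windows[idx] = folder(windows[idx], value)
    return dict(windows)


start = datetime(2024, 1, 1, tzinfo=timezone.utc)
events = [
    (start + timedelta(milliseconds=100), 1.0),
    (start + timedelta(milliseconds=900), 2.0),
    (start + timedelta(milliseconds=1500), 3.0),
]
result = tumbling_fold(
    events,
    length=timedelta(seconds=1),
    align_to=start,
    builder=list,
    folder=lambda acc, v: acc + [v],
)
print(result)  # {0: [1.0, 2.0], 1: [3.0]}
```

The builder creates the empty accumulator for each window and the folder merges one value into it, which mirrors the roles of `build_array` and `insert_value` above.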

Modules

Extend Bytewax with
connectors, operators, and E2E dataflows

Our Module Hub extends the open source dataflow framework with pre-built connectors to countless sources and sinks, advanced operators, and E2E dataflows.

Explore modules

Deployment

Deploy with
a single command `$ waxctl dataflow deploy my_dataflow.py`

Ease of deployment is a critical aspect of enabling agile development within a CI/CD framework. By using our command-line interface, waxctl, you can seamlessly deploy the same code you wrote and tested locally across a cluster of machines with a single command.

Get waxctl

Deploy dataflows anywhere, from edge to cloud:

Pure Python

Run Bytewax anywhere Python runs, even in Jupyter Notebook or Google Colab

Learn more

Virtual Machine

With just one command, deploy and manage dataflows on AWS EC2, GCP, or Azure.

Learn more

Kubernetes

Deploy on Kubernetes in the cloud, on-premises, locally, or on edge devices

Learn more

Bytewax Platform

Secure, scale, and manage
your dataflows with the Bytewax Platform

Secure, scale, manage, and operate your dataflows with the Bytewax Platform. Enhance your data streaming operations with robust observability, advanced management APIs, disaster recovery, cloud backup, and autoscaling capabilities. Streamline your data processes and ensure high availability and resilience effortlessly.

Discover platform

Ships with powerful orchestration & governance features:

Cloud Backup

Extended Disaster Recovery

The Platform’s cloud backup feature automatically saves local worker states to secure cloud storage like AWS S3. In case of system failures, you can quickly recover your data streams and resume operations from saved checkpoints, ensuring effective disaster recovery.
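
As a rough illustration of checkpoint-based recovery in general, the sketch below snapshots a running total to a file and resumes from the last snapshot after a simulated crash. It is an assumption-laden toy, not the Platform's implementation: the `run` function, the JSON file format, and the local temporary directory standing in for cloud storage are all hypothetical.

```python
import json
import tempfile
from pathlib import Path


def run(items, checkpoint_path, every=2, fail_at=None):
    """Sum items, snapshotting progress so a restart can resume.

    Hypothetical helper for illustration only; not part of Bytewax.
    """
    state = {"offset": 0, "total": 0}
    if checkpoint_path.exists():
        # Resume from the last saved snapshot instead of starting over.
        state = json.loads(checkpoint_path.read_text())
    for offset in range(state["offset"], len(items)):
        if fail_at is not None and offset == fail_at:
            raise RuntimeError("simulated worker crash")
        state["total"] += items[offset]
        state["offset"] = offset + 1
        if state["offset"] % every == 0:
            # A local file stands in for cloud storage here.
            checkpoint_path.write_text(json.dumps(state))
    return state["total"]


items = [1, 2, 3, 4, 5]
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "checkpoint.json"
    try:
        run(items, path, fail_at=3)  # crashes before processing offset 3
    except RuntimeError:
        pass
    total = run(items, path)  # resumes from the snapshot taken at offset 2
print(total)  # 15
```

Work done after the last snapshot is replayed on restart, so the final result matches an uninterrupted run.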

Learn more

Management Dashboard

Deploy, manage and debug dataflows

The Bytewax Management Dashboard provides a user-friendly interface for managing and monitoring dataflows in Kubernetes setups, facilitating easy deployment, management, and real-time tracking of data streams.

Learn more

Modules

Plug-and-Play Extensions

Bytewax takes a modular approach. As part of the Platform, you gain access to a wide range of premium connectors, custom operators, and pre-built end-to-end dataflows for your use case.

Learn more

Bytewax vs. Flink

Developer-friendly stream processing
for Python - 100% JVM free

Bytewax features a modern architecture that combines the performance of a Rust engine for distributed, parallel streaming with the ease of use of Python. The outcome is a stateful stream processor that rivals the functionality and performance of traditional Java-based tools like Flink, without any of the drawbacks. Enable all your Python teams to work with streaming!

Compared to Apache Flink®:*

Bytewax dramatically reduces your Total Cost of Ownership (TCO) by 5x on average

For table-stakes stream processing use cases in Python, Bytewax can accelerate time to production by up to 8x

Bytewax is radically more memory efficient than Flink, using 7x–25x less memory for common streaming workloads

*Source: Data Stream Processing Ease of Use and TCO, McKnight Consulting Group (2024)

Loved by developers
working on:

GenAI

Build real-time feature pipelines and generate embeddings to vector DBs

Learn more

Machine Learning

Use real-time streaming data with the leading ML libraries in Python

Learn more

IoT

Bring the power of stateful streaming to air-gapped or edge environments

Learn more

Developer voices

Loved by the data community 💛

We went from 5 days of training to 5 minutes DIY. Anyone with a limited Python background can just get going immediately. A defensible 10x reduction in infrastructure cost.

Nathan Verrill, Senior Principal Machine Learning Engineer

I have a lot of fun integrating Bytewax into my projects. It brings a lot of value in removing the resistance to streaming technologies in Python’s ecosystem. Before tools like Bytewax, using a streaming engine was a real headache. It’s totally exciting to be part of this movement 🔥

Paul Iusztin, Senior ML Engineer

We use Flink a lot internally, but after picking up Bytewax we are looking for more and more real-time ML workloads to use Bytewax with because we find it to be more accessible and faster to set up than Flink

Leading Aerospace Company

We have been using Bytewax for well over a year and are incredibly happy with the performance and support. I was able to ship the real-time analysis feature we needed at Hark in under a week and it’s been delightful to work with the Bytewax team.

Erlend Faxvaag Johnsen, Data Engineer @ Hark

Bytewax is simple enough that we can quickly prove ahead of time that we can solve a problem and then use the same tool to scale it and move it to production.

CTO, Engineering Consultancy

Libraries like Bytewax 🐝 expose a pure Python API on top of a highly-efficient language like Rust. So you get the best of both worlds: Rust's speed and performance, plus Python's rich ecosystem of libraries.

Pau Labarta Bajo, ML Engineer / Data Scientist

I was a fan of batch things, but after I discovered how easy it is to implement streaming pipelines with Bytewax, I changed my mind 😅

Alex Vesa, Senior AI Engineer

The key difference between Apache Spark and Bytewax for me teaching my class on ML systems is that it takes me around six lectures to bring students up to the level where they can begin utilizing Spark. However, I only need one lecture to do the same with Bytewax.

RJ Nowling, Associate Professor, Electrical Engineering and Computer Science, Milwaukee School of Engineering

Python alone is not a language designed for speed 🐢, which makes it unsuitable for real-time processing. Because of this, real-time feature pipelines were usually written with Java-based tools like Apache Spark or Apache Flink. However, things are changing fast with the emergence of Rust 🦀 and libraries like Bytewax 🐝 that expose a pure Python API on top of a highly-efficient language like Rust.

Soufiene Yakoubi, AI/DeepLearning/Computer Vision Researcher

Setting up Bytewax was incredibly straightforward, allowing us to go from pip to a fully operational dataflow in just minutes, without the hassle of complex build files and classpath issues found in JVM-based solutions. Remarkably, our production deployments have been rock-solid, seamlessly indexing multiple blockchains with unwavering reliability and no drama.

Aris Koliopoulos, CTO @ Rated Network

We have used Bytewax to develop a recommender system for a video streaming platform. I like to think in MapReduce terms when I do data processing, so I was super happy to find that Bytewax does precisely what I need and is easy to deploy and support.

Danil Petrov, Lead AI Engineer
