Stream processing purely in Python
Bytewax is an open source Python framework and distributed stream processing engine. Build streaming data pipelines and real-time apps with everything you need built in: recovery, scalability, windowing, aggregations, and connectors.
Leverage the Python Ecosystem
Build streaming data applications easily. In Python.
Easy install
> pip install bytewax
Stateful operations like windowing and aggregations
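from datetime import timedelta
import numpy as np
from bytewax.window import TumblingWindowConfig, SystemClockConfig

# Assign items to 1-second tumbling windows based on the system clock.
cc = SystemClockConfig()
wc = TumblingWindowConfig(length=timedelta(seconds=1))

def build_array():
    # Start each window with an empty array.
    return np.empty(0)

def insert_value(np_array, value):
    # Prepend each incoming value to the window's array.
    return np.insert(np_array, 0, value)

# `flow` is the Dataflow being built.
flow.fold_window("window", cc, wc, build_array, insert_value)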
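To make the windowing step above concrete, here is a minimal pure-Python sketch of what a tumbling-window fold computes. It is not Bytewax code: `fold_tumbling_windows` and the `(key, timestamp, value)` event shape are hypothetical, and windows are assigned from explicit timestamps rather than the system clock so the result is reproducible.

```python
from datetime import datetime, timedelta


def fold_tumbling_windows(events, length, align_to, builder, folder):
    """Fold (key, timestamp, value) events into fixed, non-overlapping windows.

    Hypothetical helper for illustration; not part of the Bytewax API.
    """
    accumulators = {}
    for key, timestamp, value in events:
        # Each timestamp falls into exactly one window of the given length.
        window_index = (timestamp - align_to) // length
        slot = (key, window_index)
        if slot not in accumulators:
            # The builder creates the starting accumulator for a new window.
            accumulators[slot] = builder()
        # The folder combines the accumulator with the next value.
        accumulators[slot] = folder(accumulators[slot], value)
    return accumulators


start = datetime(2023, 1, 1)
events = [
    ("user_a", start + timedelta(milliseconds=100), 1.0),
    ("user_a", start + timedelta(milliseconds=900), 3.0),
    ("user_a", start + timedelta(milliseconds=1200), 5.0),
    ("user_b", start + timedelta(milliseconds=300), 10.0),
]

windows = fold_tumbling_windows(
    events,
    length=timedelta(seconds=1),
    align_to=start,
    builder=list,
    folder=lambda acc, value: acc + [value],
)
```

Here the builder and folder play the same roles as `build_array` and `insert_value`: one creates the empty accumulator for a window, the other adds a value to it.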
Use the Python tools you are familiar with
import numpy as np
flow.map(lambda x: np.mean(x[1]))
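The lambda above indexes `x[1]`, which suggests each item is a `(key, values)` pair with the window's accumulated values in second position. As a sketch under that assumption, the same step can be written as a named function that keeps the key alongside the mean. `mean_of_window` is a hypothetical name, and `statistics.fmean` stands in for `np.mean` so the snippet runs with the standard library alone.

```python
from statistics import fmean


def mean_of_window(item):
    """Turn a (key, values) pair into a (key, mean) pair.

    Assumes the upstream step emits 2-tuples; hypothetical helper.
    """
    key, values = item
    return key, fmean(values)


# Stand-in for the items a windowing step might emit.
window_outputs = [("user_a", [1.0, 3.0]), ("user_b", [10.0])]
means = [mean_of_window(item) for item in window_outputs]
```

Keeping the key in the output is a design choice: the original lambda returns only the mean, which is fine when downstream steps do not need to know which key it belongs to.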
Connect to data sources
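from bytewax.inputs import KafkaInputConfig
from bytewax.outputs import KafkaOutputConfig

# Read events from the "web_events" topic.
flow.input(
    "events",
    KafkaInputConfig(
        brokers=["localhost:9092"],
        topic="web_events",
    ),
)
# Write results to the "ip_address_by_location" topic.
flow.capture(
    KafkaOutputConfig(
        brokers=["localhost:9092"],
        topic="ip_address_by_location",
    )
)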
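Between a Kafka source and sink, messages usually need to be decoded and re-encoded. The sketch below assumes items arrive as `(key, payload)` pairs of raw bytes and that the payload is JSON. The function names and the `ip` and `path` fields are hypothetical, chosen only to match the topic names above.

```python
import json


def decode_event(key_payload):
    """Parse a raw (key, payload) pair into (ip, event_dict).

    Assumes a UTF-8 JSON payload with an "ip" field (hypothetical schema).
    """
    _key, payload = key_payload
    event = json.loads(payload.decode("utf-8"))
    return event["ip"], event


def encode_event(key_event):
    """Serialize a (key, event_dict) pair back into bytes for the sink."""
    key, event = key_event
    return key.encode("utf-8"), json.dumps(event).encode("utf-8")


raw = (None, b'{"ip": "203.0.113.7", "path": "/home"}')
decoded = decode_event(raw)
encoded = encode_event(decoded)
```

In a dataflow, functions like these would typically be passed to `flow.map(...)` steps placed between the input and the capture.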
Run locally
> python my_dataflow.py
Deploy anywhere
> waxctl df deploy my_dataflow.py
Python Native
No JVM required! Leverage the entire Python ecosystem, from Jupyter notebooks and Hugging Face transformers to Streamlit.
Stateful
Stateful operators allow you to do advanced processing like joins, windows, and aggregations.
Recoverable
Built-in state recovery with multiple state backend options for critical workloads.
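The idea behind state recovery can be sketched in plain Python: persist the operator state together with the input position it corresponds to, then resume from that position after a failure. This is an illustration only, not Bytewax's recovery API. `process`, `stop_before`, and the JSON snapshot format are all hypothetical.

```python
import json


def process(events, snapshot=None, stop_before=None):
    """Count events per key, optionally resuming from a JSON snapshot.

    Returns a new JSON snapshot holding the counts and the next input offset.
    """
    saved = json.loads(snapshot) if snapshot else {"offset": 0, "counts": {}}
    offset, counts = saved["offset"], saved["counts"]
    for position in range(offset, len(events)):
        if position == stop_before:
            break  # Simulate a crash before this event is processed.
        key = events[position]
        counts[key] = counts.get(key, 0) + 1
        offset = position + 1
    # Store the state together with the input position it reflects.
    return json.dumps({"offset": offset, "counts": counts})


events = ["a", "b", "a", "c", "a"]
snapshot = process(events, stop_before=3)  # "crash" after three events
resumed = json.loads(process(events, snapshot=snapshot))
uninterrupted = json.loads(process(events))
```

Because the snapshot records the offset alongside the counts, the resumed run neither skips nor double-counts events and ends in the same state as a run that never failed.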
Native Connectors
Built-in integrations for Kafka, Redpanda, DynamoDB, BigQuery, and more...
Scalable
Bytewax can scale across thousands of workers to meet the most demanding workloads.
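One common way stateful stream processors spread work across workers is to route each item by a hash of its key, so all items for a key reach the same worker and its state stays local. The sketch below illustrates that general technique and is an assumption, not a description of Bytewax internals. `route` is a hypothetical helper.

```python
import zlib


def route(key, worker_count):
    """Pick a worker index for a key using a stable hash.

    zlib.crc32 is used because Python's built-in hash() for strings
    varies between processes.
    """
    return zlib.crc32(key.encode("utf-8")) % worker_count


worker_count = 4
keys = ["user_a", "user_b", "user_c", "user_a"]
assignments = [route(key, worker_count) for key in keys]
```

The first and last entries of `assignments` are always equal, since both come from `"user_a"`.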
Performant
The dataflow engine is written in Rust for performance.