Introducing bytewax-redis: Unlocking Real-Time AI with Redis and Bytewax

By Zander Matheson

As the CEO of Bytewax, I'm thrilled to announce the release of our latest module: bytewax-redis.

Modules are connectors and operators that enhance Bytewax with faster development with pre-built modules and more enhanced processing with advanced operators. For more details, you can visit the modules page on our website.

This integration combines the AI-native stream processing capabilities of Bytewax with the accessibility and flexibility of Redis. A great mix for feature pipelines and live inference.

Why Redis?

Redis has long been known for its speed, simplicity, and versatility as an in-memory data store. Its support for data structures like strings, hashes, lists, sets, and streams makes it an ideal candidate for a feature store—a centralized repository for serving up-to-date feature values to machine learning models in real-time.

By leveraging Redis as a feature store, organizations can:

  • Accelerate Model Inference: Serve features to models with minimal latency.
  • Simplify Architecture: Use a single system for both data storage and message streaming.
  • Enhance Scalability: Easily scale Redis to handle increasing workloads.

bytewax-redis: The Perfect Pairing

Our new bytewax-redis module allows you to seamlessly integrate Redis streams and key-value stores into your Bytewax dataflows. Whether you're consuming data from Redis streams, writing processed data back to them, or interacting with Redis' key-value storage, bytewax-redis has you covered.

If you are looking for the code shown in this post, you can find it in a single file in the examples directory in the GitHub repository.

Key Features

  • Redis Stream Source: Consume data from Redis streams as part of your dataflow.
  • Redis Stream Sink: Output processed data to Redis streams.
  • Redis Key-Value Sink: Write data directly to Redis key-value stores.

Getting Started: A Quick Example

Let's dive into a practical example that showcases how to use bytewax-redis in your Bytewax dataflows. We'll walk through three separate dataflows:

  1. Stream Producer: Writes data to a Redis stream.
  2. Stream Consumer: Reads data from the Redis stream.
  3. Key-Value Producer: Writes key-value pairs to Redis.

Setting Up

First, ensure you have Redis running locally or update the REDIS_HOST and REDIS_PORT variables to point to your Redis instance. I like to use redis stack which comes with a UI for debugging.

docker run -d --name redis-stack -p 6379:6379 -p 8001:8001 redis/redis-stack:latest

wind1.png 🔍 Click to enlarge image

Next, you can install the module with pip.

pip install bytewax-redis

Now, in a dataflow file titled redis_connector.py you can configure the environment variables of a remote Redis instance or the local one you just ran in docker.

import os

REDIS_HOST = os.environ.get("REDIS_HOST", "localhost")
REDIS_PORT = int(os.environ.get("REDIS_PORT", 6379))
REDIS_DB = int(os.environ.get("REDIS_DB", 0))
REDIS_STREAM_NAME = os.environ.get("REDIS_STREAM_NAME", "example-stream")

Stream Producer Dataflow

This dataflow writes a series of dictionaries to a Redis stream.

from bytewax import operators as op
from bytewax.bytewax_redis import RedisStreamSink
from bytewax.dataflow import Dataflow
from bytewax.testing import TestingSource

stream_producer_flow = Dataflow("redis-stream-producer")
stream_data = [
    {"field-1": 1},
    {"field-1": 2},
    {"field-1": 3},
    {"field-1": 4},
    {"field-2": 1},
    {"field-2": 2},
]
stream_inp = op.input("test-input", stream_producer_flow, TestingSource(stream_data))

op.inspect("redis-stream-writing", stream_inp)
op.output(
    "redis-stream-out",
    stream_inp,
    RedisStreamSink(REDIS_STREAM_NAME, REDIS_HOST, REDIS_PORT, REDIS_DB),
)

Run the Stream Producer:

python -m bytewax.run redis_connector.py:stream_producer_flow

Stream Consumer Dataflow

We can read the data from Redis Streams in a new section of the same dataflow titled redis_connector.py. This dataflow simply reads the data from the Redis stream and prints it.

from bytewax.bytewax_redis import RedisStreamSource

consumer_flow = Dataflow("redis-consumer")
consumer_inp = op.input(
    "from_redis",
    consumer_flow,
    RedisStreamSource([REDIS_STREAM_NAME], REDIS_HOST, REDIS_PORT, REDIS_DB),
)
op.inspect("received", consumer_inp)

Run the Stream Consumer:

To run the consumer, we point bytewax to run our consumer_flow entrypoint in the same file.

python -m bytewax.run redis_connector.py.py:consumer_flow

Key-Value Producer Dataflow

We can also showcase a final capability that will write data to the Sink in the same dataflow. This dataflow simply writes key-value pairs to Redis.

from bytewax.bytewax_redis import RedisKVSink

kv_producer_flow = Dataflow("redis-kv-producer")
kv_data = [
    ("key-1", 1),
    ("key-1", 2),
    ("key-2", 1),
    ("key-1", 3),
    ("key-2", 5),
]
kv_inp = op.input("test-input", kv_producer_flow, TestingSource(kv_data))

op.inspect("redis-kv-writing", kv_inp)
op.output("redis-kv-out", kv_inp, RedisKVSink(REDIS_HOST, REDIS_PORT, REDIS_DB))

Run the Key-Value Producer:

python -m bytewax.run redis_connector.py:kv_producer_flow

Real-Time AI and Machine Learning with Redis and Bytewax

By combining Bytewax's robust data processing with Redis's speed and versatility, you can build systems that respond to data in real-time, a critical requirement for modern AI and machine learning applications.

Use Cases

  • Feature Engineering Pipelines: Compute and update features in real-time as new data arrives.
  • Online Learning Systems: Train models incrementally using streaming data.
  • Real-Time Analytics: Perform live analytics and update dashboards instantly.

Conclusion

The release of bytewax-redis marks a significant step toward simplifying the development of real-time AI and machine learning applications. We're excited to see what you'll build with this powerful integration.

Get Started Today

To start using bytewax-redis, install it via pip:

pip install bytewax-redis

Check out our GitHub repository for more examples and documentation.

More on Redis ❤️ Bytewax

Blog: Redis-driven Dataflow for Clickstream Aggregation

Video: Bytewax + Redis | Real-time streaming for AI


🐝 Thank you for being a part of the Bytewax community. We're committed to making real-time data processing accessible and efficient for developers everywhere. If you like this content, subscribe to us on substack or give us a ⭐️ on GitHub!

Stay updated with our newsletter

Subscribe and never miss another blog post, announcement, or community event.

Previous post
Zander Matheson

Zander Matheson

CEO, Founder
Zander is a seasoned data engineer who has founded and currently helms Bytewax. Zander has worked in the data space since 2014 at Heroku, GitHub, and an NLP startup. Before that, he attended business school at the UT Austin and HEC Paris in Europe.
Next post