Bytetalks ep3: Build Streaming Pipelines in Python with Bytewax

By Zander Matheson & Laura Funderburk

Welcome back to our weekly 🐝 Bytetalks series, where we explore the latest in streaming analytics and real-time data flows with Python and Bytewax. In this episode, Laura Funderburk & Zander Matheson present the Bytewax Cheatsheet, a comprehensive guide to building efficient, real-time data flows.

We’ll cover key concepts like data parallelism, clustering, partitioning, and recovery, providing insights into how Bytewax manages data streams behind the scenes.
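To make the partitioning idea concrete before you watch, here is a minimal, standard-library-only sketch of key-based routing. It is not Bytewax's internal implementation, and the names `route`, `partition`, and the sample `events` are hypothetical, chosen only for illustration. The assumption it demonstrates is a common one in streaming systems: if every event with the same key lands on the same worker, each worker can keep its own per-key state without coordinating with the others.

```python
import zlib
from collections import defaultdict


def route(key: str, worker_count: int) -> int:
    """Pick a worker index for a key using a stable hash (illustrative only)."""
    # zlib.crc32 gives the same value in every process, unlike the built-in
    # hash() for strings, which is randomized per interpreter run.
    return zlib.crc32(key.encode("utf-8")) % worker_count


def partition(events, worker_count):
    """Group (key, value) events by the worker that would own each key."""
    assignments = defaultdict(list)
    for key, value in events:
        assignments[route(key, worker_count)].append((key, value))
    return dict(assignments)


# Hypothetical sample stream of (key, value) pairs.
events = [("user_a", 1), ("user_b", 5), ("user_a", 2), ("user_c", 7), ("user_b", 3)]
partitions = partition(events, worker_count=3)

# Each worker keeps a running total only for the keys it owns.
totals = {}
for worker, items in partitions.items():
    state = {}
    for key, value in items:
        state[key] = state.get(key, 0) + value
    totals[worker] = state

print(totals)
```

Because the routing depends only on the key, the per-worker totals can be merged without any key appearing twice, which is the property that makes stateful steps safe to run in parallel.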

Discover how to set up data flows using the Bytewax Python API, connect to data sources like Kafka, and deploy your pipelines on various platforms, from Kubernetes clusters to Raspberry Pi.

We’ll showcase practical examples and offer tips on optimizing your data processing workflows. This episode also sets the stage for our next session, where we’ll focus on Bytewax operators, including stateful, stateless, and windowing techniques.

Make sure to check out the Bytewax Cheatsheet, which is a practical guide for understanding Bytewax and improving your data processing workflows.

P.S. If you missed the first two episodes, catch up with the links below to watch and learn everything you need!

🐝 Bytetalks ep.1: Real-Time Analytics with Bytewax & ClickHouse

🐝 Bytetalks ep.2: Real-Time Embeddings with Azure AI & Bytewax

Laura Funderburk

Senior Developer Advocate

Laura Funderburk holds a B.Sc. in Mathematics from Simon Fraser University and has extensive work experience as a data scientist. She is passionate about leveraging open source for MLOps and DataOps and is dedicated to outreach and education.

Zander Matheson

CEO, Founder

Zander is a seasoned data engineer who founded and currently leads Bytewax. He has worked in the data space since 2014 at Heroku, GitHub, and an NLP startup. Before that, he attended business school at UT Austin and HEC Paris.
