Cooking up real-time pizza order analytics

By Oli Makhasoeva & Anastasia Khomyakova

Join us on March 19th for a collaborative 🌎 virtual workshop with StarTree, Streamlit, and support from AICamp, where we'll guide you through creating a real-time pizza 🍕 analytics dashboard in just two hours.

This 🆓 event not only offers learning opportunities but also the chance to win some amazing swag 🎁 for active participants.

❗️ Please keep in mind, we'll be handling all communications and answering your questions through our Slack channel - #workshop-room, so it's the perfect time to join if you haven't already.

Let's take a closer look at what the workshop has in store for you.

Takeaway

You will be cooking up a real-time analytics dashboard for the operators of All About That Dough (AATD), an online pizza delivery service that specializes in pizzas with Indian toppings. They will use the dashboard to get a live view of the number of orders and the revenue of their business, and to keep an eye on the most popular products.

Workshop prerequisites

  • The workshop can be completed on Windows, macOS, or Linux. The host will be using Python 3.11 on macOS.
  • Join our Slack workspace, where we will have a dedicated channel for the workshop: #workshop-room. Please make sure you have joined before the event. The channel will remain available after the workshop, too.
  • To run the app, you'll need Docker.
  • All code is here!


Learning Objectives

By participating in this workshop, you'll learn how to:

  • Build a streaming pipeline to join data from multiple sources using 🐝 Bytewax.
  • Analyze and aggregate the data to return live metrics using 🍷 Apache Pinot.
  • Build a real-time dashboard to monitor the metrics with 🎈 Streamlit.

Architecture diagram

Below is a diagram showing how all the components work together:

Architecture diagram

We will be focusing on the parts within the dashed line rectangle.

The agenda of the workshop

1. Introduction (10 min)

We start by briefly introducing the instructors and the technologies we will use.

Bytewax ➡️ is an open-source Python framework that simplifies building apps for streaming data. It's developed for real-time processing and supports aggregation, windowing and splitting/joining streams, making it easier to handle big projects.

Pinot ➡️ is a real-time distributed OLAP datastore, designed to answer OLAP queries with low latency. With user-facing applications querying Pinot directly, it can serve hundreds of thousands of concurrent queries per second.

Streamlit ➡️ is a Python library that transforms data scripts into interactive web apps quickly and with minimal coding. It's ideal for creating data visualizations and dashboards, streamlining the development process for data scientists and developers alike.

1.1 Orders

Creating the orders and products is out of the scope of this workshop. Here is what we need to know:

  • The orders service generates and publishes orders to a Kafka topic.
  • Products are published to a separate Apache Kafka topic.
  • The stream of orders is unbounded. Each order is made up of the given products and is placed by one of the users.

2. How we read and enrich data with Bytewax (30 min)

Order items are initially nested inside orders. We use Bytewax to extract them and join them with product details before publishing them to the enriched-order-items Kafka topic.
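To make the extract-and-join step more concrete, here is a minimal sketch in plain Python. It is not the workshop's Bytewax dataflow: in the workshop this logic runs inside Bytewax steps that read from and write to Kafka. The message shapes and field names (`id`, `items`, `productId`, `quantity`) and the `PRODUCTS` lookup are assumptions made up for illustration.

```python
from typing import Iterable, Iterator

# Hypothetical product catalogue, keyed by product id.
# In the workshop, products arrive on their own Kafka topic.
PRODUCTS = {
    "p1": {"name": "Paneer Tikka", "category": "veg pizzas", "price": 12.5},
    "p2": {"name": "Chicken Tikka", "category": "non-veg pizzas", "price": 14.0},
}


def extract_items(order: dict) -> Iterator[dict]:
    """Flatten one order into one record per order item."""
    for item in order["items"]:
        yield {
            "orderId": order["id"],
            "productId": item["productId"],
            "quantity": item["quantity"],
        }


def enrich(item: dict, products: dict) -> dict:
    """Attach product details to an order item (None if the product is unknown)."""
    return {**item, "product": products.get(item["productId"])}


def enrich_orders(orders: Iterable[dict], products: dict) -> Iterator[dict]:
    """Extract every item from every order and join it with its product."""
    for order in orders:
        for item in extract_items(order):
            yield enrich(item, products)


if __name__ == "__main__":
    sample_orders = [
        {"id": "o1", "items": [{"productId": "p1", "quantity": 2},
                               {"productId": "p2", "quantity": 1}]},
    ]
    for record in enrich_orders(sample_orders, PRODUCTS):
        print(record)
```

Each enriched record corresponds to one message that would be published to the enriched-order-items topic.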

3. How we analyze the data with Apache Pinot (30 min)

We store the data from the orders and enriched-order-items topics in Apache Pinot. Each topic has its own table and associated schema. We count the number of orders per minute and the revenue per minute, as well as find the most popular items and categories.

Compelling metrics to calculate:
1. Orders
     - Count of orders per minute
     - Revenue per minute

2. Enriched-Order-Items
     - Most popular items
     - Most popular categories
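To illustrate what these metrics compute, here is a small sketch in plain Python. In the workshop the aggregations run as queries inside Apache Pinot, not in application code, so treat this only as a description of the expected results. The field names (`ts`, `price`, `name`, `category`, `quantity`) are assumptions for illustration.

```python
from collections import Counter, defaultdict
from datetime import datetime


def per_minute(orders):
    """Group orders by minute and return order count and revenue per bucket."""
    buckets = defaultdict(lambda: {"orders": 0, "revenue": 0.0})
    for order in orders:
        # Truncate the ISO timestamp to the minute, e.g. "2024-03-19 18:00".
        minute = datetime.fromisoformat(order["ts"]).strftime("%Y-%m-%d %H:%M")
        buckets[minute]["orders"] += 1
        buckets[minute]["revenue"] += order["price"]
    return dict(buckets)


def most_popular(items, key, n=3):
    """Rank enriched order items by total quantity, grouped by `key`."""
    counts = Counter()
    for item in items:
        counts[item[key]] += item["quantity"]
    return counts.most_common(n)


if __name__ == "__main__":
    orders = [
        {"ts": "2024-03-19T18:00:05", "price": 20.0},
        {"ts": "2024-03-19T18:00:40", "price": 15.5},
        {"ts": "2024-03-19T18:01:10", "price": 9.0},
    ]
    items = [
        {"name": "Paneer Tikka", "category": "veg pizzas", "quantity": 2},
        {"name": "Chicken Tikka", "category": "non-veg pizzas", "quantity": 1},
        {"name": "Paneer Tikka", "category": "veg pizzas", "quantity": 1},
    ]
    print(per_minute(orders))
    print(most_popular(items, "name"))
    print(most_popular(items, "category"))
```

The same `most_popular` helper covers both items and categories, since the only difference is the grouping key.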


4. How we visualize data with Streamlit (30 min)

The dashboard code is written with Streamlit, a Python-based framework for building interactive web applications. We query Pinot using its Python client and render the results with pandas DataFrames and Plotly charts. Dive into the world of Streamlit:

  • What is Streamlit? 🎈

    • The basics
    • Dynamic data apps in just a few lines of code
    • Share data insights across teams and with the world
    • Seamlessly composable – compatible with your fave Python library or GenAI stack
    • How can you get involved with Streamlit?
  • Walk through the Streamlit app code for this demo

  • Streamlit’s execution model and how to modify it

  • Bonus! What's next?

    • editing order info and writing it back with st.data_editor
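As a rough idea of what a dashboard page along these lines could look like, here is a minimal sketch. It is not the workshop's app code. The table and column names (`orders`, `ts`, `price`), the broker address, and the query itself are assumptions, and the chart uses Streamlit's built-in `st.line_chart` rather than Plotly to keep the example short. The UI dependencies are imported inside `render_dashboard` so that the small `summarize` helper can be tested on its own.

```python
# Assumed query: orders and revenue per minute over the last hour.
ORDERS_PER_MINUTE_SQL = """
SELECT ToDateTime(DATETRUNC('MINUTE', ts), 'yyyy-MM-dd HH:mm') AS dateMin,
       count(*) AS orders,
       sum(price) AS revenue
FROM orders
WHERE ts > ago('PT1H')
GROUP BY dateMin
ORDER BY dateMin DESC
LIMIT 60
"""


def summarize(rows):
    """Summarize rows of (minute, orders, revenue), sorted newest first."""
    if not rows:
        return {"orders": 0, "revenue": 0.0, "orders_delta": 0}
    _, orders, revenue = rows[0]
    previous_orders = rows[1][1] if len(rows) > 1 else orders
    return {
        "orders": orders,
        "revenue": revenue,
        "orders_delta": orders - previous_orders,
    }


def render_dashboard(host="localhost", port=8099):
    """Query Pinot and draw the page; host and port are assumed defaults."""
    # Imported here so summarize() can be tested without these packages.
    import pandas as pd
    import streamlit as st
    from pinotdb import connect

    conn = connect(host=host, port=port, path="/query/sql", scheme="http")
    cursor = conn.cursor()
    cursor.execute(ORDERS_PER_MINUTE_SQL)
    rows = [tuple(row) for row in cursor.fetchall()]

    totals = summarize(rows)
    st.title("All About That Dough: live orders")
    left, right = st.columns(2)
    left.metric("Orders (latest minute)", totals["orders"],
                delta=totals["orders_delta"])
    right.metric("Revenue (latest minute)", f"{totals['revenue']:.2f}")

    df = pd.DataFrame(rows, columns=["dateMin", "orders", "revenue"])
    st.line_chart(df.set_index("dateMin")[["orders", "revenue"]])
```

To try it, save the code as a script (for example `app.py`), add a call to `render_dashboard()` at the end, and start it with `streamlit run app.py` while a Pinot broker is reachable at the assumed address.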

5. Demo (10 min)

We demo the app and recap the main points step by step.

6. Assessment and Q&A

If time allows, we will expand on how to use the techniques from this workshop to visualize your data and build your own pipelines that can process your streaming data in real-time in a performant way.

Instructors info


Zander Matheson, Founder & CEO at Bytewax. Zander is a seasoned data engineer who founded and currently leads Bytewax. He has worked in the data space since 2014 at Heroku, GitHub, and an NLP startup. Before that, he attended business school at UT Austin and HEC Paris in Europe.


Viktor Gamov is the Head of Developer Advocacy at StarTree, a pioneering company in real-time analytics with Apache Pinot. Viktor is known for his insightful presentations at top industry events like JavaOne, Devoxx, Kafka Summit, and QCon. His expertise spans distributed systems, real-time data streaming, JVM, and DevOps.


Caroline Frasca works on developer relations and partnerships for Streamlit's open-source Python library. Previously, she led customer success for Streamlit pre-acquisition and worked as a Solution Architect at Klaviyo. Caroline is also a UNC Chapel Hill Tar Heel, has two rambunctious cats, and loves to crochet.

Your input is valuable to us! ⭐️

Have suggestions for making this document better? We'd love to chat with you in the #questions-answered channel on Slack!

The full video can be found here ➡️

Oli Makhasoeva

Director of Developer Relations and Operations
Oli is a passionate technologist with a background in engineering, consulting, and community building. On a break from creating content, she loves to network online & in person at meetups, conferences, and forums.

Anastasia Khomyakova

Author