Bytetalks Ep. 2: Real-Time Embeddings with Azure AI & Bytewax

By Zander Matheson & Laura Funderburk

Welcome back to another Bytewax session! In this video, Zander and Laura team up to explore the powerful integration between Azure AI Search and Bytewax, specifically focusing on real-time indexing pipelines. (Blog: https://bytewax.io/blog/introducing-the-azure-ai-search-bytewax-sink)

💡 What You’ll Learn:

  • Overview of Azure AI Search: Discover how Azure AI Search (formerly Azure Cognitive Search) works as a vector store, enabling advanced retrieval-augmented generation (RAG) techniques.
  • Custom Sink for Real-Time Indexing: Learn about the custom sink Bytewax has developed, which lets you populate Azure AI Search with vectors in real time. Laura breaks down the process, from setting up your Azure AI Search instance to defining schemas and embedding models.
  • Step-by-Step Demo: Watch a hands-on demonstration where Laura shows how to transform raw text data into vectors, index them in Azure AI Search, and keep your data pipelines up to date with constantly changing information like news articles.
  • Practical Use Cases: Understand the practical applications of this setup, whether you're dealing with news articles, product reviews, or social media posts, and see how easily you can adapt the workflow to different data sources.
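
To make the "raw text to vectors" step concrete, here is a minimal sketch of the transformation that would run before records reach the sink. It is illustrative only and is not the code from the demo: `toy_embed` is a hypothetical stand-in for a real embedding model (it hashes words into a small vector), and the field names `id`, `content`, and `vector` are an assumed index schema rather than the one used in the video.

```python
import hashlib
import math
from typing import Dict, List

# Hypothetical dimension; real embedding models use far more dimensions.
EMBEDDING_DIM = 8


def toy_embed(text: str, dim: int = EMBEDDING_DIM) -> List[float]:
    """Stand-in for an embedding model: hash each word into a bucket, then normalize."""
    vector = [0.0] * dim
    for token in text.lower().split():
        digest = hashlib.sha256(token.encode("utf-8")).digest()
        vector[digest[0] % dim] += 1.0
    norm = math.sqrt(sum(v * v for v in vector))
    # Empty text yields an all-zero vector, so skip normalization in that case.
    return [v / norm for v in vector] if norm else vector


def to_search_document(article: Dict[str, str]) -> Dict[str, object]:
    """Shape one raw record into a document for an assumed index schema."""
    return {
        "id": article["id"],
        "content": article["text"],
        "vector": toy_embed(article["text"]),
    }


# Made-up sample records standing in for a stream of news articles.
articles = [
    {"id": "news-1", "text": "Central bank holds interest rates steady"},
    {"id": "news-2", "text": "New open source streaming framework released"},
]

documents = [to_search_document(article) for article in articles]
for doc in documents:
    print(doc["id"], len(doc["vector"]))
```

In a streaming pipeline, a function like `to_search_document` would typically be applied to each incoming record in a map step, with the resulting documents handed to the Azure AI Search sink. Swapping the data source (reviews, social posts) only changes how the raw record is read, not the shape of the indexed document.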

📖 Key Topics Covered:

  • Introduction to Azure AI Search and vector stores
  • Embedding models and their role in NLP
  • Setting up and configuring Azure AI Search with Bytewax
  • Building and running real-time indexing pipelines
  • Live demo of the indexing process with Python

🐝 Be sure to like and subscribe for more updates, and feel free to drop any questions in the comments below!

P.S. If you missed last week’s video on enabling real-time analytics with Bytewax and ClickHouse, be sure to check it out.

Zander Matheson

CEO, Founder
Zander is a seasoned data engineer who founded and currently leads Bytewax. He has worked in the data space since 2014 at Heroku, GitHub, and an NLP startup. Before that, he attended business school at UT Austin and HEC Paris.

Laura Funderburk

Senior Developer Advocate
Laura Funderburk holds a B.Sc. in Mathematics from Simon Fraser University and has extensive work experience as a data scientist. She is passionate about leveraging open source for MLOps and DataOps and is dedicated to outreach and education.