Online event

Building Real-Time RAG for Financial Data and News

June 4, 2024

We're teaming up with Microsoft and Unstructured to bring you an incredible workshop, hosted by AICamp.

In this workshop, we will focus on designing RAG pipelines as data flows, including modeling them as Directed Graphs (DG) and integrating real-time analytics.
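To make the directed-graph idea concrete, here is a minimal sketch in plain Python. It is not the workshop's code and does not use the Bytewax API: the step names (`ingest`, `clean`, `chunk`, `index`), the sample headlines, and the `run_pipeline` helper are all hypothetical, and the sketch assumes the graph has no cycles so the steps can be ordered with the standard library's `graphlib`.

```python
from graphlib import TopologicalSorter

# Hypothetical pipeline steps. Each step receives a dict of the results
# produced by the steps it depends on, keyed by step name.
def ingest(_upstream):
    # Stand-in for a real-time source; the headlines are made up.
    return ["ACME shares rise 3%", "  ", "Fed holds rates steady"]

def clean(upstream):
    # Drop empty items and trim whitespace.
    return [text.strip() for text in upstream["ingest"] if text.strip()]

def chunk(upstream):
    # Naive word-level chunking, for illustration only.
    return [text.split() for text in upstream["clean"]]

def index(upstream):
    # Toy "index": map a running id to each chunked document.
    return {i: words for i, words in enumerate(upstream["chunk"])}

STEPS = {"ingest": ingest, "clean": clean, "chunk": chunk, "index": index}

# Edges of the directed graph: each step maps to the steps it depends on.
DEPENDS_ON = {
    "ingest": set(),
    "clean": {"ingest"},
    "chunk": {"clean"},
    "index": {"chunk"},
}

def run_pipeline(steps, depends_on):
    """Run every step after all of its upstream steps have finished."""
    results = {}
    for name in TopologicalSorter(depends_on).static_order():
        upstream = {dep: results[dep] for dep in depends_on[name]}
        results[name] = steps[name](upstream)
    return results

results = run_pipeline(STEPS, DEPENDS_ON)
print(results["index"])
```

Declaring the dependencies separately from the step functions keeps the graph easy to inspect and extend, for example by adding a second branch that handles structured price data alongside the news text.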

Gain the knowledge and skills needed to set up and manage real-time Retrieval Augmented Generation (RAG) pipelines using both structured and unstructured financial data.

One of the challenges of financial data is how quickly it becomes irrelevant, both in terms of market prices and the events around them.
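One simple way to act on this staleness problem is to drop documents older than a freshness window before they reach the model. The sketch below is an illustration under stated assumptions, not the workshop's approach: the `Document` class, the `fresh_only` helper, the sample headlines, and the 15-minute window are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

@dataclass
class Document:
    """Hypothetical retrieved item with its publication time."""
    text: str
    published_at: datetime

def fresh_only(docs, now, max_age):
    """Keep only documents published within max_age of now."""
    return [doc for doc in docs if now - doc.published_at <= max_age]

# Fixed reference time so the example is reproducible.
now = datetime(2024, 6, 4, 17, 0, tzinfo=timezone.utc)
docs = [
    Document("ACME beats earnings estimates", now - timedelta(minutes=5)),
    Document("ACME opens flat", now - timedelta(hours=8)),
]

fresh = fresh_only(docs, now, max_age=timedelta(minutes=15))
print([doc.text for doc in fresh])
```

The right window depends on the data: intraday prices may go stale in seconds, while an earnings report can stay relevant for days.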

The event is 🆓 free and expected to last ⏰ 2 hours.

  • Leverage Bytewax for integrating real-time analytics into your data processing workflows.
  • Incorporate Unstructured to process image and web-based information.
  • Incorporate Azure AI services to deploy and manage RAG pipelines.

Below is a diagram showing how all the components work together.

[Diagram: MSFT workshop.png]

ā—ļø More details you can find in out blog.

Guiding you through this experience will be:

Zander Matheson
CEO, Founder at Bytewax
Zander is a seasoned data engineer who founded and currently leads Bytewax. Zander has worked in the data space since 2014 at Heroku, GitHub, and an NLP startup. Before that, he attended business school at UT Austin and HEC Paris.
Laura Funderburk
Senior developer advocate at Bytewax
Laura Funderburk has a B.Sc. in Mathematics from Simon Fraser University and over three years of experience as a professional data scientist. Laura is enthusiastic about using open source for MLOps and DataOps and is passionate about outreach and education. In her day-to-day work, Laura creates written content about building end-to-end scalable LLM pipelines with streaming data.
Shagun Sharma
Data Scientist at Microsoft via TCS
Shagun Sharma is a highly skilled Data Scientist and AI Engineer with over five years of experience in Natural Language Processing (NLP). For almost four years, Shagun has been a pivotal part of the AI Co-Innovation Lab, leading multiple customer engagements around generative AI globally. Shagun's leadership has extended to labs in Redmond, WA; Montevideo, Uruguay; and San Francisco, CA. Shagun has built more than 15 proof-of-concept (POC) projects leveraging Azure technologies, including Azure OpenAI, Azure AI Search, Azure Document Intelligence, LangChain, and other advanced tools.
Nina Lopatina
Staff Developer Relations Engineer at Unstructured
Nina Lopatina is a Staff Developer Relations Engineer at Unstructured, where she helps customers make the most of their unstructured data for retrieval augmented generation (RAG) and other large language model (LLM) use cases. Nina has been primarily working on multilingual language modeling since 2018. In this span, she has worked on language classification and generation. Throughout her career, she has focused on the data that LLMs need to improve performance and reliability.

🙋‍♂️🙋‍♀️ Audience

🛠️ Data/ML/AI engineers;

🔬 Data scientists;

💻 Software engineers interested in data processing;

📊 IT professionals looking to understand and apply RAG.

Jun 4, 2024, 5:00 PM • workshop

Workshop prerequisites

  1. Basic Understanding of Data Structures and Algorithms

    Familiarity with fundamental concepts in data structures and algorithms is required. This includes knowledge of arrays, linked lists, stacks, queues, trees, and basic algorithmic principles.

  2. Proficiency in Python Programming

    Comfortable with writing and debugging Python code. Experience with Python libraries commonly used in data processing and machine learning, such as Pandas, NumPy, and Scikit-learn.

  3. Knowledge of Data Processing and ETL Concepts

    Understanding of data extraction, transformation, and loading (ETL) processes. Experience with handling structured (e.g., CSV, JSON) and unstructured (e.g., text, images) data.

[Optional] To reproduce the solution - not required to participate in the webinar

  1. Setup of Required Azure Services
    • Azure AI Search: Follow the instructions here to create an Azure AI Search service.
    • Azure OpenAI: Set up the Azure OpenAI service and deploy the models 'gpt-4 (0613)' and 'text-embedding-ada-002' by following the instructions here.
  2. Get Unstructured API Key and Install Unstructured Tools
  3. Clone this repository and install dependencies
    • git clone
    • cd real-time-rag-workshop/
    • pip install -r requirements.txt
