🧠 Building Intelligent Assistants with LLMs: An Introduction to Retrieval-Augmented Generation (RAG) with Custom Data
Published: March 2025
Earlier this month, I had the opportunity to speak at the Nottingham DS&AI Meetup about one of my favorite topics: how to build intelligent assistants using Large Language Models (LLMs) combined with custom data through a technique called Retrieval-Augmented Generation (RAG).


Why RAG?
LLMs like GPT-4 are incredibly powerful, but they're limited by the data they were trained on, which can quickly become outdated. That's where RAG comes in. RAG enhances the performance of LLMs by connecting them with an external retrieval system that pulls relevant documents or data before the model generates a response. This not only improves accuracy and factual consistency but also allows the assistant to access real-time or domain-specific information.
Key Concepts Covered
- What is a foundation model?
- The concept of adaptation for domain-specific needs.
- Various adaptation techniques: Fine-tuning, prompt engineering, and parameter-efficient tuning (e.g., LoRA).
- Deep dive into how RAG works: Includes retrieval systems, prompt construction, and similarity measurements using vector embeddings and cosine similarity.
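To make the similarity measurement concrete, here is a minimal sketch of cosine similarity over embedding vectors. The three-dimensional vectors are toy values for illustration only; real embedding models produce vectors with hundreds or thousands of dimensions, but the math is identical.

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy 3-dimensional "embeddings" (real ones have hundreds of dimensions)
query = [0.9, 0.1, 0.0]
doc_close = [0.8, 0.2, 0.1]
doc_far = [0.0, 0.1, 0.9]

print(cosine_similarity(query, doc_close))  # close to 1.0
print(cosine_similarity(query, doc_far))    # close to 0.0
```

In a RAG pipeline, the retrieval step embeds the user's question with the same model used for the documents, then ranks stored chunks by this score.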
Working with Custom Data
I demonstrated how to select and prepare a custom dataset from Kaggle, taking into account practical aspects like:
- Dataset size: In this case, ~81MB.
- Cleaning and preprocessing
- Selecting fields for indexing
- Sampling strategies for development
This part is crucial not just for performance, but also for cost management and storage optimization.
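The preparation steps above can be sketched in a few lines of pandas. The tiny inline CSV here is a hypothetical stand-in for the real ~81 MB Kaggle dataset, and the column names (`title`, `body`, `url`) are illustrative, not taken from the actual data.

```python
import io
import pandas as pd

# Hypothetical sample standing in for a Kaggle CSV (the real dataset is ~81 MB).
raw_csv = io.StringIO(
    "title,body,url\n"
    "Intro to RAG,Retrieval-Augmented Generation combines search with LLMs.,https://example.com/rag\n"
    "LLM basics,,https://example.com/llm\n"
    "Vector stores,Embeddings are indexed for similarity search.,https://example.com/vectors\n"
)
df = pd.read_csv(raw_csv)

# Cleaning: drop rows with a missing body -- empty documents only add noise to the index.
df = df.dropna(subset=["body"])

# Field selection: keep only the columns worth embedding and indexing.
df = df[["title", "body"]]

# Sampling: develop against a fraction of the data to keep embedding costs down.
dev_sample = df.sample(frac=0.5, random_state=42)
print(len(df), len(dev_sample))
```

Sampling with a fixed `random_state` makes development runs reproducible, which matters once you start comparing retrieval quality across iterations.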
Integration with LangChain and OpenAI
To bring everything together, I used LangChain, a framework that simplifies integrating components like:
- LLM providers: (e.g., OpenAI)
- Data sources
- Prompts
- Vector stores and embedding models
We also discussed the tradeoffs between using a framework like LangChain versus building everything from scratch (manual chunking, embedding generation, vector database setup, and so on).
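To give a feel for the "from scratch" side of that tradeoff, here is a minimal sketch of manual chunking plus a dict-backed vector store. The bag-of-words `embed` function is a deliberate toy standing in for a real embedding model, and `InMemoryVectorStore` is an illustrative name, not a LangChain class; a framework replaces all of this plumbing with a few calls.

```python
import math
from collections import Counter

def chunk_text(text: str, chunk_size: int = 200, overlap: int = 40) -> list[str]:
    """Naive fixed-size character chunking with overlap between chunks."""
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, max(len(text) - overlap, 1), step)]

def embed(text: str) -> Counter:
    """Toy bag-of-words 'embedding' -- a stand-in for a real embedding model."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

class InMemoryVectorStore:
    """A list-backed stand-in for a vector database."""
    def __init__(self) -> None:
        self.entries: list[tuple[Counter, str]] = []

    def add(self, text: str) -> None:
        for chunk in chunk_text(text):
            self.entries.append((embed(chunk), chunk))

    def search(self, query: str, k: int = 2) -> list[str]:
        q = embed(query)
        ranked = sorted(self.entries, key=lambda e: cosine(q, e[0]), reverse=True)
        return [chunk for _, chunk in ranked[:k]]

store = InMemoryVectorStore()
store.add("RAG retrieves relevant documents before generation. "
          "LLMs alone can give outdated answers.")
print(store.search("retrieve documents", k=1))
```

Even this toy version shows why the framework route is attractive: chunk sizing, embedding calls, persistence, and ranking each need real engineering at scale.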
Live Demo: Building a RAG-Based Assistant
I walked through a live demo using a Jupyter Notebook, showing step-by-step how to build a simple intelligent assistant using RAG. This included:
- Loading and processing data
- Creating vector embeddings
- Setting up a similarity search
- Constructing prompts dynamically
- Calling the LLM to generate answers grounded in real data
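The last two steps of the demo can be condensed into the sketch below. `build_prompt`, `answer_with_rag`, and `fake_llm` are illustrative names invented here, and the stub stands in for a real chat-completion call (e.g. via the OpenAI client); only the prompt-construction pattern is the point.

```python
def build_prompt(question: str, context_chunks: list[str]) -> str:
    """Dynamic prompt construction: ground the question in retrieved context."""
    context = "\n\n".join(f"- {c}" for c in context_chunks)
    return (
        "Answer the question using ONLY the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )

def fake_llm(prompt: str) -> str:
    """Stand-in for a real LLM call so the sketch runs offline."""
    return "(model answer grounded in the supplied context)"

def answer_with_rag(question: str, retrieved: list[str]) -> str:
    prompt = build_prompt(question, retrieved)
    return fake_llm(prompt)

retrieved = ["RAG retrieves relevant documents before the model generates a response."]
print(answer_with_rag("What does RAG do?", retrieved))
```

Grounding the prompt this way is what makes the assistant's answers traceable back to real data instead of the model's training memory.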
Resources
The full presentation slides and the complete code demo are available on my GitHub:
🔗 github.com/rojoyin/ds-ai-meetup-llm-rag-demo
Feel free to explore the repo, run the code, and reach out if you have any questions or want to collaborate!
