🧠 Building Intelligent Assistants with LLMs: An Introduction to Retrieval-Augmented Generation (RAG) with Custom Data
Published: March 2025
Earlier this month, I had the opportunity to speak at the Nottingham DS&AI Meetup about one of my favorite topics: how to build intelligent assistants using Large Language Models (LLMs) combined with custom data through a technique called Retrieval-Augmented Generation (RAG).


Why RAG?
LLMs like GPT-4 are incredibly powerful, but they're limited by the data they were trained on, which can quickly become outdated. That's where RAG comes in. RAG enhances the performance of LLMs by connecting them with an external retrieval system that pulls relevant documents or data before the model generates a response. This not only improves accuracy and factual consistency but also allows the assistant to access real-time or domain-specific information.
Key Concepts Covered
- What is a foundation model?
- The concept of adaptation for domain-specific needs.
- Various adaptation techniques: Fine-tuning, prompt engineering, and parameter-efficient tuning (e.g., LoRA).
- Deep dive into how RAG works: Includes retrieval systems, prompt construction, and similarity measurements using vector embeddings and cosine similarity.
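To make the similarity measurement concrete, here is a minimal sketch of cosine similarity over embedding vectors. The three-dimensional vectors are toy values for illustration only; real embedding models produce vectors with hundreds or thousands of dimensions, but the math is identical.

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy 3-dimensional "embeddings" (real ones have hundreds of dimensions)
query = [0.9, 0.1, 0.0]
doc_close = [0.8, 0.2, 0.1]
doc_far = [0.0, 0.1, 0.9]

print(cosine_similarity(query, doc_close))  # close to 1.0
print(cosine_similarity(query, doc_far))    # close to 0.0
```

In a RAG pipeline, the retrieval step embeds the user's question with the same model used for the documents, then ranks stored chunks by this score.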
Working with Custom Data
I demonstrated how to select and prepare a custom dataset from Kaggle, taking into account practical aspects like:
- Dataset size: In this case, ~81MB.
- Cleaning and preprocessing
- Selecting fields for indexing
- Sampling strategies for development
This part is crucial not just for performance, but also for cost management and storage optimization.
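The preparation steps above can be sketched in a few lines of pandas. The tiny inline CSV here is a hypothetical stand-in for the real ~81 MB Kaggle dataset, and the column names (`title`, `body`, `url`) are illustrative, not taken from the actual data.

```python
import io
import pandas as pd

# Hypothetical sample standing in for a Kaggle CSV (the real dataset is ~81 MB).
raw_csv = io.StringIO(
    "title,body,url\n"
    "Intro to RAG,Retrieval-Augmented Generation combines search with LLMs.,https://example.com/rag\n"
    "LLM basics,,https://example.com/llm\n"
    "Vector stores,Embeddings are indexed for similarity search.,https://example.com/vectors\n"
)
df = pd.read_csv(raw_csv)

# Cleaning: drop rows with a missing body -- empty documents only add noise to the index.
df = df.dropna(subset=["body"])

# Field selection: keep only the columns worth embedding and indexing.
df = df[["title", "body"]]

# Sampling: develop against a fraction of the data to keep embedding costs down.
dev_sample = df.sample(frac=0.5, random_state=42)
print(len(df), len(dev_sample))
```

Sampling with a fixed `random_state` makes development runs reproducible, which matters once you start comparing retrieval quality across iterations.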
Integration with LangChain and OpenAI
To bring everything together, I used LangChain, a framework that simplifies integrating components like:
- LLM providers: (e.g., OpenAI)
- Data sources
- Prompts
- Vector stores and embedding models
We also discussed the tradeoffs between using a framework like LangChain versus building everything from scratch (manual chunking, embedding generation, vector database setup, and so on).
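To give a feel for the "from scratch" side of that tradeoff, here is a minimal sketch of manual chunking plus a dict-backed vector store. The bag-of-words `embed` function is a deliberate toy standing in for a real embedding model, and `InMemoryVectorStore` is an illustrative name, not a LangChain class; a framework replaces all of this plumbing with a few calls.

```python
import math
from collections import Counter

def chunk_text(text: str, chunk_size: int = 200, overlap: int = 40) -> list[str]:
    """Naive fixed-size character chunking with overlap between chunks."""
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, max(len(text) - overlap, 1), step)]

def embed(text: str) -> Counter:
    """Toy bag-of-words 'embedding' -- a stand-in for a real embedding model."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

class InMemoryVectorStore:
    """A list-backed stand-in for a vector database."""
    def __init__(self) -> None:
        self.entries: list[tuple[Counter, str]] = []

    def add(self, text: str) -> None:
        for chunk in chunk_text(text):
            self.entries.append((embed(chunk), chunk))

    def search(self, query: str, k: int = 2) -> list[str]:
        q = embed(query)
        ranked = sorted(self.entries, key=lambda e: cosine(q, e[0]), reverse=True)
        return [chunk for _, chunk in ranked[:k]]

store = InMemoryVectorStore()
store.add("RAG retrieves relevant documents before generation. "
          "LLMs alone can give outdated answers.")
print(store.search("retrieve documents", k=1))
```

Even this toy version shows why the framework route is attractive: chunk sizing, embedding calls, persistence, and ranking each need real engineering at scale.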
Live Demo: Building a RAG-Based Assistant
I walked through a live demo using a Jupyter Notebook, showing step-by-step how to build a simple intelligent assistant using RAG. This included:
- Loading and processing data
- Creating vector embeddings
- Setting up a similarity search
- Constructing prompts dynamically
- Calling the LLM to generate answers grounded in real data
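The last two steps of the demo can be condensed into the sketch below. `build_prompt`, `answer_with_rag`, and `fake_llm` are illustrative names invented here, and the stub stands in for a real chat-completion call (e.g. via the OpenAI client); only the prompt-construction pattern is the point.

```python
def build_prompt(question: str, context_chunks: list[str]) -> str:
    """Dynamic prompt construction: ground the question in retrieved context."""
    context = "\n\n".join(f"- {c}" for c in context_chunks)
    return (
        "Answer the question using ONLY the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )

def fake_llm(prompt: str) -> str:
    """Stand-in for a real LLM call so the sketch runs offline."""
    return "(model answer grounded in the supplied context)"

def answer_with_rag(question: str, retrieved: list[str]) -> str:
    prompt = build_prompt(question, retrieved)
    return fake_llm(prompt)

retrieved = ["RAG retrieves relevant documents before the model generates a response."]
print(answer_with_rag("What does RAG do?", retrieved))
```

Grounding the prompt this way is what makes the assistant's answers traceable back to real data instead of the model's training memory.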
Resources
The full presentation slides and the complete code demo are available on my GitHub:
🔗 github.com/rojoyin/ds-ai-meetup-llm-rag-demo
Feel free to explore the repo, run the code, and reach out if you have any questions or want to collaborate!
