Building an Efficient RAG Pipeline using Open Source LLMs
08-24, 13:45–15:15 (Asia/Kuala_Lumpur), Classroom

Large Language Models (LLMs) are everywhere, driving the advancement of AI today. For enterprises and businesses, integrating LLMs with custom data sources is crucial for providing more contextual understanding and reducing hallucinations. In this talk, Tarun will focus on building an effective RAG pipeline for production using Open Source LLMs. In simple terms, Retrieval Augmented Generation (RAG) involves retrieving relevant documents as context for user queries and leveraging LLMs to generate more accurate responses.
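The retrieve-then-generate loop described above can be sketched in a few lines. A minimal sketch, assuming a toy keyword-overlap scorer standing in for a real embedding-based retriever; the assembled prompt would then be sent to an open-source LLM:

```python
# Minimal RAG sketch: retrieve relevant documents, then augment the prompt.
# The keyword-overlap scorer is a toy stand-in for embedding similarity search.

def retrieve(query: str, documents: list[str], k: int = 2) -> list[str]:
    """Rank documents by naive keyword overlap with the query."""
    q_words = set(query.lower().split())
    scored = sorted(
        documents,
        key=lambda d: len(q_words & set(d.lower().split())),
        reverse=True,
    )
    return scored[:k]

def build_prompt(query: str, context_docs: list[str]) -> str:
    """Assemble an augmented prompt: retrieved context plus the user question."""
    context = "\n".join(f"- {doc}" for doc in context_docs)
    return (
        f"Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}\nAnswer:"
    )

docs = [
    "RAG retrieves documents to ground LLM answers.",
    "Vector databases store embeddings for similarity search.",
    "Streamlit builds simple web apps in Python.",
]
query = "How does RAG ground answers?"
prompt = build_prompt(query, retrieve(query, docs))
# `prompt` would now be passed to an open-source LLM for generation.
```

In a production pipeline, the retriever would query a vector database over embedded documents rather than matching keywords.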

This is a fully hands-on workshop where participants will construct an entire RAG pipeline using open-source LLMs, vector databases, and embeddings.


Problem Statement

  • Closed-source models like GPT, Claude, and Gemini demonstrate significant potential as LLMs, but enterprises and startups with sensitive data hesitate to rely on them due to data privacy and security concerns.
  • While numerous solutions and resources on the internet utilize closed-source models like GPT and Gemini to construct RAG pipelines, there is limited information available on building effective RAG pipelines using Open Source LLMs.
  • When using Open Source LLMs, it is important to understand which prompt template to use to get responses in a specific format. While those with a basic grasp of Transformers can adjust parameters to enhance results, this approach may not be suitable for everyone.
  • Basic RAG solutions often struggle with retrieval quality and tend to produce hallucinations.
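To illustrate the prompt-template point above: each open-source model family expects its own template, and sending a bare question often degrades output quality. A small sketch using the Llama 2 chat format as one example:

```python
# Open-source models each expect a specific prompt template. As one example,
# the Llama 2 chat models wrap the instruction in [INST] tags with an
# optional <<SYS>> system block.

def llama2_prompt(system: str, user: str) -> str:
    """Format a single-turn prompt in the Llama 2 chat template."""
    return f"<s>[INST] <<SYS>>\n{system}\n<</SYS>>\n\n{user} [/INST]"

prompt = llama2_prompt(
    "You are a helpful assistant. Answer only from the given context.",
    "Summarise the attached document in two sentences.",
)
print(prompt)
```

Other model families (Mistral, Zephyr, and so on) use different templates, which is exactly why understanding the expected format matters before building a pipeline around a given model.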

Session Outline

In this hands-on workshop, participants will construct an entire RAG pipeline using open-source LLMs, vector databases, and embeddings. Additionally, the speaker will demonstrate two advanced techniques for improving results from LLMs. Below is the outline of the workshop:

  • Issues with Large Language Models
  • Understanding the need for RAG and Open Source LLMs
  • Prompt Engineering Basics: Zero-Shot and Few-Shot
  • Tour of Open Source LLM parameters: temperature, top_p, and so on
  • Building a basic RAG pipeline using Open Source LLMs, embeddings, and vector stores
  • Advanced Technique 1: Re-ranking with Cross-Encoder sentence transformers
  • Advanced Technique 2: Fine-tuning embeddings for RAG and hybrid search
  • Building a Streamlit app for your RAG application
  • Deploying it with secrets on share.streamlit.io

Tarun Jain is a Data Scientist at AI Planet, a Belgium-based AI startup. He is recognised as a Google Developer Expert in AI/ML and is also part of GSoC'24 at RedHenLab.