December 9, 2024·8 min

Complete Guide: Building Persona-Aware RAG Systems with Pydantic AI Agents & Tools - Custom Retrieval & Generation

Comprehensive tutorial for implementing persona-driven Retrieval-Augmented Generation systems using Pydantic AI's Agent and Tools APIs, integrating custom user personas with dynamic document retrieval for highly tailored AI responses.

Daniel Kliewer

Author, Sovereign AI

PydanticRAGLLM AgentsPersona GenerationAI ToolingPythonOpenAI APIVector DatabasesRetrieval-Augmented GenerationTutorialPydantic AIAgent DevelopmentCustom AI

From the Book

This is from Sovereign AI: Building Local-First Intelligent Systems.

Get the Book — $88

Complete Guide: Building Persona-Aware RAG Systems with Pydantic AI Agents & Tools - Custom Retrieval & Generation

Below is a comprehensive, step-by-step guide designed for developers looking to combine persona-driven data modeling with Retrieval-Augmented Generation (RAG) using Pydantic AI’s Agent and Tools APIs. We’ll integrate concepts from the PersonaGen07 repository, the RAG example from Pydantic AI, and the Agent and Tools APIs into a cohesive system. By the end, you’ll have a working setup that allows you to define personas, retrieve relevant documents, and produce AI-generated responses customized to each persona’s style and preferences.

1. Introduction

Modern generative AI systems can be greatly enhanced by incorporating external data (for accuracy and recency) and persona-driven customization (for personalization and relevance to specific user profiles). Retrieval-Augmented Generation (RAG) ensures that the model’s output is grounded in reliable data sources, while persona-based logic tailors responses to different user archetypes, such as a student, a marketing professional, or a tech enthusiast.

PersonaGen07 provides a structured way to define personas as JSON files, capturing attributes like communication style, domain interests, and preferred tone. Pydantic AI offers a typed, schema-driven approach to working with AI models, as well as the Agent and Tools APIs that streamline interaction with external data and services. Together, these tools create a system that:

Retrieves context-relevant information dynamically.
Adapts responses based on predefined persona traits.
Maintains a clean, schema-based code structure for reliability and maintainability.

2. Prerequisites

Before we begin, ensure you have the following:

Python 3.9+ recommended.
Access to the OpenAI API or another supported LLM provider (ensure you have an API key).
Pydantic AI library installed.
PersonaGen07 repository cloned locally.

Required Python Packages

pydantic[ai] for Pydantic AI.
openai for interacting with the OpenAI API.
requests if needed for advanced retrieval scenarios.
json (standard library) for handling persona files.

Terminal Setup Commands

bash
1# Clone PersonaGen07 repository
2git clone https://github.com/kliewerdaniel/PersonaGen07.git
3
4# Navigate to your project directory
5cd your-project-directory
6
7# (Optional) Create a virtual environment
8python3 -m venv venv
9source venv/bin/activate  # On Windows: venv\Scripts\activate
10
11# Install dependencies
12pip install pydantic[ai] openai

You’ll also need to set your OPENAI_API_KEY as an environment variable or directly within your code. For example:

bash
1export OPENAI_API_KEY="your_openai_api_key_here"

3. Setup

Cloning PersonaGen07

The PersonaGen07 repository provides a template for persona definitions. We’ll use its JSON format to structure our persona data.

bash
1git clone https://github.com/kliewerdaniel/PersonaGen07.git personas

This command clones the repo into a personas directory. Inside, you’ll find JSON schemas and example persona definitions. You may create your own persona files based on these examples.

Installing Dependencies

We’ve already installed pydantic[ai] and openai. If you plan to use other retrieval methods or vector databases, install them here:

bash
1# Example for Pinecone or FAISS
2pip install pinecone-client

Setting Up API Keys

Make sure your environment is ready:

bash
1export OPENAI_API_KEY="your_openai_api_key_here"

If you use another LLM provider, refer to its documentation on key management.

4. Code Implementation

Step 1: Define and Load Personas

First, create a persona JSON file. For example, personas/student.json:

json
1{
2  "name": "Student",
3  "attributes": {
4    "communication_style": "friendly and explanatory",
5    "interests": ["technology", "mathematics", "science"],
6    "formality": "casual",
7    "reading_level": "beginner"
8  }
9}

This file defines a “Student” persona who prefers casual, friendly explanations. You can create multiple personas—e.g., personas/marketing_expert.json with a more formal, sales-oriented style.

Persona Loading Code (persona_manager.py):

python
1import json
2from pathlib import Path
3
4class PersonaManager:
5    def __init__(self, persona_path: str):
6        persona_file = Path(persona_path)
7        if not persona_file.exists():
8            raise FileNotFoundError(f"Persona file not found: {persona_path}")
9        with persona_file.open('r') as f:
10            self.persona = json.load(f)
11        self.name = self.persona.get("name", "Default")
12        self.attributes = self.persona.get("attributes", {})
13
14    def get_prompt_instructions(self) -> str:
15        style = self.attributes.get("communication_style", "neutral")
16        formality = self.attributes.get("formality", "neutral")
17        return f"Please respond in a {formality}, {style} manner."

This simple class loads persona data and provides a method to generate persona-specific prompt instructions.

Step 2: Set Up a Retriever Function with the Tools API

Pydantic AI’s Tools API allows you to define tools (functions) that can be called by the AI agent to perform certain tasks, such as retrieving documents. For simplicity, let’s implement a dummy retrieval tool. Later, you can integrate a vector database or other data sources.

Tools Setup (tools.py):

python
1from pydantic_ai import tool
2from typing import List
3
4@tool(name="retrieve_documents", description="Retrieve documents based on a query")
5def retrieve_documents(query: str) -> List[str]:
6    # In a production scenario, implement a semantic search here.
7    # For now, we return static documents filtered by a keyword match.
8    docs = [
9        "Document: RAG integrates retrieval with generation.",
10        "Document: Personas help tailor AI responses.",
11        "Document: Using Agents and Tools can streamline RAG pipelines."
12    ]
13    return [doc for doc in docs if query.lower() in doc.lower()]

Step 3: Use the Pydantic AI Agent API for Retrieval and Generation

The Agent API allows you to define an AI agent that can use tools and produce answers. The agent can call retrieve_documents to get content and then incorporate persona instructions into the prompt.

Agent Setup (agent.py):

python
1import os
2from pydantic_ai import Agent, AISettings
3from persona_manager import PersonaManager
4from tools import retrieve_documents
5
6OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")
7
8# Initialize Persona
9persona_manager = PersonaManager("personas/student.json")
10
11# Create an agent with the RAG approach
12# The agent can call the 'retrieve_documents' tool to gather context.
13ai_settings = AISettings(
14    model="gpt-4", 
15    api_key=OPENAI_API_KEY,
16    temperature=0.7
17)
18
19agent = Agent(
20    settings=ai_settings,
21    tools=[retrieve_documents]
22)
23
24def persona_aware_query(query: str) -> str:
25    # Fetch persona-specific instructions
26    persona_instructions = persona_manager.get_prompt_instructions()
27    # Prompt structure includes instructions, user query, and a command to retrieve documents
28    prompt = (
29        f"{persona_instructions}\n"
30        f"The user asked: {query}\n"
31        f"Use the 'retrieve_documents' tool if needed. Then answer the user.\n"
32    )
33    # Agent reasoning: The agent can decide to call retrieve_documents(query) before answering.
34    return agent.run(prompt, max_tokens=200)

How This Works:

We define a prompt that instructs the agent on how to respond.
The agent can invoke the retrieve_documents tool to ground its answer.
The persona instructions set the communication style.
The agent’s final answer will incorporate retrieved documents and persona-based style.

Step 4: Customizing the Agent’s Behavior Based on Persona Attributes

You might want to influence not just the style but also the retrieval strategy. For instance, if a persona is interested in “technology,” you could filter documents with a tech focus. Modify the retrieval tool or the prompt generation logic to leverage persona attributes:

python
1def persona_aware_query(query: str) -> str:
2    persona_instructions = persona_manager.get_prompt_instructions()
3    interests = persona_manager.attributes.get("interests", [])
4    # Incorporate interests into the prompt to guide retrieval
5    interest_tags = ", ".join(interests) if interests else "general knowledge"
6
7    prompt = (
8        f"{persona_instructions}\n"
9        f"Persona interests: {interest_tags}\n"
10        f"The user asked: {query}\n"
11        f"Carefully select documents that match persona interests.\n"
12        f"Use the 'retrieve_documents' tool if needed. Then answer the user.\n"
13    )
14
15    return agent.run(prompt, max_tokens=200)

This improved prompt nudges the agent to consider persona interests during the retrieval step.

5. Testing the Integration

Testing with Different Personas

Switch Personas:
Update the persona file in agent.py:

python
1# For a different persona
2persona_manager = PersonaManager("personas/marketing_expert.json")

Run a Query:

bash
1python agent.py

If your agent.py includes a test block:

python
1if __name__ == "__main__":
2    response = persona_aware_query("Explain what RAG is and why it's useful.")
3    print(response)

You should see a response that:

Incorporates retrieved content from retrieve_documents.
Matches the persona’s communication style (e.g., friendly, casual).

Verifying Personalization

Try switching from a “Student” persona to a “Marketing Expert” persona and compare the responses. The “Student” persona might yield more explanatory, beginner-friendly language, while the “Marketing Expert” persona might use more persuasive or marketing-oriented phrasing.

6. Advanced Features

Semantic Search / Vector Databases

For better retrieval results, integrate a vector database like Pinecone. After setting up an index, modify the retrieve_documents tool to query the index and return semantically matched documents:

python
1@tool(name="retrieve_documents", description="Retrieve documents based on a query")
2def retrieve_documents(query: str) -> List[str]:
3    # Example with Pinecone or FAISS
4    # 1. Embed the query
5    # 2. Search the index
6    # 3. Return the top matches
7    # Return doc strings.
8    pass

Extending Persona Attributes

Personas could include more attributes, like a preferred reading level or specific domains of interest. You might adjust the prompt to direct the agent to simplify language or focus on certain topics based on these attributes.

Fine-Tuning and Prompt Engineering

Experiment with different temperature settings for creativity.
Add chain-of-thought or reasoning steps in the prompt to improve the agent’s performance.

7. Conclusion

By integrating PersonaGen07, Pydantic AI’s RAG example, and the Agent and Tools APIs, we’ve built a flexible system that:

Retrieves Relevant Data: Ensures that responses are enriched with external, current information.
Adapts to Personas: Adjusts tone, complexity, and style based on user or application-defined personas.
Is Maintainable and Extensible: Uses Pydantic’s structured approach and JSON-based personas for easy maintenance and scaling.

Next Steps:

Scale Retrieval: Incorporate more advanced retrieval techniques, multiple databases, or third-party APIs.
Persona Refinement: Add new persona attributes and refine how they influence retrieval and generation.
Fine-Tune Models: Consider fine-tuning or using custom models for domain-specific applications.

With this framework in place, you’re well-positioned to build personalized, dynamic, and contextually accurate AI systems that cater to diverse user needs.

Sovereign AI: Building Local-First Intelligent Systems

by Daniel Kliewer · Paperback · 72 pages

The hands-on guide to building AI that runs on your hardware, keeps your data private, and eliminates cloud dependence. Working code included.

Buy on Amazon — $88 See Inside

← Back to all posts