Source Code Analysis · Stanford NLP
2026-04-17 | Daily Technical Deep Dive
Think of DSPy as a higher-level language for AI programming, like the shift from assembly to C
DSPy shifts focus from tinkering with prompt strings to programming with structured modules
Each concept addresses a specific aspect of building robust AI systems
Architecture showing how signatures define behavior, modules implement strategies, and optimizers compile programs
Signatures separate what the AI should do from how it does it
class MathQuestion(dspy.Signature):
    """Solve mathematical word problems."""
    question: str = dspy.InputField(desc="The mathematical problem to solve")
    answer: float = dspy.OutputField(desc="The numerical answer")
    reasoning: str = dspy.OutputField(desc="Step-by-step reasoning")
Signature defines the interface and behavior for a mathematical problem solver
DSPy automatically converts signatures into effective prompts
# DSPy automatically converts this signature:
class SentimentAnalysis(dspy.Signature):
    text: str = dspy.InputField()
    sentiment: str = dspy.OutputField()
# Into a prompt along these lines (illustrative; the exact wording depends on the adapter and optimizer):
"""Analyze the sentiment of the following text:
Text: {text}
Sentiment (positive/negative/neutral):"""
DSPy handles the low-level prompt engineering automatically
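To make the signature-to-prompt idea concrete, here is a minimal sketch in plain Python. It is not DSPy's adapter code: `Field`, `parse_signature`, and `render_prompt` are hypothetical names, and the prompt layout is an assumption chosen to resemble the example above. It only handles simple field lists such as `"question -> answer: float"`, not nested types that contain commas.

```python
from dataclasses import dataclass


@dataclass
class Field:
    name: str
    type_name: str = "str"  # fields without an annotation default to str


def parse_signature(spec: str) -> tuple[list[Field], list[Field]]:
    """Split 'a, b -> c: float' into (input fields, output fields)."""
    if "->" not in spec:
        raise ValueError("signature must contain '->'")
    left, right = spec.split("->", 1)

    def parse_side(side: str) -> list[Field]:
        fields = []
        for part in side.split(","):
            part = part.strip()
            if not part:
                continue
            name, _, type_name = part.partition(":")
            fields.append(Field(name.strip(), type_name.strip() or "str"))
        return fields

    return parse_side(left), parse_side(right)


def render_prompt(spec: str, instructions: str, **inputs: str) -> str:
    """Render instructions, filled-in inputs, then one empty slot per output."""
    in_fields, out_fields = parse_signature(spec)
    lines = [instructions, ""]
    for field in in_fields:
        lines.append(f"{field.name.capitalize()}: {inputs[field.name]}")
    for field in out_fields:
        lines.append(f"{field.name.capitalize()} ({field.type_name}):")
    return "\n".join(lines)


print(render_prompt("text -> sentiment",
                    "Analyze the sentiment of the following text:",
                    text="Great!"))
```

The point of the sketch is the separation of concerns: the signature string carries the field names and types, while the rendering function owns the wording, so the wording can change without touching the program.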
Each module type provides a different strategy for LM interaction
# Simple prediction module
predict = dspy.Predict("question -> answer")
result = predict(question="What is 2+2?")
print(result.answer) # e.g. "4"
# Chain of Thought module
cot = dspy.ChainOfThought("question -> answer: float")
math_result = cot(question="What's the probability of rolling a 7 with two dice?")
print(math_result.reasoning) # Step-by-step reasoning text
print(math_result.answer) # e.g. 0.1667 (6 of 36 outcomes)
Modules provide higher-level abstractions over direct LM calls
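The dice question above has an exact answer that is easy to verify without a language model, which is useful when checking a module's output. The sketch below enumerates all outcomes of two fair dice; `probability_of_sum` is a hypothetical helper, not part of DSPy.

```python
from fractions import Fraction
from itertools import product


def probability_of_sum(target: int, sides: int = 6) -> Fraction:
    """Exact probability that two fair dice sum to `target`."""
    outcomes = list(product(range(1, sides + 1), repeat=2))
    favorable = sum(1 for a, b in outcomes if a + b == target)
    return Fraction(favorable, len(outcomes))


p = probability_of_sum(7)
print(p, round(float(p), 6))  # 1/6 0.166667
```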
Chain of Thought helps LMs break down complex problems
class MathProblemSolver(dspy.Module):
    def __init__(self):
        super().__init__()
        self.solve = dspy.ChainOfThought("question -> answer: float")
    def forward(self, question):
        result = self.solve(question=question)
        return dspy.Prediction(
            answer=result.answer,
            reasoning=result.reasoning
        )
Chain of Thought modules automatically include reasoning in their output
ReAct enables complex problem solving through iterative reasoning and tool use
def search_wikipedia(query: str) -> list[str]:
    """Search Wikipedia for relevant information"""
    results = dspy.ColBERTv2(url="http://20.102.90.50:2017/wiki17_abstracts")(query, k=3)
    return [x["text"] for x in results]
def calculate(expression: str) -> float:
    """Evaluate mathematical expression"""
    return dspy.PythonInterpreter({}).execute(expression)
# Create ReAct agent with tools
react = dspy.ReAct("question -> answer", tools=[search_wikipedia, calculate])
result = react(question="What is 9362158 divided by the year David Gregory was born?")
ReAct agents can use multiple tools to solve complex problems
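The control flow behind a ReAct-style agent can be sketched as a loop of thought, action, and observation. This is an illustration only, not DSPy's implementation: `react_loop`, `scripted_policy`, and `multiply` are hypothetical, and the scripted policy stands in for the language model that would normally choose each step.

```python
from typing import Callable, Optional

Step = tuple[str, str, str, str]  # (thought, action, action input, observation)
Policy = Callable[[str, list[Step]], tuple[str, str, str]]


def react_loop(question: str, policy: Policy,
               tools: dict[str, Callable[[str], str]],
               max_steps: int = 5) -> tuple[Optional[str], list[Step]]:
    """Alternate between choosing an action and observing its result."""
    trajectory: list[Step] = []
    for _ in range(max_steps):
        thought, action, arg = policy(question, trajectory)
        if action == "finish":
            return arg, trajectory  # the policy decided it has the answer
        if action in tools:
            observation = tools[action](arg)
        else:
            observation = f"unknown tool: {action}"
        trajectory.append((thought, action, arg, observation))
    return None, trajectory  # step budget exhausted without an answer


def multiply(arg: str) -> str:
    a, b = (int(x) for x in arg.split(","))
    return str(a * b)


def scripted_policy(question: str, trajectory: list[Step]) -> tuple[str, str, str]:
    # Stand-in for an LM: call the tool once, then finish with its observation.
    if not trajectory:
        return ("I need the product first.", "multiply", "15,27")
    return ("I have the result.", "finish", trajectory[-1][3])


answer, steps = react_loop("What is 15 * 27?", scripted_policy, {"multiply": multiply})
print(answer)  # 405
```

Capping the loop with `max_steps` matters in practice, since a model-driven policy is not guaranteed to emit a finish action.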
Multiple perspectives lead to more robust answers
Multiple reasoning chains are compared to select the best result
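One simple way to combine several sampled reasoning chains is a majority vote over their final answers. The sketch below shows only that voting step and is a simpler stand-in for a module that asks an LM to compare chains; `normalize` and `majority_vote` are hypothetical helpers.

```python
from collections import Counter


def normalize(answer: str) -> str:
    """Lowercase and collapse whitespace so trivially different answers match."""
    return " ".join(answer.lower().split())


def majority_vote(answers: list[str]) -> tuple[str, float]:
    """Return the most common normalized answer and its share of the votes."""
    if not answers:
        raise ValueError("need at least one answer")
    counts = Counter(normalize(a) for a in answers)
    winner, votes = counts.most_common(1)[0]
    return winner, votes / len(answers)


# Final answers from three hypothetical reasoning chains.
print(majority_vote(["405", " 405 ", "415"]))
```

The vote share doubles as a rough agreement signal: a low share suggests the chains disagree and the answer deserves a second look.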
Program of Thought combines code generation with execution
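The generate-then-execute idea can be sketched as follows. The code string is hard-coded here as a stand-in for text a language model might produce, and `run_generated_code` is a hypothetical helper. Restricting builtins as shown is not a security sandbox; real systems run generated code in an isolated interpreter.

```python
def run_generated_code(code: str, result_name: str = "answer") -> object:
    """Execute code in a fresh namespace and return the variable `result_name`."""
    # Only a few builtins are exposed; this limits accidents, not attackers.
    namespace: dict[str, object] = {
        "__builtins__": {"range": range, "sum": sum, "len": len}
    }
    exec(code, namespace)
    if result_name not in namespace:
        raise KeyError(f"generated code did not define {result_name!r}")
    return namespace[result_name]


# Stand-in for model output answering: "What is the sum of multiples of 3 up to 100?"
generated = "answer = sum(n for n in range(1, 101) if n % 3 == 0)"
print(run_generated_code(generated))  # 1683
```

Delegating the arithmetic to the interpreter is the design point: the model only has to write correct code, not carry out the computation itself.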
Optimizers replace manual prompt engineering with automatic optimization
Each optimizer addresses different aspects of system optimization
import dspy
from dspy.datasets import HotPotQA
# Configure language model
dspy.configure(lm=dspy.LM("openai/gpt-4o-mini"))
# Define training data
trainset = [x.with_inputs('question') for x in HotPotQA(train_seed=2024, train_size=500).train]
# Create ReAct agent
react = dspy.ReAct("question -> answer", tools=[search_wikipedia])
# Optimize with MIPROv2
tp = dspy.MIPROv2(metric=dspy.evaluate.answer_exact_match, auto="light", num_threads=24)
optimized_react = tp.compile(react, trainset=trainset)
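In the informal run reported in the DSPy documentation, MIPROv2 raised this ReAct agent's HotPotQA score from 24% to 51% with gpt-4o-mini
MIPROv2 jointly optimizes each module's instructions and few-shot demonstrations
MIPROv2 works in stages: it bootstraps demonstrations, proposes candidate instructions, then searches over combinations with Bayesian optimization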
BootstrapFewShotWithRandomSearch improves demonstration selection by bootstrapping several candidate demo sets and keeping the best-scoring one
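The random-search half of that idea can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the optimizer's code: it skips the bootstrapping step (running a teacher program and keeping traces that pass the metric) and just samples demo subsets, scores each on a validation set, and keeps the best. `random_search_demos` and `lookup_program` are hypothetical.

```python
import random
from typing import Callable

Example = tuple[str, str]  # (question, gold answer)


def random_search_demos(
    candidates: list[Example],
    valset: list[Example],
    run_program: Callable[[list[Example], str], str],
    num_trials: int = 8,
    demos_per_trial: int = 2,
    seed: int = 0,
) -> tuple[list[Example], float]:
    """Try random demo subsets and return the one with the highest accuracy."""
    rng = random.Random(seed)
    best_demos: list[Example] = []
    best_score = -1.0
    for _ in range(num_trials):
        demos = rng.sample(candidates, k=min(demos_per_trial, len(candidates)))
        correct = sum(run_program(demos, q) == gold for q, gold in valset)
        score = correct / len(valset)
        if score > best_score:
            best_demos, best_score = demos, score
    return best_demos, best_score


def lookup_program(demos: list[Example], question: str) -> str:
    # Toy stand-in for an LM program: it can only answer what its demos contain.
    return dict(demos).get(question, "unknown")


pool = [("2+2", "4"), ("3*3", "9")]
print(random_search_demos(pool, pool, lookup_program, num_trials=3))
```

In a real setting the validation set should be disjoint from the candidate pool; the toy example reuses it only to keep the output easy to reason about.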
BootstrapFinetune fine-tunes model weights for better performance
GEPA evolves prompt instructions by having an LM reflect on execution traces and textual feedback
Different optimizers can work together for better results
DSPy supports a wide range of language model providers
# OpenAI configuration
lm_openai = dspy.LM("openai/gpt-5-mini", api_key="YOUR_API_KEY")
dspy.configure(lm=lm_openai)
# Anthropic configuration
lm_anthropic = dspy.LM("anthropic/claude-sonnet-4-5-20250929", api_key="YOUR_API_KEY")
dspy.configure(lm=lm_anthropic)
# Local Ollama configuration
lm_local = dspy.LM("ollama_chat/llama3.2:1b", api_base="http://localhost:11434", api_key="")
dspy.configure(lm=lm_local)
# Databricks configuration
lm_databricks = dspy.LM("databricks/databricks-llama-4-maverick",
                        api_key="YOUR_TOKEN",
                        api_base="YOUR_URL")
DSPy provides a unified API for different model providers
DSPy makes it easy to build RAG systems
class RAG(dspy.Module):
    def __init__(self, num_docs=5):
        super().__init__()
        self.num_docs = num_docs
        self.retrieve = dspy.ColBERTv2(url="http://20.102.90.50:2017/wiki17_abstracts")
        self.respond = dspy.ChainOfThought("context, question -> response")
    def forward(self, question):
        # Retrieve relevant documents and keep only their text
        context = [x["text"] for x in self.retrieve(question, k=self.num_docs)]
        # Generate response with context
        return self.respond(context=context, question=question)
DSPy RAG systems integrate retrieval and generation seamlessly
DSPy enables building complex multi-stage pipelines
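# Signatures used by the pipeline (field names are assumptions inferred from how forward() uses them)
class Outline(dspy.Signature):
    """Outline a thorough overview of a topic."""
    topic: str = dspy.InputField()
    title: str = dspy.OutputField()
    sections: list[str] = dspy.OutputField()
    section_subheadings: dict[str, list[str]] = dspy.OutputField(desc="mapping from section headings to subheadings")
class DraftSection(dspy.Signature):
    """Draft a top-level section of an article."""
    topic: str = dspy.InputField()
    section_heading: str = dspy.InputField()
    section_subheadings: list[str] = dspy.InputField()
    content: str = dspy.OutputField(desc="markdown-formatted section")
class ArticleGenerator(dspy.Module):
    def __init__(self):
        super().__init__()
        self.build_outline = dspy.ChainOfThought(Outline)
        self.draft_section = dspy.ChainOfThought(DraftSection)
    def forward(self, topic):
        # Generate article outline
        outline = self.build_outline(topic=topic)
        sections = []
        # Draft each section
        for heading, subheadings in outline.section_subheadings.items():
            section = self.draft_section(
                topic=outline.title,
                section_heading=f"## {heading}",
                section_subheadings=[f"### {s}" for s in subheadings]
            )
            sections.append(section.content)
        return dspy.Prediction(title=outline.title, sections=sections)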
Multi-stage pipelines can be optimized end-to-end
DSPy supports various classification scenarios
from typing import Literal
class SentimentClassification(dspy.Signature):
    """Classify sentiment of text with toxicity score."""
    text: str = dspy.InputField()
    sentiment: Literal["positive", "negative", "neutral"] = dspy.OutputField()
    toxicity: float = dspy.OutputField()
class Classifier(dspy.Module):
    def __init__(self):
        super().__init__()
        self.classify = dspy.Predict(SentimentClassification)
    def forward(self, text):
        return self.classify(text=text)
DSPy classifiers can return typed, structured outputs: here a constrained sentiment label plus a numeric toxicity score
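What typed outputs buy you is validation: a label outside the allowed set or a non-numeric score is caught instead of flowing downstream. The sketch below shows that check in plain Python. `parse_classification` is a hypothetical helper, and the requirement that toxicity lies in [0, 1] is an assumption, since the signature above only says `float`.

```python
from typing import Literal, get_args

Sentiment = Literal["positive", "negative", "neutral"]


def parse_classification(raw_sentiment: str, raw_toxicity: str) -> tuple[str, float]:
    """Validate raw model text against the declared output types."""
    label = raw_sentiment.strip().lower()
    if label not in get_args(Sentiment):
        raise ValueError(f"unexpected sentiment label: {raw_sentiment!r}")
    toxicity = float(raw_toxicity)  # raises ValueError on non-numeric text
    if not 0.0 <= toxicity <= 1.0:  # assumed range, not stated in the signature
        raise ValueError(f"toxicity out of range: {toxicity}")
    return label, toxicity


print(parse_classification(" Positive ", "0.02"))  # ('positive', 0.02)
```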
DSPy can extract structured information from text
class ExtractInfo(dspy.Signature):
    """Extract structured information from text."""
    text: str = dspy.InputField()
    title: str = dspy.OutputField()
    headings: list[str] = dspy.OutputField()
    entities: list[dict[str, str]] = dspy.OutputField(
        desc="a list of entities and their metadata"
    )
extractor = dspy.Predict(ExtractInfo)
text = "Apple Inc. announced its latest iPhone 14 today. The CEO, Tim Cook, highlighted its new features in a press release."
result = extractor(text=text)
print(result.entities) # e.g. [{'name': 'Apple Inc.', 'type': 'Organization'}, ...]
DSPy extractors can produce structured outputs from raw text
DSPy can generate and refine code
class CodeGenerator(dspy.Signature):
    """Generate Python code for given specification."""
    specification: str = dspy.InputField()
    code: str = dspy.OutputField(desc="Python code implementation")
    explanation: str = dspy.OutputField(desc="Code explanation")
class ProgramGenerator(dspy.Module):
    def __init__(self):
        super().__init__()
        # ChainOfThought returns the code as text; dspy.ProgramOfThought would instead execute generated code to compute the outputs
        self.generate = dspy.ChainOfThought(CodeGenerator)
    def forward(self, spec):
        return self.generate(specification=spec)
This module returns generated code as text; ProgramOfThought is the module that writes code and executes it to compute an answer
DSPy provides built-in metrics and supports custom evaluation
import dspy
# Built-in metrics
answer_exact_match = dspy.evaluate.answer_exact_match
semantic_f1 = dspy.evaluate.SemanticF1(decompositional=True)
# Custom evaluation function: DSPy calls metrics as metric(example, prediction, trace=None)
def custom_accuracy(example, prediction, trace=None):
    """Case-insensitive exact match on the answer field."""
    return prediction.answer.lower().strip() == example.answer.lower().strip()
# Evaluate system
# `devset` is a list of dspy.Example objects; `rag_system` is the program under test,
# and its predictions must expose an `answer` field for this metric
evaluator = dspy.Evaluate(devset=devset, num_threads=24, display_progress=True)
accuracy = evaluator(rag_system, metric=custom_accuracy)
Evaluation metrics help measure and improve system performance
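Exact-match metrics usually normalize text before comparing, so that "The Eiffel Tower." and "Eiffel Tower" count as the same answer. Here is a minimal sketch of that idea in plain Python. The helper names are hypothetical and the normalization rules (lowercase, strip punctuation and English articles) are a common convention assumed here, not a copy of DSPy's built-in metric.

```python
import re
import string


def normalize_text(text: str) -> str:
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, gold_answers: list[str]) -> bool:
    """True if the prediction matches any gold answer after normalization."""
    return normalize_text(prediction) in {normalize_text(g) for g in gold_answers}


def accuracy(pairs: list[tuple[str, list[str]]]) -> float:
    """Fraction of (prediction, gold answers) pairs that match exactly."""
    if not pairs:
        raise ValueError("need at least one pair")
    return sum(exact_match(pred, golds) for pred, golds in pairs) / len(pairs)


print(exact_match("The Eiffel Tower.", ["Eiffel Tower"]))  # True
```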
DSPy programs can run on different models without changes
# Define system once
math_system = dspy.ChainOfThought("question -> answer: float")
# Switch between models easily
# OpenAI
dspy.configure(lm=dspy.LM("openai/gpt-4o-mini"))
result_gpt = math_system(question="What is 15 * 27?")
# Anthropic
dspy.configure(lm=dspy.LM("anthropic/claude-sonnet-4-5-20250929"))
result_claude = math_system(question="What is 15 * 27?")
# Compare results
print(f"GPT-4o-mini: {result_gpt.answer}")
print(f"Claude: {result_claude.answer}")
Same system, different models - easy comparison and optimization
DSPy is built on solid research foundation
DSPy powers many state-of-the-art AI systems
DSPy has a thriving open-source ecosystem
Easy installation and comprehensive documentation
# Install DSPy
# pip install -U dspy
import dspy
# Configure language model
dspy.configure(lm=dspy.LM("openai/gpt-4o-mini"))
# Create a simple module
math = dspy.ChainOfThought("question -> answer: float")
# Use it
result = math(question="What is the square root of 144?")
print(f"Answer: {result.answer}")
print(f"Reasoning: {result.reasoning}")
Simple example demonstrates DSPy's ease of use
DSPy systems can be optimized for production use
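# `math` is the ChainOfThought module from the quick-start example above
# Caching: dspy.LM caches responses by default; cache=True makes that explicit
dspy.configure(lm=dspy.LM("openai/gpt-4o-mini", cache=True))
# Batch processing: run many inputs in parallel threads
def batch_process(questions):
    examples = [dspy.Example(question=q).with_inputs("question") for q in questions]
    return math.batch(examples, num_threads=8)
# Asynchronous execution: wrap the synchronous module so it can be awaited
import asyncio
async_math = dspy.asyncify(math)
async def async_process(question):
    return await async_math(question=question)
# asyncio.run(async_process("What is 15 * 27?"))
Caching, parallel batching, and asynchronous execution reduce latency and cost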
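The effect of response caching can be illustrated with a small wrapper that keys each call on the model name, prompt, and parameters. This is a sketch of the general technique, not DSPy's cache: `CachedLM` and `fake_model` are hypothetical, and the fake model stands in for a real API call.

```python
import hashlib
import json
from typing import Callable


class CachedLM:
    """Memoize model calls keyed on (model, prompt, params)."""

    def __init__(self, call_model: Callable[..., str]) -> None:
        self._call_model = call_model
        self._cache: dict[str, str] = {}
        self.hits = 0
        self.misses = 0

    @staticmethod
    def _key(model: str, prompt: str, params: dict) -> str:
        # sort_keys makes the key independent of keyword-argument order.
        payload = json.dumps(
            {"model": model, "prompt": prompt, "params": params}, sort_keys=True
        )
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()

    def __call__(self, model: str, prompt: str, **params) -> str:
        key = self._key(model, prompt, params)
        if key in self._cache:
            self.hits += 1
            return self._cache[key]
        self.misses += 1
        response = self._call_model(model, prompt, **params)
        self._cache[key] = response
        return response


def fake_model(model: str, prompt: str, **params) -> str:
    # Stand-in for a network call to a model provider.
    return f"{model}:{prompt.upper()}"


lm = CachedLM(fake_model)
lm("demo-model", "hello")
lm("demo-model", "hello")
print(lm.hits, lm.misses)  # 1 1
```

Including the parameters in the key matters: the same prompt at a different temperature is a different request and should not reuse a cached response.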
DSPy provides tools for debugging and monitoring
DSPy offers many advanced features for complex systems
DSPy represents a paradigm shift in AI system design
| Feature | Traditional Prompting | DSPy |
|---|---|---|
| Code Structure | String manipulation | Declarative modules |
| Optimization | Manual tweaking | Automatic optimization |
| Portability | Model-specific | Cross-model |
| Maintenance | Brittle prompts | Structured code |
| Performance | Manual optimization | Algorithmic optimization |
Follow best practices for better results
DSPy continues to evolve with new capabilities
DSPy represents the future of AI system development
Thanks for reading!
Visit https://atcfu.com/ai-articles/dspy-programming-language-models/ to revisit this article