Module 6 Resources — LLM APIs (Python)#
CodeVision Academy
This page provides additional resources for deepening your understanding of building robust LLM API integrations.
Official Documentation#
Python HTTP Libraries#
Requests Library Documentation — The de facto standard for HTTP in Python. Essential reading for understanding timeouts, sessions, and error handling.
HTTPX Documentation — Modern async-capable HTTP client. Good for high-concurrency LLM applications.
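HTTPX's async client is what makes high-concurrency LLM calls practical. Below is a minimal sketch of firing several prompts concurrently with `httpx.AsyncClient`; the endpoint URL and the `prompt`/`text` payload fields are hypothetical placeholders, so adapt them to whichever provider you use.

```python
import asyncio

import httpx

API_URL = "https://api.example.com/v1/generate"  # hypothetical endpoint


async def call_llm(client: httpx.AsyncClient, prompt: str) -> str:
    # Explicit timeouts: 5 s to connect, 60 s for a slow generation.
    resp = await client.post(
        API_URL,
        json={"prompt": prompt},
        timeout=httpx.Timeout(60.0, connect=5.0),
    )
    resp.raise_for_status()
    return resp.json().get("text", "")


async def main() -> None:
    prompts = ["Summarize A", "Summarize B", "Summarize C"]
    async with httpx.AsyncClient() as client:
        # gather() overlaps the network waits instead of running them one by one.
        results = await asyncio.gather(*(call_llm(client, p) for p in prompts))
    for prompt, result in zip(prompts, results):
        print(prompt, "->", result)


if __name__ == "__main__":
    asyncio.run(main())
```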
LLM Provider APIs#
OpenAI API Reference — The most widely used LLM API format. Many providers follow this schema.
Ollama API Documentation — Local LLM server API used in this course.
Anthropic Claude API — Alternative LLM provider with similar patterns.
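Since this course runs against a local Ollama server, here is a minimal non-streaming call sketch, assuming Ollama is listening on its default port 11434 and a model such as `llama3` has already been pulled:

```python
import requests

OLLAMA_URL = "http://localhost:11434/api/generate"  # default local Ollama port


def ask_ollama(prompt: str, model: str = "llama3") -> str:
    resp = requests.post(
        OLLAMA_URL,
        json={"model": model, "prompt": prompt, "stream": False},
        timeout=(5, 120),  # connect quickly, but allow a slow generation
    )
    resp.raise_for_status()
    # Non-streaming responses return the full completion in the "response" field.
    return resp.json()["response"]


if __name__ == "__main__":
    print(ask_ollama("Explain exponential backoff in one sentence."))
```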
Reliability Patterns#
Retry & Backoff#
Exponential Backoff (Wikipedia) — Conceptual overview of the retry strategy.
Tenacity Library — Production-ready retry library for Python with decorators.
Backoff Library — Lightweight alternative for retry with backoff.
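As a quick illustration of the decorator style these libraries offer, here is a sketch using Tenacity; the endpoint, model name, and retry budget are illustrative assumptions, not fixed recommendations.

```python
import requests
from tenacity import retry, retry_if_exception_type, stop_after_attempt, wait_exponential


# Retry only transient network errors; give up after 4 attempts total.
@retry(
    retry=retry_if_exception_type(requests.exceptions.RequestException),
    wait=wait_exponential(multiplier=1, min=1, max=30),  # exponential backoff, capped at 30 s
    stop=stop_after_attempt(4),
    reraise=True,  # surface the original exception instead of a RetryError
)
def call_llm(prompt: str) -> dict:
    resp = requests.post(
        "http://localhost:11434/api/generate",  # assumed local Ollama endpoint
        json={"model": "llama3", "prompt": prompt, "stream": False},
        timeout=(5, 60),
    )
    resp.raise_for_status()
    return resp.json()
```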
Circuit Breakers#
Circuit Breaker Pattern (Martin Fowler) — Pattern for preventing cascade failures in distributed systems.
PyBreaker Library — Python implementation of the circuit breaker pattern.
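A sketch of how PyBreaker might wrap an LLM call is shown below; the failure thresholds, endpoint, and model name are assumptions you would tune for your own service.

```python
from typing import Optional

import pybreaker
import requests

# After 5 consecutive failures the breaker opens; calls then fail fast for 60 s
# before a single trial call is allowed through (half-open state).
llm_breaker = pybreaker.CircuitBreaker(fail_max=5, reset_timeout=60)


@llm_breaker
def call_llm(prompt: str) -> dict:
    resp = requests.post(
        "http://localhost:11434/api/generate",  # assumed local Ollama endpoint
        json={"model": "llama3", "prompt": prompt, "stream": False},
        timeout=(5, 60),
    )
    resp.raise_for_status()
    return resp.json()


def safe_call(prompt: str) -> Optional[dict]:
    try:
        return call_llm(prompt)
    except pybreaker.CircuitBreakerError:
        # Breaker is open: skip the request instead of piling on a failing service.
        return None
```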
JSON & Validation#
JSON Schema#
JSON Schema — Formal specification for validating JSON structure.
jsonschema Library — Python library for JSON Schema validation.
Pydantic Documentation — Data validation using Python type hints. Excellent for LLM response validation.
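For example, a Pydantic model can reject malformed LLM output in a single call. The sketch below assumes Pydantic v2 (`model_validate_json`); with v1 you would use `parse_raw` instead, and the schema itself is only an example.

```python
from pydantic import BaseModel, Field, ValidationError


class Sentiment(BaseModel):
    # The schema the LLM is asked to follow; validation rejects anything else.
    label: str = Field(pattern="^(positive|negative|neutral)$")
    confidence: float = Field(ge=0.0, le=1.0)


raw = '{"label": "positive", "confidence": 0.92}'  # e.g. an LLM's JSON reply

try:
    result = Sentiment.model_validate_json(raw)  # Pydantic v2 API
    print(result.label, result.confidence)
except ValidationError as exc:
    # Treat schema violations like any other bad response: log, then retry or fall back.
    print("Invalid LLM output:", exc)
```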
Structured Output from LLMs#
OpenAI JSON Mode — Native JSON mode support in OpenAI API.
Instructor Library — Library for extracting structured data from LLMs using Pydantic.
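As a rough sketch of JSON mode with the official OpenAI SDK (Instructor layers Pydantic models on top of the same idea): the model name here is an assumption, and JSON mode only guarantees syntactically valid JSON, not a particular schema.

```python
import json

from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

resp = client.chat.completions.create(
    model="gpt-4o-mini",  # assumed model name; substitute whatever your account offers
    response_format={"type": "json_object"},  # JSON mode: output parses as JSON
    messages=[
        {
            "role": "system",
            "content": "Reply with a JSON object containing 'label' and 'confidence'.",
        },
        {"role": "user", "content": "Classify the sentiment of: 'Great product!'"},
    ],
)

data = json.loads(resp.choices[0].message.content)
print(data)
```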
Testing#
Mocking in Python#
unittest.mock Documentation — Python’s built-in mocking library.
pytest-mock — Pytest plugin for mocking.
responses Library — Mock HTTP requests in Python tests.
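A small sketch of the `responses` library in action: it intercepts `requests` calls during a test, so no real LLM server is needed. The URL and payload mirror the (assumed) Ollama endpoint used elsewhere in this module.

```python
import requests
import responses


@responses.activate
def test_client_parses_completion():
    # Register a canned reply for the URL the client will call.
    responses.add(
        responses.POST,
        "http://localhost:11434/api/generate",
        json={"response": "mocked completion"},
        status=200,
    )

    resp = requests.post(
        "http://localhost:11434/api/generate",
        json={"model": "llama3", "prompt": "hi", "stream": False},
        timeout=(5, 60),
    )
    assert resp.json()["response"] == "mocked completion"
```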
Testing Best Practices#
Testing External APIs (Real Python) — Guide to testing code that calls external APIs.
Production Concerns#
Logging#
Python Logging Documentation — Standard library logging module.
structlog Library — Structured logging for Python applications.
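A minimal structlog sketch for LLM calls, emitting key-value pairs (model, prompt size, latency) instead of free-form strings; the field names and model are illustrative choices, not a required schema.

```python
import time

import structlog

log = structlog.get_logger()


def call_llm(prompt: str) -> str:
    start = time.monotonic()
    # ... perform the real API request here ...
    reply = "stub reply"  # placeholder for the provider's response
    log.info(
        "llm_call",                      # event name
        model="llama3",                  # assumed model
        prompt_chars=len(prompt),
        latency_ms=round((time.monotonic() - start) * 1000, 1),
    )
    return reply
```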
Monitoring & Observability#
OpenTelemetry Python — Observability framework for tracing API calls.
Prometheus Python Client — Metrics collection for monitoring.
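A minimal Prometheus sketch instrumenting an LLM client with a latency histogram and an error counter; the metric names and the scrape port are illustrative assumptions.

```python
from prometheus_client import Counter, Histogram, start_http_server

LLM_LATENCY = Histogram("llm_request_seconds", "LLM request latency in seconds")
LLM_ERRORS = Counter("llm_request_errors_total", "Failed LLM requests")


def call_llm(prompt: str) -> str:
    with LLM_LATENCY.time():  # records elapsed time into the histogram
        try:
            # ... real HTTP call goes here ...
            return "stub reply"
        except Exception:
            LLM_ERRORS.inc()
            raise


if __name__ == "__main__":
    start_http_server(8000)  # exposes /metrics for Prometheus to scrape
    call_llm("hello")
```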
Cost Management#
OpenAI Tokenizer — Tool for counting tokens to estimate costs.
tiktoken Library — Fast BPE tokenizer for token counting in Python.
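A short tiktoken sketch for estimating prompt cost before sending a request; the model name and the per-token price are placeholders, so check your provider's tokenizer and current pricing.

```python
import tiktoken

# encoding_for_model maps an OpenAI model name to its tokenizer.
try:
    enc = tiktoken.encoding_for_model("gpt-4o-mini")  # example model name
except KeyError:
    enc = tiktoken.get_encoding("cl100k_base")  # fall back to a common encoding

prompt = "Summarize the following support ticket in two sentences: ..."
n_tokens = len(enc.encode(prompt))

# Hypothetical price, purely for illustration.
price_per_1k_input_tokens = 0.0005
print(f"{n_tokens} tokens ≈ ${n_tokens / 1000 * price_per_1k_input_tokens:.6f}")
```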
Enterprise Patterns#
API Design#
REST API Design Best Practices — Principles that apply to LLM API client design.
API Gateway Patterns — Patterns for managing API access at scale.
Security#
OWASP API Security — Security considerations for API integrations.
Python Secrets Management — Secure handling of API keys.
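A minimal sketch of keeping API keys out of source code by reading them from the environment; the variable name `LLM_API_KEY` is a placeholder.

```python
import os


def load_api_key(var_name: str = "LLM_API_KEY") -> str:
    # Keep keys out of source control: read them from the environment
    # (populated by a .env file, CI secret store, or deployment platform).
    key = os.environ.get(var_name)
    if not key:
        raise RuntimeError(f"{var_name} is not set; export it before running the client.")
    return key
```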
Books & In-Depth Reading#
“Designing Data-Intensive Applications” by Martin Kleppmann — Chapter on distributed systems reliability applies directly to LLM integration.
“Release It!” by Michael Nygard — Patterns for building resilient production systems.
“AI Engineering: Building Applications with Foundation Models” by Chip Huyen — Modern guide to LLM application development.
Tools Used in This Module#
| Tool | Purpose | Installation |
|---|---|---|
| `requests` | HTTP client | `pip install requests` |
| `json` | JSON parsing | Built-in |
| `dataclasses` | Structured data | Built-in (Python 3.7+) |
| `typing` | Type hints | Built-in |
Practice Projects#
Build a CLI LLM client — Create a command-line tool that takes prompts and returns structured responses with proper error handling.
Implement a caching layer — Build a response cache with TTL (time-to-live) expiration for LLM calls; a minimal TTL-cache sketch appears after this list.
Create a mock server — Build a simple Flask/FastAPI mock server that simulates LLM API responses for testing.
Add monitoring — Instrument an LLM client with metrics (latency histograms, error rates, token counts).
Build a RAG pipeline — Combine Module 5 (embeddings) and Module 6 (LLM API) skills to build a complete question-answering system.
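For the caching-layer project, a minimal in-memory TTL cache (pure standard library) could look like the sketch below; a production version would add size limits, persistence, and thread safety.

```python
import time
from typing import Callable, Dict, Tuple


class TTLCache:
    """Tiny in-memory cache keyed by prompt, with per-entry expiration."""

    def __init__(self, ttl_seconds: float = 300.0) -> None:
        self.ttl = ttl_seconds
        self._store: Dict[str, Tuple[float, str]] = {}  # prompt -> (expiry, reply)

    def get_or_call(self, prompt: str, fn: Callable[[str], str]) -> str:
        now = time.monotonic()
        cached = self._store.get(prompt)
        if cached and cached[0] > now:
            return cached[1]  # still fresh: skip the API call entirely
        reply = fn(prompt)
        self._store[prompt] = (now + self.ttl, reply)
        return reply


# Usage: cache = TTLCache(ttl_seconds=600); cache.get_or_call(prompt, call_llm)
```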
Quick Reference#
Retry Logic Template#
```python
import time


def retry_with_backoff(fn, max_retries=3, base_delay=1.0):
    """Call fn(), retrying failures with exponential backoff (1s, 2s, 4s, ...)."""
    for attempt in range(max_retries + 1):
        try:
            return fn()
        except Exception:
            if attempt == max_retries:
                raise  # out of retries: re-raise the last error
            time.sleep(base_delay * (2 ** attempt))
```
JSON Validation Template#
```python
import json


def validate_response(text, required_fields):
    """Parse an LLM reply as JSON and check that all required keys are present."""
    data = json.loads(text.strip())  # raises json.JSONDecodeError on malformed output
    missing = [f for f in required_fields if f not in data]
    if missing:
        raise ValueError(f"Missing fields: {missing}")
    return data
```
Timeout Pattern#
```python
import requests

response = requests.post(
    url,               # endpoint defined elsewhere
    json=payload,      # request body defined elsewhere
    timeout=(5, 60),   # (connect timeout, read timeout) in seconds
)
```