Module 3 Additional Resources
LLM Gateway Access
How This Module Works
All LLM access in this module goes through a gateway abstraction: a centralised HTTPS endpoint that handles authentication, rate limiting, and model routing. This mirrors enterprise practice, where applications call a shared gateway rather than each model provider directly.
Default configuration:
LLM_BASE_URL - https://jbchat.jonbowden.com.ngrok.app (JBChat gateway)
LLM_API_KEY - provided by your instructor (optional)
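The default configuration above could be loaded in a notebook roughly as follows. This is a minimal sketch, not the module's own setup code: the helper name `load_gateway_config` and the `X-API-Key` header name are assumptions made for illustration, so use whatever header your instructor specifies.

```python
import os

# Default taken from the configuration list above.
DEFAULT_BASE_URL = "https://jbchat.jonbowden.com.ngrok.app"


def load_gateway_config(env=None):
    """Return (base_url, headers) built from LLM_BASE_URL and LLM_API_KEY."""
    env = os.environ if env is None else env
    base_url = env.get("LLM_BASE_URL", DEFAULT_BASE_URL).rstrip("/")
    api_key = env.get("LLM_API_KEY", "")

    headers = {"Content-Type": "application/json"}
    if api_key:
        # Assumption: "X-API-Key" is a hypothetical header name.
        # The key is optional, so it is only sent when one is set.
        headers["X-API-Key"] = api_key
    return base_url, headers


base_url, headers = load_gateway_config()
print(base_url)
```

Reading both values from the environment keeps the key out of the notebook itself, which matters if the notebook is shared.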
Models Used in This Module
phi3:mini - Microsoft’s compact model (mandatory)
llama3.2:1b - Meta’s efficient small model (optional comparison)
Official Documentation
API References
Python Requests Library - HTTP for Humans
ngrok Documentation - Secure tunnelling for HTTPS access
JBChat Endpoints
POST /chat/direct - Direct LLM chat (used in this module)
POST /chat - Chat with RAG support
POST /upload - Upload files for RAG
GET /files - File management
GET /search - Search indexed content
GET /health - Health check
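As an illustration of the endpoint used in this module, the sketch below builds and sends a request to POST /chat/direct with the Requests library. The request schema is not documented on this page, so the payload field names `model` and `message` are assumptions, as are the helper names `build_chat_request` and `send_chat`; check the course notebook for the real schema before relying on it.

```python
BASE_URL = "https://jbchat.jonbowden.com.ngrok.app"


def build_chat_request(prompt, model="phi3:mini", base_url=BASE_URL):
    """Return (url, payload) for a direct chat call.

    Assumption: the field names "model" and "message" are hypothetical.
    """
    url = f"{base_url.rstrip('/')}/chat/direct"
    payload = {"model": model, "message": prompt}
    return url, payload


def send_chat(prompt, model="phi3:mini", timeout=60):
    """POST the prompt to the gateway and return the raw response text."""
    # Third-party library; imported here so build_chat_request
    # can be used and tested without it installed.
    import requests

    url, payload = build_chat_request(prompt, model)
    response = requests.post(url, json=payload, timeout=timeout)
    response.raise_for_status()  # surface HTTP errors instead of hiding them
    return response.text


url, payload = build_chat_request("Explain tokenisation in one sentence.")
print(url)
print(payload)
```

Passing `model="llama3.2:1b"` to the same helper would give the optional comparison run, again assuming the gateway selects the model from the payload.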
Enterprise AI Concepts
OpenAI Best Practices - Prompt engineering guide
Anthropic Prompt Engineering - Claude-specific techniques
Video Tutorials
LLM Fundamentals
But what is a GPT? Visual intro to transformers - 3Blue1Brown
Intro to Large Language Models - Andrej Karpathy
Ollama Setup
Running Ollama Locally and Accessing It from Google Colab via Pinggy - CodeVision tutorial showing how to set up Ollama on your laptop and expose it via Pinggy tunnel for use in Colab notebooks
Run LLMs Locally with Ollama - Getting started guide
Ollama Tutorial - Comprehensive walkthrough
Further Reading
Enterprise AI Safety
AI Alignment Research - Anthropic’s safety research
NIST AI Risk Management Framework - Risk management standards
Hallucinations and Reliability
Survey of Hallucination in NLG - Academic paper on hallucinations
Factuality in LLMs - Research on accuracy
Practice Platforms
LangChain - Framework for LLM applications
LlamaIndex - Data framework for LLM apps
HuggingFace - Model hub and experimentation
Preparing for Module 5 (RAG)
Module 5 will cover Retrieval-Augmented Generation (RAG). To prepare:
Understand embeddings - How text is converted to vectors
Learn about vector databases - FAISS, Chroma, Pinecone
Review chunking strategies - How to split documents for retrieval
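To make the preparation topics above concrete, here is a small pure-Python sketch of two of them: fixed-size chunking with overlap, and cosine similarity, which is one common way to compare embedding vectors. The function names and default sizes are illustrative assumptions, not Module 5's actual code, and real embeddings would come from a model rather than hand-written lists.

```python
import math


def chunk_text(text, chunk_size=200, overlap=50):
    """Split text into character chunks where neighbours share `overlap` characters."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    step = chunk_size - overlap
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this chunk already reaches the end of the text
        start += step
    return chunks


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


print(chunk_text("abcdefghij", chunk_size=4, overlap=2))
# prints ['abcd', 'cdef', 'efgh', 'ghij']
print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))
# prints 0.0
```

Overlap exists so that a sentence cut at a chunk boundary still appears whole in at least one chunk; vector databases such as FAISS or Chroma then perform the similarity search at scale.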
Preview Resources
RAG Explained - Introduction to RAG
FAISS Documentation - Facebook’s vector search library