Read the blog for long-form explanations (plain language, written like a serious article). Notebook export is the auto-generated HTML from Jupyter—useful for code cells. GitHub opens the .ipynb to run or edit locally.
Chunking (fixed, semantic, hierarchical) and embedding models—explained for students and production engineers. Start with Part 1.
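As a rough illustration of the "fixed" strategy named above, here is a minimal sketch of fixed-size chunking with overlap. It assumes character-based windows; the function name `fixed_size_chunks` and the `size` and `overlap` parameters are hypothetical and not taken from the series notebooks, which may well count tokens instead.

```python
def fixed_size_chunks(text, size=200, overlap=50):
    """Split text into windows of `size` characters that share `overlap` characters.

    Assumption: character counts stand in for token counts to keep the sketch
    dependency-free.
    """
    if size <= 0:
        raise ValueError("size must be positive")
    if not 0 <= overlap < size:
        raise ValueError("overlap must satisfy 0 <= overlap < size")

    step = size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + size])
        if start + size >= len(text):
            break  # the last window already reached the end of the text
    return chunks


example_chunks = fixed_size_chunks("abcdefghij", size=4, overlap=1)
```

The overlap repeats the tail of one chunk at the head of the next, so a sentence cut at a boundary still appears whole in at least one chunk, provided it is shorter than the overlap.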
Structural chunking, boundaries, and why “relevant” fragments still miss the exception clause. Start with Part 1; run the notebook for code.
Hybrid BM25 + vectors, fusion (RRF), and logging which leg saved the query. Start with Part 1.
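For readers who want the fusion step in concrete form, the following is a minimal sketch of Reciprocal Rank Fusion, where each document scores the sum of 1 / (k + rank) over the ranked lists it appears in. The function name, the leg names `bm25` and `vector`, and the document IDs are illustrative assumptions; k = 60 is a commonly used default, not a value taken from the series.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse ranked lists of document IDs with Reciprocal Rank Fusion.

    `rankings` maps a leg name (e.g. "bm25") to a list of doc IDs, best first.
    Returns the fused order plus each leg's score contribution per document,
    which is the part worth logging to see which leg surfaced a result.
    """
    scores = defaultdict(float)
    contributions = defaultdict(dict)
    for leg, ranked_ids in rankings.items():
        for rank, doc_id in enumerate(ranked_ids, start=1):
            part = 1.0 / (k + rank)
            scores[doc_id] += part
            contributions[doc_id][leg] = part
    # Sort by descending score; break ties by ID so the order is deterministic.
    fused = sorted(scores, key=lambda d: (-scores[d], d))
    return fused, dict(contributions)


# Hypothetical result lists from the two retrieval legs.
rankings = {
    "bm25": ["doc_a", "doc_b", "doc_c"],
    "vector": ["doc_c", "doc_a", "doc_d"],
}
fused, contributions = reciprocal_rank_fusion(rankings)
```

Because RRF uses only ranks, it does not need BM25 scores and cosine similarities to be on a comparable scale. A document whose entry in `contributions` has a single key was found by only one leg.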
Retrieve wide, rerank narrow: quality vs latency vs GPU memory. Start with Part 1.
Enforce authorization in retrieval: filters, audits, cross-tenant tests. Start with Part 1.
Similarity + version tags, TTL, false hits: cache without lying. Start with Part 1.
Incremental updates, stable IDs, deletes, ingestion lag. Start with Part 1.
Cheap faithfulness checks, abstention, and rubrics, before the LLM sounds sure. Start with Part 1.
Recall@k, slicing, golden sets: retrieval vs generation eval. Start with Part 1.
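To make the retrieval-side metric concrete, here is a minimal sketch of Recall@k averaged per slice over a small golden set. The record fields (`slice`, `retrieved`, `relevant`), the slice names, and the document IDs are made up for illustration and are not the notebook's actual schema.

```python
def recall_at_k(retrieved_ids, relevant_ids, k):
    """Fraction of the relevant documents that appear in the top-k retrieved."""
    relevant = set(relevant_ids)
    if not relevant:
        raise ValueError("each golden example needs at least one relevant ID")
    hits = relevant.intersection(retrieved_ids[:k])
    return len(hits) / len(relevant)


def recall_by_slice(examples, k):
    """Average Recall@k separately for each slice of the golden set."""
    per_slice = {}
    for ex in examples:
        score = recall_at_k(ex["retrieved"], ex["relevant"], k)
        per_slice.setdefault(ex["slice"], []).append(score)
    return {name: sum(vals) / len(vals) for name, vals in per_slice.items()}


# Hypothetical golden set: queries tagged with a slice and labeled relevant docs.
golden_set = [
    {"slice": "policy", "retrieved": ["d1", "d7", "d3"], "relevant": ["d1", "d3"]},
    {"slice": "policy", "retrieved": ["d9", "d2", "d4"], "relevant": ["d4", "d5"]},
    {"slice": "billing", "retrieved": ["d6", "d8", "d2"], "relevant": ["d8"]},
]
report = recall_by_slice(golden_set, k=2)
```

Reporting per slice rather than one global average keeps a weak category from being hidden by a strong one. This measures retrieval only; whether the generated answer is faithful needs a separate evaluation.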
Untrusted chunks, layered defenses, PII boundaries. Start with Part 1.