Personal Learning Agent — Michael Ku Jr

Overview

I was drowning in AI/ML content — YouTube lectures, papers, docs — and re-watching the same explanations because I couldn’t track what I’d already learned. So I built a personal learning agent: a multi-agent RAG system that ingests that content, organizes it into a knowledge base, and answers my questions with citations back to the exact moment in the source video.

It writes everything into my Obsidian vault, so the knowledge base is something I actually own and can browse, not a black box.
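
As a rough illustration of what writing into the vault can look like, here is a minimal sketch that renders one concept as a Markdown note with YAML front matter and Obsidian-style wikilinks, then saves it to a folder. The function names (`render_note`, `write_note`), the note layout, and the `VIDEO_ID` placeholder are assumptions for illustration, not the project's actual writer.

```python
from pathlib import Path
import tempfile


def render_note(title, summary, source_url, related):
    """Render one concept as Markdown with front matter and [[wikilinks]]."""
    links = "\n".join(f"- [[{name}]]" for name in related)
    return (
        "---\n"
        f"source: {source_url}\n"
        "---\n"
        f"# {title}\n\n"
        f"{summary}\n\n"
        "## Related\n"
        f"{links}\n"
    )


def write_note(vault_dir, title, body):
    """Write the note as <title>.md inside the vault folder and return its path."""
    path = Path(vault_dir) / f"{title}.md"
    path.write_text(body, encoding="utf-8")
    return path


# A temporary folder stands in for the real vault (hypothetical layout).
vault = tempfile.mkdtemp()
note = render_note(
    "Cross-encoder",
    "Scores a query and a passage jointly.",
    "https://www.youtube.com/watch?v=VIDEO_ID",  # placeholder video id
    ["Reranking", "Bi-encoder"],
)
note_path = write_note(vault, "Cross-encoder", note)
```

Because each note is a plain Markdown file, the wikilinks under "Related" become navigable links once the folder is opened as a vault.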

The problem

Learning from scattered video content has two failure modes: you can’t easily find the 90 seconds that answers your question, and you have no memory of what you’ve already covered — so you waste time re-learning. I wanted an agent that retrieves precisely and tracks coverage.

Approach

Ingestion pipeline — pulls content (YouTube transcripts first) and structures it into a hybrid knowledge base organized by both concepts and tools, so retrieval works whether I ask “what is a cross-encoder?” or “how do I use LangGraph?”
Adaptive answers with timestamps — responses cite the source video and the exact timestamp, and a watch-tracking layer records what I’ve watched so answers don’t resurface material I’ve already covered.
LangChain → LangGraph refactor — I first built the pipeline in LangChain, then deliberately rebuilt it in LangGraph to benchmark the tradeoffs of explicit graph orchestration against chain-based flow.
Claude + Chroma + Obsidian — Claude (via Claude Code) for reasoning, a Chroma vector store for retrieval, and an Obsidian vault writer so the output is durable, linkable notes.
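
The timestamped citations and watch tracking described above can be sketched in a few lines of standard-library Python. This is a simplified stand-in under stated assumptions: the `Chunk` and `WatchLog` types, the `cite` helper, and the video ids are all hypothetical, and the hard-coded `chunks` list takes the place of results that would really come from the vector store.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    """One transcript segment (hypothetical shape)."""
    video_id: str
    start_seconds: int
    text: str


def cite(chunk):
    """Format a citation that deep-links to the segment's start time."""
    minutes, seconds = divmod(chunk.start_seconds, 60)
    url = f"https://www.youtube.com/watch?v={chunk.video_id}&t={chunk.start_seconds}s"
    return f"[{minutes}:{seconds:02d}] {url}"


class WatchLog:
    """Remembers which segments have already been shown."""

    def __init__(self):
        self._seen = set()

    def mark_seen(self, chunk):
        self._seen.add((chunk.video_id, chunk.start_seconds))

    def unseen(self, chunks):
        """Keep only segments that have not been marked as seen."""
        return [c for c in chunks if (c.video_id, c.start_seconds) not in self._seen]


# Stand-in for retrieval results; the ids are placeholders.
chunks = [
    Chunk("VIDEO_A", 95, "A cross-encoder scores the query and passage together."),
    Chunk("VIDEO_B", 610, "LangGraph models the agent as a graph of nodes."),
]

log = WatchLog()
log.mark_seen(chunks[0])      # the first segment was already watched
fresh = log.unseen(chunks)    # only the second segment remains
print(cite(fresh[0]))
```

Keying the watch log on the video id plus start time keeps the filter independent of how the text is chunked or embedded.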

What I learned

Rebuilding the same system twice — once in LangChain, once in LangGraph — taught me more about agent orchestration than any tutorial could. LangGraph’s explicit state machine makes branching and memory legible; you can see exactly where the agent is and why. That legibility is what makes multi-agent behavior debuggable instead of magic.
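
To make the idea of an explicit state machine concrete, here is a minimal plain-Python sketch of graph-style orchestration: named nodes, a routing function that picks the next node from the current state, and a trace of every step taken. It deliberately does not use the LangGraph API; the node names, the substring-match "retrieval", and the state keys are all assumptions chosen to keep the example self-contained.

```python
def retrieve(state):
    """Toy retrieval: keep docs that contain the question text."""
    question = state["question"].lower()
    hits = [doc for doc in state["docs"] if question in doc.lower()]
    return {**state, "hits": hits}


def answer(state):
    return {**state, "reply": f"Found {len(state['hits'])} matching note(s)."}


def fallback(state):
    return {**state, "reply": "Nothing in the knowledge base yet."}


def route(state):
    """Conditional edge: branch on whether retrieval found anything."""
    return "answer" if state["hits"] else "fallback"


NODES = {"retrieve": retrieve, "answer": answer, "fallback": fallback}
# Each node maps to a router that names the next node, or None to stop.
EDGES = {"retrieve": route, "answer": None, "fallback": None}


def run(state, start="retrieve"):
    """Walk the graph, recording which node ran at each step."""
    current, trace = start, []
    while current is not None:
        trace.append(current)
        state = NODES[current](state)
        router = EDGES[current]
        current = router(state) if router else None
    return {**state, "trace": trace}


docs = ["Cross-encoder reranking notes", "LangGraph basics"]
result = run({"question": "cross-encoder", "docs": docs})
print(result["trace"], result["reply"])
```

The `trace` list is the point of the exercise: at any moment the path the agent took through the graph, and the state that drove each branch, can be read directly.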

What’s next

A Discord surface so I can query the knowledge base conversationally, and spaced-repetition prompts driven by the watch-tracking data.