Me brain think how more productive context
This is not a blog post. Just a random note of something I want to build in my free time.
Core Problem
I capture information everywhere: articles I read, code I write, conversations I have, tasks I track. But when I need that context later, I either can't find it or have to manually piece it together. LLMs could help, but they don't know about my preferences, past work, or the connections between my scattered data. I want a system where I throw everything in once, and the right context automatically surfaces when any tool or agent needs it, without me having to explain myself repeatedly.
Storage Layer
Personal State
- Schedule, active projects, screen time metrics
- Preferences: languages (spoken/code), tooling choices, writing style
- Auto-discover preferences via integrations (GitHub repo analysis for framework patterns, language frequency)
Documents & Artifacts
- Articles read, research outputs (ChatGPT/Gemini), pinned Discord messages
- Lightweight references to external stores (e.g., commit messages → pointer to file@revision@git-repo, project summary → repo link)
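Rough sketch of what a lightweight reference could look like: a small record that stores a pointer, not the content. This assumes the file@revision@git-repo notation is taken literally; the `ArtifactRef` name and its fields are made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ArtifactRef:
    """Pointer to content that lives in an external store (hypothetical shape)."""
    path: str      # file inside the repo
    revision: str  # commit hash or tag
    repo: str      # repo name or URL

    @classmethod
    def parse(cls, ref: str) -> "ArtifactRef":
        # Split on the first two "@" only, so the repo part may itself
        # contain "@" (e.g. an SSH-style URL).
        path, revision, repo = ref.split("@", 2)
        return cls(path, revision, repo)

    def __str__(self) -> str:
        return f"{self.path}@{self.revision}@{self.repo}"


ref = ArtifactRef.parse("src/auth.py@a1b2c3d@my-api")
```

The store only keeps the string form; the actual file is fetched from the repo when something needs it.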
Usage Layer
Context-Aware Prompting
- Generate dynamic system prompts per session (e.g., OpenCode picks my preferred language without asking)
- Clarifying questions store answers permanently—never ask twice
- Example: "No language stated → use favorite for use case. Unknown use case → ask once, remember forever"
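The "ask once, remember forever" rule could be sketched like this. Everything here is an assumption: `PreferenceStore` is an in-memory stand-in for the personal-state store, and the `language/<use case>` key format is invented for the example.

```python
from typing import Callable, Dict, Optional


class PreferenceStore:
    """In-memory stand-in for the personal-state store (hypothetical)."""

    def __init__(self) -> None:
        self._answers: Dict[str, str] = {}

    def resolve(self, key: str, ask: Callable[[str], str]) -> str:
        # Only fall back to a clarifying question when nothing is stored yet.
        if key not in self._answers:
            self._answers[key] = ask(key)
        return self._answers[key]


def build_system_prompt(
    store: PreferenceStore,
    use_case: str,
    stated_language: Optional[str] = None,
    ask: Callable[[str], str] = input,
) -> str:
    # An explicitly stated language always wins over stored preferences.
    language = stated_language or store.resolve(f"language/{use_case}", ask)
    return f"Use {language} for this {use_case} task."
```

The first call for an unknown use case triggers `ask`; every later call for the same use case reads the stored answer.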
Intelligent Retrieval
- Semantic search across everything ("brainrot IDE YC funded" → finds article from a month ago regardless of source)
- Top-k embedding matches without manual tagging/folders
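For reference, top-k matching is just "score everything by similarity, keep the best k". A minimal pure-Python sketch with toy 3-dimensional vectors; real embeddings would come from a model, and the item IDs below are invented.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(
    query: Sequence[float], items: Dict[str, Sequence[float]], k: int = 3
) -> List[Tuple[str, float]]:
    # Score every item, then keep the k most similar ones.
    scored = [(item_id, cosine(query, vec)) for item_id, vec in items.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]


# Items from different sources share one search space, no tags or folders.
items = {
    "article:brainrot-ide": [0.9, 0.1, 0.0],
    "discord:pinned-42": [0.1, 0.9, 0.0],
    "commit:auth-v2": [0.0, 0.2, 0.9],
}
hits = top_k([1.0, 0.0, 0.0], items, k=2)
```

A linear scan like this is fine for small collections; a larger one would want an approximate nearest-neighbour index.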
Proactive Assistance
- Productivity dashboard: screen time, task completion, habit analysis with AI suggestions
- Delegate planning: meal prep, shopping lists—compliant with stored preferences
- These should be integrations built on top of the retrieval interface, which has to serve them efficiently
Dual-Mode Retrieval
- Agent Navigation (Smart Path): LLM explores context tree via tool calls, reasons about where to search, executes queries across storage layers
- Workflow Execution (Fast Path): Agent-discovered paths compile into executable workflows—direct database/API calls with sub-100ms response for repeated queries
- System learns: first query is slow (agent explores), subsequent identical queries are instant (execute cached workflow)
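The slow-first, fast-after behaviour could look roughly like the dispatcher below. This is a sketch under assumptions: `Retriever`, `agent_explore`, and the string-normalising cache key are all hypothetical, and a real agent would return an actual compiled workflow, not a lambda.

```python
from typing import Callable, Dict, List

Workflow = Callable[[], dict]


class Retriever:
    def __init__(self, agent_explore: Callable[[str], Workflow]) -> None:
        self._agent_explore = agent_explore        # smart path: LLM + tool calls
        self._workflows: Dict[str, Workflow] = {}  # fast path: compiled workflows

    @staticmethod
    def _key(query: str) -> str:
        # Naive normalisation; a real system would match patterns, not strings.
        return " ".join(query.lower().split())

    def query(self, text: str) -> dict:
        key = self._key(text)
        if key not in self._workflows:
            # First time: let the agent explore and keep what it compiled.
            self._workflows[key] = self._agent_explore(text)
        return self._workflows[key]()


explorations: List[str] = []


def fake_agent(query: str) -> Workflow:
    explorations.append(query)  # pretend this is the expensive part
    return lambda: {"source": "core/screen_time", "hours": 5}


retriever = Retriever(fake_agent)
first = retriever.query("Show my productivity dashboard")
second = retriever.query("show my  productivity dashboard")
```

Both calls return the same result, but the agent only runs once.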
Design Principles
Multi-Context Awareness
- No single embedding store—different use cases need isolated context to avoid pollution
- System determines which context pools are relevant per query
Zero Client Complexity
- Clients throw data, system handles organization/navigation/fetching
- Abstracts away storage topology from integrations
Hierarchical Embedding Architecture
- Fuzzy tree: embeddings as keys pointing to sub-databases
- Traverse with max depth (shallow levels use embeddings, deep levels use traditional indices for speed)
- Avoids embedding overhead when document sets are small or queries are sequential
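A toy version of the fuzzy tree, to make the idea concrete: shallow levels pick a child by embedding similarity, and the node you land on uses a plain dict as its "traditional index". The `Node` shape, the 2-dimensional keys, and the index contents are all invented for the example.

```python
import math
from dataclasses import dataclass, field
from typing import Dict, List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


@dataclass
class Node:
    name: str
    key: Sequence[float]                                 # embedding used as fuzzy key
    children: List["Node"] = field(default_factory=list)
    index: Dict[str, str] = field(default_factory=dict)  # exact-match lookup


def descend(root: Node, query_vec: Sequence[float], max_depth: int = 2) -> Node:
    node = root
    for _ in range(max_depth):
        if not node.children:
            break
        # Follow the child whose key embedding is closest to the query.
        node = max(node.children, key=lambda child: cosine(query_vec, child.key))
    return node


root = Node("root", [0.0, 0.0], children=[
    Node("code_repos", [1.0, 0.0], index={"scraper": "repo:web-scraper"}),
    Node("core", [0.0, 1.0], index={"screen_time": "table:screen_time"}),
])
leaf = descend(root, [0.9, 0.1])
hit = leaf.index.get("scraper")  # exact lookup, no embedding needed here
```

Once the traversal stops, lookups are ordinary key access, which is where the speed for small or sequential document sets would come from.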
Workflow Learning
- Agent navigation paths become reusable workflows
- Common patterns compile into parametric templates (e.g., "articles on {topic}" → direct vector search)
- Workflows update when schemas change or new data sources are added
Example Usage Flows
Fast Path: Productivity Dashboard (Repeated Query)
First time:
- User: "Show my productivity dashboard"
- Agent navigates tree: identifies core/screen_time, core/schedule, indices/git_commits
- Executes SQL queries + GitHub API calls, combines results
- System stores workflow as executable script
Subsequent times:
- User: "Show my productivity dashboard"
- Pattern match to stored workflow
- Execute script directly: ~50ms response
- No agent invoked
Of course, it's dumb to ask an LLM every time. People who like interacting with AI make zero sense to me. This would be a web dashboard instead, one that polls the workflow (by ID rather than natural language) and shows the result visually with pretty graphs and stuff. This means the returned format has to be structured and standardized.
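One possible shape for that structured return, sketched as a dataclass serialised to JSON. The `WorkflowResult` name, its fields, and the sample values are assumptions, not a settled schema.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Any, Dict, List


@dataclass
class WorkflowResult:
    workflow_id: str                                   # what the dashboard polls by
    data: Dict[str, Any]                               # payload for the graphs
    sources: List[str] = field(default_factory=list)   # context-tree paths used
    schema_version: int = 1                            # lets the format evolve

    def to_json(self) -> str:
        return json.dumps(asdict(self), sort_keys=True)


# Made-up values, just to show the envelope.
result = WorkflowResult(
    workflow_id="productivity_dashboard",
    data={"screen_time_hours": 5.5, "commits": 12},
    sources=["core/screen_time", "indices/git_commits"],
)
payload = result.to_json()
```

Every workflow returning the same envelope means the dashboard only needs one parser.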
Smart Path: Code Context Discovery (New Query)
- User: "Write a web scraper similar to my past projects"
- Agent gets context tree, navigates to code_repos branch
- Vector search on repo keywords: finds projects tagged "web scraping"
- Fetches repo metadata: sees Python + requests/beautifulsoup pattern
- Cross-references preferences: confirms Python is preferred language
- Optionally fetches file summaries for implementation details
- Generates system prompt with: language preference + framework patterns + example repo structure
Smart Path: Historical Search (Ambiguous Intent)
- User: "What did I do with that API refactor Tuesday?"
- Agent unsure if looking for: commit message, document about refactor, or schedule entry
- Navigates multiple branches in parallel:
  - documents/commits: searches "API refactor" in commit messages for all repos
  - core/schedule: checks Tuesday's calendar for related events
  - communication/messages: searches Discord/Slack for discussions
- Finds commit: "refactor: migrate auth API to v2" on Tuesday 3PM
- Retrieves commit details + links to affected files in repo
- Returns: commit message, changed files, related calendar entry
The return here would be natural language, since this is a natural language query. Would still have structured output for sources and data used.
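The parallel branch search could be sketched with `asyncio.gather`. The branch names come from the flow above; `search_branch` and its hard-coded data are placeholders for real git, calendar, and chat lookups.

```python
import asyncio
from typing import Dict, List


async def search_branch(branch: str, query: str) -> List[str]:
    # Placeholder data: a real branch would hit git, a calendar API, or a chat export.
    fake_data = {
        "documents/commits": ["refactor: migrate auth API to v2"],
        "core/schedule": [],
        "communication/messages": [],
    }
    await asyncio.sleep(0)  # yield control, as real I/O would
    words = query.lower().split()
    # A hit must contain every word of the query (crude keyword match).
    return [
        hit for hit in fake_data.get(branch, [])
        if all(word in hit.lower() for word in words)
    ]


async def search_all(branches: List[str], query: str) -> Dict[str, List[str]]:
    results = await asyncio.gather(*(search_branch(b, query) for b in branches))
    # Keep only branches that returned something.
    return {b: hits for b, hits in zip(branches, results) if hits}


found = asyncio.run(search_all(
    ["documents/commits", "core/schedule", "communication/messages"],
    "API refactor",
))
```

The keys of `found` double as the structured "sources" part of the answer.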
Fast Path Evolution: Parametric Template
After 3-4 queries like "articles on RAG", "articles on agents", "articles on embeddings":
- System identifies pattern: searching articles by topic
- Compiles template workflow:
    def articles_on_topic(topic: str):
        # embed() and vector_search() stand for the system's embedding and vector-store calls
        embedding = embed(topic)
        results = vector_search("reading_context", embedding, top_k=10)
        return results
- Future queries: "articles on X" → instant execution with topic parameter
- No tree navigation needed
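How "articles on X" might get routed to a template without waking the agent: a regex with a named group per parameter. This is only a sketch; the `TEMPLATES` registry and `try_fast_path` are hypothetical, and `articles_on_topic` here is a stub that returns a marker dict in place of the real embed + vector search.

```python
import re
from typing import Callable, List, Optional, Tuple


def articles_on_topic(topic: str) -> dict:
    # Stub standing in for the real embed + vector-search workflow.
    return {"workflow": "articles_on_topic", "topic": topic}


# Each template pairs a query pattern with the workflow it triggers.
TEMPLATES: List[Tuple[re.Pattern, Callable[..., dict]]] = [
    (re.compile(r"^articles on (?P<topic>.+)$", re.IGNORECASE), articles_on_topic),
]


def try_fast_path(query: str) -> Optional[dict]:
    for pattern, workflow in TEMPLATES:
        match = pattern.match(query.strip())
        if match:
            # Named groups become the workflow's keyword arguments.
            return workflow(**match.groupdict())
    return None  # no template matched: fall back to agent navigation


fast = try_fast_path("articles on RAG")
slow = try_fast_path("What did I do with that API refactor Tuesday?")
```

Here `slow` is `None`, which is the signal to hand the query to the smart path.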
Smart Path: Cross-Domain Query
- User: "Compare my Python vs Rust project activity this month"
- Agent identifies need for multiple context pools
- Navigates to code_repos → filters by language
- Navigates to indices/git_commits → filters by date range
- Executes:
  - Count commits per language
  - Analyze LOC changes
  - Check screen time in relevant IDEs from core/screen_time
- Synthesizes comparison report
- Stores workflow: "compare_language_activity(lang1, lang2, timeframe)"
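The stored workflow might boil down to something like this once compiled. It only covers the commit-count step, and it departs from the signature above: commits are passed in explicitly and the timeframe is split into `start`/`end` so the sketch runs on its own. The `Commit` record and the sample dates are invented.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import date
from typing import Dict, Iterable


@dataclass
class Commit:
    language: str  # assumed to be resolved from the repo's metadata
    day: date


def compare_language_activity(
    commits: Iterable[Commit], lang1: str, lang2: str, start: date, end: date
) -> Dict[str, int]:
    # Count commits per language inside the (inclusive) date range.
    counts = Counter(c.language for c in commits if start <= c.day <= end)
    return {lang1: counts[lang1], lang2: counts[lang2]}


commits = [
    Commit("Python", date(2024, 5, 2)),
    Commit("Rust", date(2024, 5, 3)),
    Commit("Python", date(2024, 5, 20)),
    Commit("Python", date(2024, 4, 28)),  # outside the range below
]
report = compare_language_activity(
    commits, "Python", "Rust", date(2024, 5, 1), date(2024, 5, 31)
)
```

LOC changes and IDE screen time would be extra keys in the same report.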
I've just bought konteksto.dev... The number of domains in my project graveyard grows...