Me brain think how more productive context

This is not a blog post. Just a random note about something I want to build in my free time.

Core Problem

I capture information everywhere: articles I read, code I write, conversations I have, tasks I track. But when I need that context later, I either can't find it or have to manually piece it together. LLMs could help, but they don't know about my preferences, past work, or the connections between my scattered data. I want a system where I throw everything in once, and the right context automatically surfaces when any tool or agent needs it, without me having to explain myself repeatedly.

Storage Layer

Personal State

Schedule, active projects, screen time metrics
Preferences: languages (spoken/code), tooling choices, writing style
Auto-discover preferences via integrations (GitHub repo analysis for framework patterns, language frequency)

Documents & Artifacts

Articles read, research outputs (ChatGPT/Gemini), pinned Discord messages
Lightweight references to external stores (e.g. commit message -> pointer to file@revision@git-repo, project summary -> repo link)
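
Rough sketch of what a lightweight reference could look like: a small record that stores only the pointer, never the content. It assumes the file@revision@git-repo form from the line above. The `ArtifactRef` name, its fields, and the example repo address are made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ArtifactRef:
    """Pointer to content that lives in an external store (hypothetical shape)."""
    path: str      # file inside the repo; assumed not to contain "@"
    revision: str  # commit hash or tag
    repo: str      # repo URL or local name

    def encode(self) -> str:
        # Serialize as file@revision@git-repo
        return f"{self.path}@{self.revision}@{self.repo}"

    @classmethod
    def decode(cls, text: str) -> "ArtifactRef":
        # Split on the first two "@" only, so the repo part may itself
        # contain "@" (as in git@host:user/repo.git).
        path, revision, repo = text.split("@", 2)
        return cls(path, revision, repo)


ref = ArtifactRef("src/auth.py", "a1b2c3d", "git@example.com:me/api.git")
restored = ArtifactRef.decode(ref.encode())
```

The store then only holds short strings like this, and the actual file is fetched from the repo when something needs it.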

Usage Layer

Context-Aware Prompting

Generate dynamic system prompts per session (e.g., OpenCode picks my preferred language without asking)
Answers to clarifying questions are stored permanently, so the same question is never asked twice
Example: "No language stated → use favorite for use case. Unknown use case → ask once, remember forever"
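
The ask-once rule could be sketched roughly like this. I'm reading "unknown use case" as "no stored favorite for that use case yet", which is an interpretation. `PreferenceStore`, `resolve_language`, and the `ask` hook are hypothetical names, and a real store would persist answers instead of keeping them in memory.

```python
from typing import Callable, Dict, Optional


class PreferenceStore:
    """Remembers answers to clarifying questions so each is asked only once."""

    def __init__(self, ask: Callable[[str], str]):
        self._ask = ask                     # how to reach the user (hypothetical hook)
        self._answers: Dict[str, str] = {}  # in-memory here; persisted in practice

    def get(self, key: str, question: str) -> str:
        if key not in self._answers:
            self._answers[key] = self._ask(question)
        return self._answers[key]


def resolve_language(store: PreferenceStore, stated: Optional[str], use_case: str) -> str:
    # A language stated in the request always wins.
    if stated:
        return stated
    # Otherwise use the stored favorite for this use case, asking only if missing.
    return store.get(f"language:{use_case}", f"Which language do you prefer for {use_case}?")


asked = []

def fake_ask(question: str) -> str:
    asked.append(question)
    return "Python"

store = PreferenceStore(fake_ask)
first = resolve_language(store, None, "web scraping")
second = resolve_language(store, None, "web scraping")  # answered from the store
```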

Intelligent Retrieval

Semantic search across everything ("brainrot IDE YC funded" → finds article from a month ago regardless of source)
Top-k embedding matches without manual tagging/folders
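
For reference, top-k matching boils down to scoring every stored vector against the query and keeping the best k. A minimal sketch with hand-written 3-dimensional vectors standing in for real embeddings; the document IDs and numbers are placeholders, and a real setup would use an embedding model plus a vector index instead of a linear scan.

```python
import math
from typing import Dict, List, Tuple


def cosine(a: List[float], b: List[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query: List[float], index: Dict[str, List[float]], k: int = 3) -> List[Tuple[str, float]]:
    # Score every document, then keep the k most similar.
    scored = [(doc_id, cosine(query, vec)) for doc_id, vec in index.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy index mixing sources; no tags or folders involved.
index = {
    "article:brainrot-ide": [0.9, 0.1, 0.0],
    "commit:auth-refactor": [0.0, 0.2, 0.9],
    "discord:pinned-42": [0.7, 0.6, 0.1],
}
matches = top_k([1.0, 0.2, 0.0], index, k=2)
```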

Proactive Assistance

Productivity dashboard: screen time, task completion, habit analysis with AI suggestions
Delegate planning: meal prep, shopping lists—compliant with stored preferences
These should be integrations, made possible by an efficient retrieval interface rather than built into the core

Dual-Mode Retrieval

Agent Navigation (Smart Path): LLM explores context tree via tool calls, reasons about where to search, executes queries across storage layers
Workflow Execution (Fast Path): Agent-discovered paths compile into executable workflows—direct database/API calls with sub-100ms response for repeated queries
System learns: first query is slow (agent explores), subsequent identical queries are instant (execute cached workflow)
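
The slow-first, fast-after behavior could be sketched as a dispatcher that only calls the agent on a cache miss. Everything here is an assumption: the `Retriever` class, the `explore` hook standing in for agent navigation, and the crude normalization used as the cache key (real matching would need to be smarter than exact text).

```python
from typing import Callable, Dict

Workflow = Callable[[], dict]


class Retriever:
    def __init__(self, explore: Callable[[str], Workflow]):
        self._explore = explore                    # smart path: agent explores, returns a workflow
        self._workflows: Dict[str, Workflow] = {}  # fast path: compiled workflows by query key
        self.agent_calls = 0

    def query(self, text: str) -> dict:
        key = " ".join(text.lower().split())       # lowercase and collapse whitespace
        workflow = self._workflows.get(key)
        if workflow is None:
            # First time: slow exploration, then remember the result.
            self.agent_calls += 1
            workflow = self._explore(text)
            self._workflows[key] = workflow
        return workflow()


def fake_explore(text: str) -> Workflow:
    # Pretend the agent walked the tree and compiled this workflow.
    return lambda: {"panels": ["screen_time", "schedule", "git_commits"]}


retriever = Retriever(fake_explore)
first = retriever.query("Show my productivity dashboard")
second = retriever.query("show my  productivity dashboard")  # served from the stored workflow
```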

Design Principles

Multi-Context Awareness

No single embedding store—different use cases need isolated context to avoid pollution
System determines which context pools are relevant per query

Zero Client Complexity

Clients throw data, system handles organization/navigation/fetching
Abstracts away storage topology from integrations

Hierarchical Embedding Architecture

Fuzzy tree: embeddings as keys pointing to sub-databases
Traverse with max depth (shallow levels use embeddings, deep levels use traditional indices for speed)
Avoids embedding overhead when document sets are small or queries are sequential
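
A toy version of the fuzzy tree, to make the traversal concrete. Shallow levels pick the child whose embedding key is most similar to the query; once max depth (or a leaf) is reached, lookup switches to a plain exact-match index. The `Node` shape, the 2-dimensional vectors, and the use of a raw dot product as similarity are all simplifications I'm assuming here.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


def dot(a: List[float], b: List[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


@dataclass
class Node:
    name: str
    key: List[float]                                           # embedding describing this branch
    children: List["Node"] = field(default_factory=list)
    index: Dict[str, List[str]] = field(default_factory=dict)  # exact term -> document ids


def lookup(root: Node, query_vec: List[float], term: str, max_depth: int = 2) -> Tuple[List[str], List[str]]:
    node, path = root, [root.name]
    for _ in range(max_depth):
        if not node.children:
            break
        # Embedding step: descend into the most similar branch.
        node = max(node.children, key=lambda child: dot(query_vec, child.key))
        path.append(node.name)
    # Traditional step: exact lookup in the sub-database we landed in.
    return path, node.index.get(term, [])


articles = Node("articles", [0.2, 0.9], index={"rag": ["article-17"]})
commits = Node("commits", [0.8, 0.3], index={"refactor": ["commit-a1b2"]})
root = Node("root", [0.0, 0.0], children=[
    Node("code_repos", [1.0, 0.0]),
    Node("documents", [0.0, 1.0], children=[articles, commits]),
])
path, hits = lookup(root, [0.1, 1.0], "rag")
```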

Workflow Learning

Agent navigation paths become reusable workflows
Common patterns compile into parametric templates (e.g., "articles on {topic}" → direct vector search)
Workflows update when schemas change or new data sources are added
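
One possible way to make templates like "articles on {topic}" executable is to turn each one into a regex with named groups and dispatch on the first match. This is only a sketch: `TemplateRegistry` is a hypothetical name, and the lambda stands in for a real workflow such as a direct vector search.

```python
import re
from typing import Callable, List, Optional, Tuple


class TemplateRegistry:
    def __init__(self) -> None:
        self._templates: List[Tuple[object, Callable[..., object]]] = []

    def register(self, template: str, workflow: Callable[..., object]) -> None:
        # Splitting on {name} leaves literal text at even positions
        # and parameter names at odd positions.
        parts = re.split(r"\{(\w+)\}", template)
        regex = "".join(
            f"(?P<{part}>.+)" if i % 2 else re.escape(part)
            for i, part in enumerate(parts)
        )
        self._templates.append((re.compile(f"^{regex}$", re.IGNORECASE), workflow))

    def dispatch(self, query: str) -> Optional[object]:
        for pattern, workflow in self._templates:
            match = pattern.match(query.strip())
            if match:
                return workflow(**match.groupdict())
        return None  # no template matched: fall back to agent navigation


registry = TemplateRegistry()
registry.register("articles on {topic}", lambda topic: f"vector_search:{topic}")
hit = registry.dispatch("Articles on RAG")
miss = registry.dispatch("What did I do Tuesday?")
```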

Example Usage Flows

Fast Path: Productivity Dashboard (Repeated Query)

First time:

User: "Show my productivity dashboard"
Agent navigates tree: identifies core/screen_time, core/schedule, indices/git_commits
Executes SQL queries + GitHub API calls, combines results
System stores workflow as executable script

Subsequent times:

User: "Show my productivity dashboard"
Pattern match to stored workflow
Execute script directly: ~50ms response
No agent invoked

Of course, it's dumb to ask an LLM every time. People who like interacting with AI for this kind of thing make zero sense to me. This would be a web dashboard instead, one that polls the workflow (by ID rather than natural language) and shows the result visually with pretty graphs and stuff. That means the returned format has to be structured and standardized.
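
To make "structured and standardized" a bit more concrete, here is one possible result shape a dashboard could poll by workflow ID. The field names (`series`, `unit`, `sources`), the `WORKFLOWS` table, and the numbers are all invented for illustration.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Callable, Dict, List


@dataclass
class Series:
    label: str
    unit: str
    points: List[float]


@dataclass
class WorkflowResult:
    workflow_id: str
    series: List[Series] = field(default_factory=list)
    sources: List[str] = field(default_factory=list)  # which stores the data came from

    def to_json(self) -> str:
        # asdict() also converts the nested Series objects.
        return json.dumps(asdict(self))


def productivity_dashboard() -> WorkflowResult:
    # Placeholder numbers; a stored workflow would run its SQL/API calls here.
    return WorkflowResult(
        workflow_id="productivity_dashboard",
        series=[Series("screen time", "hours", [6.5, 7.0, 5.25])],
        sources=["core/screen_time"],
    )


WORKFLOWS: Dict[str, Callable[[], WorkflowResult]] = {
    "productivity_dashboard": productivity_dashboard,
}


def run_workflow(workflow_id: str) -> str:
    # The dashboard calls this by ID; no natural language involved.
    return WORKFLOWS[workflow_id]().to_json()


payload = json.loads(run_workflow("productivity_dashboard"))
```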

Smart Path: Code Context Discovery (New Query)

User: "Write a web scraper similar to my past projects"
Agent gets context tree, navigates to code_repos branch
Vector search on repo keywords: finds projects tagged "web scraping"
Fetches repo metadata: sees Python + requests/beautifulsoup pattern
Cross-references preferences: confirms Python is preferred language
Optionally fetches file summaries for implementation details
Generates system prompt with: language preference + framework patterns + example repo structure

Smart Path: Historical Search (Ambiguous Intent)

User: "What did I do with that API refactor Tuesday?"
Agent unsure if looking for: commit message, document about refactor, or schedule entry
Navigates multiple branches in parallel:
- documents/commits: searches "API refactor" in commit messages for all repos
- core/schedule: checks Tuesday's calendar for related events
- communication/messages: searches Discord/Slack for discussions
Finds commit: "refactor: migrate auth API to v2" on Tuesday 3PM
Retrieves commit details + links to affected files in repo
Returns: commit message, changed files, related calendar entry

The return here would be natural language, since this is a natural language query. Would still have structured output for sources and data used.
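
The parallel branch search could look something like this, with a thread pool fanning out to each branch and the result keyed by branch name so the sources stay attached. The in-memory lists are stand-ins for git, calendar, and chat lookups, and their contents (apart from the commit message from the example) are placeholders.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List

COMMITS = [{"text": "refactor: migrate auth API to v2", "day": "Tuesday"}]
SCHEDULE = [{"text": "API refactor pairing", "day": "Tuesday"}]
MESSAGES = [{"text": "lunch?", "day": "Tuesday"}]


def search(records: List[dict], terms: List[str], day: str) -> List[dict]:
    # Keep records from the given day that mention every search term.
    return [
        r for r in records
        if r["day"] == day and all(t in r["text"].lower() for t in terms)
    ]


def search_all(terms: List[str], day: str) -> Dict[str, List[dict]]:
    branches = {
        "documents/commits": COMMITS,
        "core/schedule": SCHEDULE,
        "communication/messages": MESSAGES,
    }
    with ThreadPoolExecutor() as pool:
        futures = {
            name: pool.submit(search, records, terms, day)
            for name, records in branches.items()
        }
        hits = {name: future.result() for name, future in futures.items()}
    # Drop branches that found nothing.
    return {name: found for name, found in hits.items() if found}


results = search_all(["api", "refactor"], "Tuesday")
```

The natural language answer would then be generated from `results`, while the dict itself is the structured record of which branches and entries were used.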

Fast Path Evolution: Parametric Template

After 3-4 queries like "articles on RAG", "articles on agents", "articles on embeddings":

System identifies pattern: searching articles by topic
Compiles template workflow:

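def articles_on_topic(topic: str, top_k: int = 10):
    # embed() and vector_search() are placeholders for the embedding model and vector store clients
    embedding = embed(topic)
    # "reading_context" is the context pool searched for articles
    return vector_search("reading_context", embedding, top_k=top_k)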

Future queries: "articles on X" → instant execution with topic parameter
No tree navigation needed
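
How the system spots the pattern is left open above. One deliberately crude heuristic, purely as an assumption: group past queries by their first two words and propose a template once enough different tails have been seen. `propose_template` and the fixed two-word prefix are illustrative only.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Set


def propose_template(history: List[str], min_hits: int = 3) -> Optional[str]:
    """Return e.g. 'articles on {topic}' once enough queries share a two-word prefix."""
    groups: Dict[str, Set[str]] = defaultdict(set)
    for query in history:
        words = query.lower().split()
        if len(words) > 2:
            # prefix -> set of distinct tails seen after it
            groups[" ".join(words[:2])].add(" ".join(words[2:]))
    for prefix, tails in groups.items():
        if len(tails) >= min_hits:
            return prefix + " {topic}"
    return None


history = ["articles on RAG", "articles on agents", "articles on embeddings"]
template = propose_template(history)
too_early = propose_template(history[:2])
```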

Smart Path: Cross-Domain Query

User: "Compare my Python vs Rust project activity this month"
Agent identifies need for multiple context pools
Navigates to code_repos → filters by language
Navigates to indices/git_commits → filters by date range
Executes:
- Count commits per language
- Analyze LOC changes
- Check screen time in relevant IDEs from core/screen_time
Synthesizes comparison report
Stores workflow: "compare_language_activity(lang1, lang2, timeframe)"
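
A stripped-down version of the stored workflow, covering only the commit-count step (LOC changes and screen time would be extra lookups). I split `timeframe` into explicit `start`/`end` dates for the sketch, and the commit records are placeholders for what indices/git_commits would return.

```python
from collections import Counter
from datetime import date
from typing import Dict, List


def compare_language_activity(
    commits: List[dict], lang1: str, lang2: str, start: date, end: date
) -> Dict[str, int]:
    """Count commits per language inside [start, end]."""
    counts = Counter(
        c["language"] for c in commits
        if start <= c["date"] <= end and c["language"] in (lang1, lang2)
    )
    # Counter returns 0 for a language with no commits in range.
    return {lang1: counts[lang1], lang2: counts[lang2]}


commits = [
    {"language": "Python", "date": date(2025, 1, 3)},
    {"language": "Python", "date": date(2025, 1, 20)},
    {"language": "Rust", "date": date(2025, 1, 9)},
    {"language": "Rust", "date": date(2024, 12, 30)},  # outside the range
]
report = compare_language_activity(
    commits, "Python", "Rust", date(2025, 1, 1), date(2025, 1, 31)
)
```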

I've just bought konteksto.dev... The number of domains in my project graveyard grows...