The object of my research is not really AI. It is the digital environment.
At the longest horizon, I am working on UX for the post-smartphone era: the lived experience of computing once the phone-and-chatbot stack stops looking like the final form.
Most current AI products make the model the center of the story. The human becomes a prompt source, the interface becomes a chat box, and the rest of the digital environment stays mostly unchanged. I want to invert that. The environment is the subject. The human is the one piloting it. AI is one material among several.
That inversion changes the design question. If AI is the object, the question is usually: how do we make the model better? If the environment is the object, the question becomes: does this fit how I actually want to think, work, remember, read, write, and live?
The same materials can erode agency or amplify it.
The interesting question is not whether AI is good or bad. The same components can be arranged two ways. The difference is architectural: what gets remembered, what stays inspectable, who decides what counts as known, and whether the system helps the human return to the thread or quietly takes the thread away.
Arranged one way, they erode agency:
- Remembers instead of you: You lose the thread.
- Answers without sources: You stop checking.
- Actions behind glass: You stop knowing what was done.
- Knowledge in vendor silos: You cannot take it with you.
- Optimized for engagement: Your attention drifts.
Arranged the other way, they amplify it:
- Helps you remember: You keep the thread without depending on a device to hold it for you.
- Answers carry provenance: You can audit them.
- Actions leave traces: You can review what happened.
- Knowledge lives in owned files: It moves with you.
- Resurfaces what mattered: Your attention compounds.
Amplification requires choices that rule things out. No durable memory without provenance. No important action without a trace. No silent overwriting when supersession would keep the history legible. No ontology so rigid that messy life cannot enter the system.
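A minimal sketch of two of these rules, provenance on every durable fact and supersession instead of overwriting. It assumes an in-memory, append-only log; `Fact`, `FactLog`, and every field name are hypothetical illustrations, not the names used in the actual system.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class Fact:
    """One remembered claim. All names here are hypothetical."""
    fact_id: str
    claim: str
    source: str                       # provenance: where the claim came from
    recorded_at: str
    supersedes: Optional[str] = None  # id of the fact this one replaces


class FactLog:
    """Append-only log: corrections add entries, nothing is overwritten."""

    def __init__(self) -> None:
        self._facts: list[Fact] = []

    def record(self, claim: str, source: str,
               supersedes: Optional[str] = None) -> Fact:
        # Rule: no durable memory without provenance.
        if not source.strip():
            raise ValueError("a fact needs a source")
        # A correction must point at a fact that actually exists.
        if supersedes is not None and all(
            f.fact_id != supersedes for f in self._facts
        ):
            raise ValueError(f"unknown fact to supersede: {supersedes}")
        fact = Fact(
            fact_id=f"fact-{len(self._facts) + 1}",
            claim=claim,
            source=source,
            recorded_at=datetime.now(timezone.utc).isoformat(),
            supersedes=supersedes,
        )
        self._facts.append(fact)
        return fact

    def current(self) -> list[Fact]:
        """Facts that no later fact has superseded."""
        replaced = {f.supersedes for f in self._facts if f.supersedes}
        return [f for f in self._facts if f.fact_id not in replaced]

    def history(self, fact_id: str) -> list[Fact]:
        """Walk the supersession chain from a fact back to its origin."""
        by_id = {f.fact_id: f for f in self._facts}
        chain = []
        cursor = by_id.get(fact_id)
        while cursor is not None:
            chain.append(cursor)
            cursor = by_id.get(cursor.supersedes) if cursor.supersedes else None
        return chain
```

The design choice is that a correction is a new entry pointing at the old one. `current()` shows what is believed now, while `history()` keeps the earlier belief and its source legible.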
The substrate is the product before there is a product.
The important thing I have been building is not a chatbot. It is a filesystem-native knowledge substrate: a way for life and work signals to become durable, queryable, inspectable, and revisable without disappearing into a black box.
The substrate matters because surfaces come and go. A web page, a terminal command, a voice interaction, a paper export, an e-ink reading loop: these should be replaceable projections. The durable value is underneath them.
Capture
Raw signals enter from voice, notes, messages, email, screenshots, calendar, and work artifacts.
Durability
Important material lands as readable files, dated partitions, logs, or markdown with metadata.
Identity
Objects, events, agents, and sources get stable names so the system can point back to them.
Observation
SQL views, search indexes, evals, and lints make the substrate visible from several angles.
Projection
Interfaces render the same substrate for different jobs: operating, reading, reviewing, deciding.
Repair
Corrections, supersession, and review loops keep knowledge alive instead of pretending it was perfect.
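The first three layers above can be sketched in a few lines: a captured signal lands as a readable markdown file with a metadata header, inside a dated partition, under a stable name. The directory layout, the header fields, and the function names `capture_note` and `read_metadata` are assumptions made for illustration; the real substrate may differ on all of them.

```python
import hashlib
from datetime import datetime
from pathlib import Path


def capture_note(root: Path, text: str, source: str, when: datetime) -> Path:
    """Write one captured signal as a markdown file with a metadata header.

    Assumed layout (hypothetical): <root>/<YYYY>/<MM>/<DD>/<id>.md
    """
    # Identity: a stable id derived from source and content, so the same
    # signal always maps to the same name.
    digest = hashlib.sha256(f"{source}\n{text}".encode("utf-8")).hexdigest()[:12]
    note_id = f"note-{digest}"

    # Durability: a dated partition of plain, readable files.
    folder = root / f"{when:%Y}" / f"{when:%m}" / f"{when:%d}"
    folder.mkdir(parents=True, exist_ok=True)
    path = folder / f"{note_id}.md"

    header = "\n".join([
        "---",
        f"id: {note_id}",
        f"source: {source}",
        f"captured_at: {when.isoformat()}",
        "---",
    ])
    path.write_text(f"{header}\n\n{text}\n", encoding="utf-8")
    return path


def read_metadata(path: Path) -> dict[str, str]:
    """Observation: read the header back with nothing but the file itself."""
    lines = path.read_text(encoding="utf-8").splitlines()
    meta: dict[str, str] = {}
    if lines and lines[0] == "---":
        for line in lines[1:]:
            if line == "---":
                break
            key, _, value = line.partition(": ")
            meta[key] = value
    return meta
```

Because the file is the record, any projection (a search index, a SQL view, a paper export) can be rebuilt by walking the partition and reading headers.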
Files are the substrate, not the UI.
Readable files are where durable truth begins. Dashboards and agents are projections over the files, not replacements for them.
Traceability is a design goal, not a completed fact.
Many operations already leave event, lineage, hook, or message traces, but coverage is still uneven. The system should name those gaps rather than hide them.
Lint, do not lock.
Messy inputs are accepted, then made visible through lints, evals, and observation lenses. Non-compliance becomes data.
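A minimal sketch of the principle, assuming records are plain dictionaries and lint rules are small functions. The two rules and all names (`LintFinding`, `ingest`, `RULES`) are hypothetical; the point is only that ingestion never rejects input and that deviations come back as data.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class LintFinding:
    record_id: str
    rule: str
    message: str


# Hypothetical rules: each returns a message when a record deviates, else None.
def missing_source(record: dict) -> Optional[str]:
    return None if record.get("source") else "no source recorded"


def missing_date(record: dict) -> Optional[str]:
    return None if record.get("date") else "no date recorded"


RULES = {"missing-source": missing_source, "missing-date": missing_date}


def ingest(records: list[dict]) -> tuple[list[dict], list[LintFinding]]:
    """Accept every record; report deviations instead of rejecting input."""
    accepted: list[dict] = []
    findings: list[LintFinding] = []
    for index, record in enumerate(records):
        accepted.append(record)  # capture is never blocked
        record_id = str(record.get("id", f"unnamed-{index}"))
        for rule_name, check in RULES.items():
            message = check(record)
            if message:
                findings.append(LintFinding(record_id, rule_name, message))
    return accepted, findings
```

The findings list can then be counted, charted, or reviewed like any other dataset, which is what makes non-compliance observable rather than fatal.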
The loop matters more than the model.
Capture, interpretation, action, resurfacing, and audit are the product. Models can change; the loop must still make sense.
One environment, two tempos.
An environment that can be steered in the moment needs a different architecture from one that supports deep study over weeks. Trying to make everything fast destroys reflection. Trying to make everything thoughtful destroys responsiveness.
The steering loop
Capture, voice, status, routing, simple actions, notifications, current context, and time-sensitive anticipation. This layer has to feel alive.
The thinking loop
Deep study, synthesis, long-form work, reflection, memory consolidation, careful agent work, paper, and e-ink. This layer is allowed to be slow.
The design implication is concrete: voice and capture deserve production-critical latency budgets; research and synthesis can happen as patient background work. The point is not one universal assistant. It is an environment with multiple loops, each honest about its tempo.
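One way to make "honest about its tempo" checkable is to give each job its own budget and compare measured durations against it. This is a sketch only: the job names and every number below are placeholders chosen to show the contrast between the two loops, not measurements or targets from the actual system.

```python
from dataclasses import dataclass
from enum import Enum


class Tempo(Enum):
    STEERING = "steering"  # has to feel alive
    THINKING = "thinking"  # allowed to be slow


@dataclass(frozen=True)
class Budget:
    tempo: Tempo
    max_seconds: float


# Placeholder budgets (hypothetical values, for illustration only).
BUDGETS = {
    "voice_capture": Budget(Tempo.STEERING, 1.0),
    "routing": Budget(Tempo.STEERING, 2.0),
    "synthesis": Budget(Tempo.THINKING, 6 * 3600.0),
    "memory_consolidation": Budget(Tempo.THINKING, 24 * 3600.0),
}


def over_budget(timings: dict[str, float]) -> list[str]:
    """Return the jobs whose measured duration exceeded their own budget."""
    return sorted(
        job
        for job, seconds in timings.items()
        if job in BUDGETS and seconds > BUDGETS[job].max_seconds
    )
```

With these placeholder values, a half-hour synthesis run is fine while a three-second voice capture is a failure, which is the asymmetry the two loops are meant to encode.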
Personal first is the method, not the ceiling.
The current system inherits from Griffe and related experiments, but the thing I am building now is broader than a named assistant. It is a personal knowledge substrate running on my own files, my own work, my own attention, and my own failure modes.
I am the user, the developer, and the test subject. That is not a limitation. It is the only honest way to study whether a digital environment changes what a person remembers, how they decide, and how they recover the thread after interruption.
A focus group can tell you if a feature is understandable. It cannot tell you whether a system erodes or amplifies agency over weeks. For that, you have to live with it.
A working personal substrate, not yet a finished digital environment.
Voice, notes, messages, HTML reports, PDFs, reMarkable packets, and agent-prepared briefs already form a real daily loop. The system can help me capture, synthesize, review, and recover context across time.
The experience is not yet fluid enough. Too much still depends on explicit prompts, scheduled resurfacing, manual review, and switching between surfaces. The system helps me think, but it does not yet consistently feel like an environment I can pilot.
Next, I want more voice-native steering, calmer reading surfaces, better resurfacing, more paper and e-ink loops, and less dependence on screens. The goal is not more automation for its own sake; it is a computing experience that gives me more agency, continuity, and space to think.
Underneath, the substrate already has dated flow partitions, owned files, agent workspaces, inbox/outbox routing, lineage, event streams, search, SQL projections, evals, lints, and generated artifacts. It leaves traces and can rebuild many useful views from source material.
Observability is uneven across model calls, MCP/tool use, agent handoffs, generated claims, and end-to-end latency. The system can produce useful artifacts, but the full path from signal to action to review is not always visible enough.
Next, the substrate needs more complete provenance, better latency views, stronger memory loops, clearer source-to-surface chains, and tighter entropy control as capabilities expand. Agents should become bounded actors inside inspectable feedback loops, not isolated assistants.
The client question is timing, benefit, and complexity.
The useful translation is not "this personal system could become a product." It is: these patterns identify when an organization is ready to make AI operational, where the first gains can be captured, and how much change is really required.
When to open the conversation
When AI pilots create output faster than the organization can absorb it; when knowledge is fragmented across tools; when handoffs depend on a few people; or when the company wants AI progress without becoming locked into one stack.
What maturity is needed
Not a perfect technical stack. The practical minimum is clearer ownership: who owns a workflow, a source of truth, a client context, a decision, and the review of AI-assisted work.
Where benefits start
The first benefits can be individual or team-level: better preparation, fewer lost decisions, faster handoffs, reusable briefs, clearer audit trails, and less dependence on one human or one agent behaving perfectly.
Environment-level UX becomes role-level operating design.
AI is added beside the work: another chatbot, another dashboard, another tab. The team still has to stitch together context, approvals, memory, and action by hand.
Map one role or workflow as an operating environment: what signals matter, what decisions recur, what actions follow, and what must remain inspectable.
Low to medium complexity. Often starts without changing the core stack, by designing a better work surface over existing sources.
Two tempos become the right latency budget for the job.
Everything is treated as one AI workflow: either too slow for action or too shallow for thought. Users stop trusting it because the tempo does not match their work.
Separate fast steering loops from slow thinking loops: capture, routing, status, and next actions on one side; research, synthesis, review, and memory consolidation on the other.
Fast usability gains for operations, sales, executive support, and incident contexts. Complexity rises when the fast loop needs permissions or production integrations.
Capabilities-first architecture becomes a sharper AI roadmap.
The organization has many AI initiatives, but it is unclear what new organizational capability they create or how success should be evaluated.
Reframe the roadmap around capabilities: prepare a client meeting, reconcile sources, monitor drift, explain a decision, produce a weekly brief, or route work across teams.
Medium complexity. The first step can be a roadmap audit before implementation. The benefit is clearer prioritization and less tool-driven AI theater.
Living memory becomes resilience against handoff loss.
Knowledge exists, but it is not reusable by people or agents. Departures, vacations, client transitions, and project changes create avoidable memory loss.
Choose one domain and define the memory loop: source material, owner, freshness rule, retrieval path, canonical summary, and resurfacing moments.
Low to medium complexity. Benefits can begin at individual or team level with existing documents, messages, notes, and meeting records.
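The memory loop described above can be written down as a small record, which is often enough to start the conversation about ownership and freshness. This is a sketch under assumptions: the class name, the field names, and the idea of expressing the freshness rule as a maximum age are illustrative choices, not a prescribed schema.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class MemoryLoop:
    """One domain's memory loop. Field names are hypothetical."""
    domain: str
    sources: tuple[str, ...]        # source material
    owner: str                      # who keeps the summary honest
    max_age: timedelta              # freshness rule, expressed as a maximum age
    retrieval_path: str             # how people or agents find it
    canonical_summary: str          # where the agreed summary lives
    resurface_on: tuple[str, ...]   # moments when it should come back
    last_reviewed: date

    def is_stale(self, today: date) -> bool:
        """True when the summary has gone unreviewed longer than the rule allows."""
        return today - self.last_reviewed > self.max_age


def stale_domains(loops: list[MemoryLoop], today: date) -> list[str]:
    """List the domains whose owner should be asked for a review."""
    return [loop.domain for loop in loops if loop.is_stale(today)]
```

A review cadence can then be driven by `stale_domains` rather than by someone remembering to check.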
Substrate observability becomes operational trust.
AI outputs are useful but hard to verify. Sources disagree, transcripts are imperfect, recommendations lack lineage, and nobody can explain why an agent believed something.
Make the chain visible: source, transformation, retrieval, reasoning, action, outcome. Start where risk is highest or where trust blocks adoption.
Medium to high complexity, depending on risk and access. The benefit is resilience: the system can tolerate human or agent failure because evidence and corrections remain inspectable.
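A minimal sketch of what "make the chain visible" could mean in practice: record what is known at each stage and report the stages that left no trace. The stage names come from the chain above; `TraceStep`, `missing_stages`, and `explain` are hypothetical helpers, and a real system would likely attach ids and timestamps as well.

```python
from dataclasses import dataclass

STAGES = ("source", "transformation", "retrieval", "reasoning", "action", "outcome")


@dataclass(frozen=True)
class TraceStep:
    stage: str   # one of STAGES
    detail: str  # human-readable note on what happened

    def __post_init__(self) -> None:
        if self.stage not in STAGES:
            raise ValueError(f"unknown stage: {self.stage}")


def missing_stages(trace: list[TraceStep]) -> list[str]:
    """Stages of the chain that left no trace, in chain order."""
    seen = {step.stage for step in trace}
    return [stage for stage in STAGES if stage not in seen]


def explain(trace: list[TraceStep]) -> str:
    """Render the chain in order, naming gaps instead of hiding them."""
    by_stage: dict[str, list[str]] = {}
    for step in trace:
        by_stage.setdefault(step.stage, []).append(step.detail)
    lines = []
    for stage in STAGES:
        details = by_stage.get(stage)
        lines.append(f"{stage}: {'; '.join(details) if details else '(no trace)'}")
    return "\n".join(lines)
```

Starting where risk is highest then becomes a matter of asking which outputs have the most `(no trace)` lines.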
Async surfaces become better decision artifacts.
Leaders and teams are overloaded with meetings, dashboards, and chat threads, but still lack a durable view of what changed, what matters, and what needs a decision.
Replace one recurring coordination burden with a high-quality artifact: a briefing, decision memo, narrated report, research packet, or operating review.
Low complexity. This is often the easiest entry point because it creates visible value before deeper stack or governance changes.
Less screen-bound. More voice, more paper, more deep study.
The goal is not to build a better chatbot. The goal is an environment that fits a life I actually want to lead: less default smartphone, more conversation, more paper and e-ink, more deliberate slowness, more long-form memory, and more inspectable automation.
This is also a wager about the next design frontier. The phone-and-chatbot stack will improve, and that will matter. But the more interesting frontier is the environment around it: computing that stops demanding constant visual attention and becomes infrastructure for thought.
I do not know exactly what that environment looks like at maturity. I know enough to build toward it in the small, with real use, real traces, and enough honesty to keep the gaps visible.
The concepts that keep returning.
Part of the work is naming the ideas precisely enough that they can be designed, criticized, and reused. These are the terms that matter most in the current shape of the project.
Computing that starts from an explicit purpose and helps you stop when the purpose is done. The opposite of feed-shaped ambient checking.
The whole lived system around computing: devices, files, agents, screens, voice, attention, memory, notifications, and time.
The feeling and fact of being able to steer the environment: see state, issue intent, inspect results, correct course, and leave safely.
The durable layer underneath surfaces: files, events, facts, identities, views, and traces that make the system rebuildable and inspectable.
A temporary surface over the substrate: a web page, terminal view, paper export, voice response, dashboard, or reading packet.
The system bringing something back at the right time because it still matters, not because a feed wants another interaction.
Accept messy reality first, then make deviations visible through lints, evals, and review loops. Do not block capture just to preserve a perfect schema.
Memory that helps the human keep the thread instead of replacing their judgment: provenance, context, and the ability to audit are part of the memory.
The environment is the thing.
AI and knowledge management are not the destination. They are materials for making the digital environment more pilotable: easier to inspect, easier to repair, easier to leave, and easier to return to without losing the thread.