Architectures for Intelligence You Own
Investigating local-first AI systems, cognitive memory architectures,
graph-based reasoning, and computational sovereignty.
An ongoing investigation by Daniel Kliewer
Dependence Is a Design Decision
Most AI today is built on infrastructure someone else controls. These are the structural implications.
Infrastructure Dependence
Every inference depends on infrastructure you do not control. This is an architectural constraint, not a feature limitation.
Data Boundaries
Querying remote models means transmitting data outside your authority. This shapes what you can build and where.
Recurring Access
Per-token pricing makes long-running or autonomous systems economically fragile. Usage is metered on someone else's terms.
Protocol Coupling
Provider-specific APIs tie your system to a single access path. Changing infrastructure requires rewriting the interface layer.
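One way to loosen that coupling is to have application code depend on a small interface rather than on any provider's client. The sketch below is illustrative only: `TextGenerator`, `EchoBackend`, `UppercaseBackend`, and `summarize` are hypothetical names, and the two backends are stand-ins rather than real model runtimes.

```python
from typing import Protocol


class TextGenerator(Protocol):
    """Hypothetical interface: the only surface application code depends on."""

    def generate(self, prompt: str) -> str: ...


class EchoBackend:
    """Stand-in backend for illustration; a real one would call a model runtime."""

    def generate(self, prompt: str) -> str:
        return f"[echo] {prompt}"


class UppercaseBackend:
    """A second stand-in, to show that backends swap without touching callers."""

    def generate(self, prompt: str) -> str:
        return prompt.upper()


def summarize(backend: TextGenerator, text: str) -> str:
    """Application code: it knows the interface, not the provider."""
    return backend.generate(f"Summarize: {text}")
```

Swapping `EchoBackend()` for `UppercaseBackend()` changes the output but not a line of `summarize`. The same would hold for a backend that wraps a local runtime.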
Architectures for Sovereignty
Rebuilding the stack so every layer — from inference to memory — is owned and understood by its operator.
Local Inference
Models run on your hardware. Ollama and llama.cpp provide the runtime, and quantized models keep memory requirements within reach of consumer machines.
Composable Systems
RAG pipelines, knowledge graphs, and autonomous agents combine into architectures, not point solutions.
Data Authority
Processing stays within your network. This is a structural property of the architecture.
Unmetered Operation
No rate limits or usage caps — there is no external gate. The only constraint is your hardware.
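As a rough illustration of the Local Inference layer, the sketch below sends a prompt to an Ollama server through its local HTTP endpoint using only the Python standard library. It assumes Ollama is already running on its default port (11434) and that the model you name has been pulled. The helper names `build_request` and `generate` are hypothetical, not part of any library.

```python
import json
import urllib.request

# Assumption: Ollama is running locally on its default port.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(model: str, prompt: str) -> urllib.request.Request:
    """Build a non-streaming generate request for a local Ollama server."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(model: str, prompt: str, timeout: float = 120.0) -> str:
    """Send the prompt and return the generated text.

    Nothing leaves the machine: the request goes to localhost only.
    """
    request = build_request(model, prompt)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.loads(response.read().decode("utf-8"))
    return body["response"]
```

With a model available locally (for example after `ollama pull llama3.2`, where the model name is only an example), `generate("llama3.2", "Explain quantization in one sentence.")` would return the model's reply as a string.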
Everything You Need

Sovereign AI: An Architectural Investigation into Local-First Intelligence
by Daniel Kliewer · Paperback · 72 pages
An examination of the architecture of intelligence that you own — from first principles through production deployment.
Architectural Layers
Foundation Models
Understanding and running local LLMs
Retrieval Architectures
RAG pipelines with local embeddings
Structured Knowledge
Graph-based reasoning systems
Agent Systems
Autonomous, offline agent architectures
Tool Integration
Connecting AI via standardized protocols
Full-Stack AI
Complete application architectures
Persona Routing
Dynamic expert selection across models
Evaluation
Measuring and improving system behavior
Production Security
Privacy-preserving deployment architectures
Research Through Code
workflow
A structured methodology for integrating AI into software development. Explores how systems thinking can guide AI-assisted engineering.
autoblog01
Investigates RAG-driven content generation as an architectural pattern. Can local LLMs drive the full content pipeline end to end?
sovereignBank
Explores whether autonomous agents can maintain persistent, evolving memory without cloud infrastructure. A seven-layer cognitive architecture.
SynthInt
Examines mixture-of-experts routing through dynamic personas. Can synthetic intelligence emerge from locally-hosted specialized models?
chrome-ai-filename-generator
Studies the interface between local inference and everyday workflows. A concrete experiment in on-device AI utility.
ConCreat
Investigates local text-to-speech pipelines for multimedia content. What are the boundaries of fully offline content generation?
Architectural Investigations
Sovereign Memory Bank
An autonomous cognitive memory system that transforms documents into evolving knowledge graphs — no cloud required.
The Sovereignty Manifesto
Why computational sovereignty is a prerequisite for meaningful AI ownership and why local-first is an architectural necessity.
Your First Local AI
Running your own AI on your laptop with Ollama. A practical entry point into local-first intelligence.
The Architecture Is Yours
The book documents the architecture. The code implements it. What you build from them is your own.