Available Now on Amazon

Architectures for
Intelligence You Own

Investigating local-first AI systems, cognitive memory architectures,
graph-based reasoning, and computational sovereignty.

An ongoing investigation by Daniel Kliewer

72 Pages
11 Chapters
10+ Projects

The Architectural Constraint

Dependence Is a Design Decision

Most AI today is built on infrastructure someone else controls. These are the structural implications.

Infrastructure Dependence

Every inference depends on infrastructure you do not control. This is an architectural constraint, not a feature limitation.

Data Boundaries

Querying remote models means transmitting data outside your authority. This shapes what you can build and where.

Recurring Access

Per-token pricing makes long-running or autonomous systems economically fragile. Usage is measured and billed on someone else's terms.
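
To make that fragility concrete, the arithmetic can be sketched as follows. The token volumes, prices, and hardware cost used with it are hypothetical placeholders, not figures from the book or from any provider. The sketch only shows that metered spend grows linearly with usage while owned hardware is a one-time cost.

```python
def metered_cost(tokens_per_day: int, usd_per_million_tokens: float, days: int) -> float:
    """Total spend on a metered API; grows linearly with tokens processed."""
    return tokens_per_day / 1_000_000 * usd_per_million_tokens * days


def break_even_days(hardware_usd: float, tokens_per_day: int,
                    usd_per_million_tokens: float) -> float:
    """Days of metered usage after which spend passes a one-time hardware cost.

    Simplification (assumption): ignores electricity and maintenance.
    """
    daily = metered_cost(tokens_per_day, usd_per_million_tokens, days=1)
    return hardware_usd / daily
```

With these illustrative inputs, an agent processing 5,000,000 tokens a day at 2.00 USD per million tokens spends 300 USD over 30 days, and passes a 1,500 USD hardware purchase after 150 days.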

Protocol Coupling

Provider-specific APIs tie your system to a single access path. Changing infrastructure requires rewriting the interface layer.
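
One common way to limit this coupling is to keep application code behind a small interface. The sketch below is a minimal illustration, not the book's implementation: `TextGenerator`, the two stand-in backends, and `answer` are hypothetical names invented for this example.

```python
from typing import Protocol


class TextGenerator(Protocol):
    """Hypothetical interface: anything that turns a prompt into text."""

    def generate(self, prompt: str) -> str: ...


class UppercaseBackend:
    """Stand-in backend for illustration; a real one would call a model."""

    def generate(self, prompt: str) -> str:
        return prompt.upper()


class ReverseBackend:
    """A second stand-in, to show that backends are interchangeable."""

    def generate(self, prompt: str) -> str:
        return prompt[::-1]


def answer(backend: TextGenerator, question: str) -> str:
    """Application code depends only on the interface, not on a provider."""
    return backend.generate(question.strip())
```

Swapping `UppercaseBackend()` for `ReverseBackend()` changes the output but leaves `answer` untouched. In a real system the same applies when swapping a remote provider for a local runtime.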

A Different Foundation

Architectures for Sovereignty

Rebuilding the stack so every layer — from inference to memory — is owned and understood by its operator.

Local Inference

Models run on your hardware. Ollama and llama.cpp provide the runtime; quantized models keep memory requirements within reach of that hardware.
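
As a minimal sketch of what local inference looks like in practice, the code below talks to an Ollama server over its local HTTP API using only the Python standard library. It assumes Ollama is running on its default port (11434) and that a model has already been pulled. The model name `llama3` is an assumption for illustration, and `build_request` and `generate` are hypothetical helper names.

```python
import json
import urllib.request

# Ollama's default local endpoint; nothing leaves the machine.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(prompt: str, model: str = "llama3") -> urllib.request.Request:
    """Build a non-streaming generate request (model name is an assumption)."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt: str, model: str = "llama3", timeout: float = 120.0) -> str:
    """Send the request to the local server and return the generated text."""
    with urllib.request.urlopen(build_request(prompt, model), timeout=timeout) as resp:
        return json.loads(resp.read())["response"]
```

Calling `generate("Summarize this note in one sentence.")` returns the model's text if the server is running. If it is not, `urllib` raises a connection error instead.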

Composable Systems

RAG pipelines, knowledge graphs, and autonomous agents combine into architectures, not point solutions.

Data Authority

Processing stays within your network. This is a structural property of the architecture.

Unmetered Operation

No rate limits or usage caps — there is no external gate. The only constraint is your hardware.

The Book

Everything You Need

Sovereign AI: An Architectural Investigation into Local-First Intelligence

by Daniel Kliewer · Paperback · 72 pages

An examination of the architecture of intelligence that you own — from first principles through production deployment.

Scope of Investigation

Architectural Layers

Foundation Models

Understanding and running local LLMs

Retrieval Architectures

RAG pipelines with local embeddings

Structured Knowledge

Graph-based reasoning systems

Agent Systems

Autonomous, offline agent architectures

Tool Integration

Connecting AI via standardized protocols

Full-Stack AI

Complete application architectures

Persona Routing

Dynamic expert selection across models

Evaluation

Measuring and improving system behavior

Production Security

Privacy-preserving deployment architectures

Open Source

Research Through Code

workflow

45

A structured methodology for integrating AI into software development. Explores how systems thinking can guide AI-assisted engineering.

Markdown

autoblog01

22

Investigates RAG-driven content generation as an architectural pattern. Can local LLMs drive the full content pipeline end to end?

sovereignBank

Explores whether autonomous agents can maintain persistent, evolving memory without cloud infrastructure. A seven-layer cognitive architecture.

SynthInt

Examines mixture-of-experts routing through dynamic personas. Can synthetic intelligence emerge from locally hosted specialized models?

chrome-ai-filename-generator

6

Studies the interface between local inference and everyday workflows. A concrete experiment in on-device AI utility.

JavaScript

ConCreat

1

Investigates local text-to-speech pipelines for multimedia content. What are the boundaries of fully offline content generation?

TypeScript

The Work Is Open

The Architecture Is Yours

The book documents the architecture. The code implements it. What you build from them is your own.