Vault Glossary

Technical terms used across the hinata-v2 vault. Maintained by Hinata (the General).

Vault Architecture

  • hinata-v2: the vault — primary knowledge base at /Users/nnamdi/hinata-v2/. Git-committed, locally stored.
  • supreme-court/: strategy, laws, governance, preferences, and lifecycle rules that govern the vault.
  • the-government/: diataxis-classified outputs from all federation work (reference, how-to, tutorial, explanation).
  • federation/: flat directory containing all agent context (_context.md) and agent definitions (_agent.md). No subdirectories except demoted/.
  • inbox/: handover files, ingested content, pending classification. Entry point for all external input.
  • Sandpit: /Users/nnamdi/Sandpit/hinata/ — scripts, applications, logs, audio. Everything that cannot live in the vault.
  • diataxis: the 4-type documentation classification framework (reference, how-to, tutorial, explanation) governing all output to the-government/.
  • standing rule: a permanently active instruction in CLAUDE.md that fires every turn without explicit invocation.
  • incremental verbatim transcript: the standing rule that appends every exchange (user, agent dispatch, agent result, Hinata response) to the CLI's transcript file per-turn for crash resilience.
  • Z2: the Mac Mini M2 running Proxmox — the always-on data plane for pipelines, pollers, and infrastructure. Accessed via Tailscale.
  • Z2 container: a Proxmox LXC/VM on Z2 running a specific service (e.g. jimmy-neutron_brain-ops container).

Pillars and Command Structure

  • FORGE: the pillar for career, learning, coding, music, reading, productivity, writing, and mastery. Colonel: Levi Ackerman.
  • FOUNDATION: the pillar for work, health, fitness, finances, home, sleep, legal, security, and philosophy. Colonel: Saitama.
  • FLOW: the pillar for transport, networking, side-hustle, creative strategy, sacrifice/opportunity-cost, end-state, culture, and vows. Colonel: Makima.
  • FIRE: the pillar for friendship, adventure, entertainment, nutrition, fashion, dreams, legacy, and partnership. Colonel: Goku.
  • SYNTHESIS: the pillar for art, NLP, marketplaces, audio, research, investment, and surveillance. Colonel: Sung Jinwoo.
  • Colonel: pillar-level synthesis owner who coordinates all commanders within a pillar. The "Spirit Bomb" — deploys accumulated knowledge of all its commanders.
  • Captain: cross-cutting operational rank above Commanders. Captains own intake, routing, infrastructure, and evolution — not a single life domain.
  • Commander: domain specialist — the primary execution unit. ~40 across all pillars.
  • Lieutenant: subordinate operational agent under a Captain. Gates and enriches Commander intake.
  • signal routing: the process of directing detected signals or extracted output to the appropriate commander or system.
  • stage-gate: a decision point between pipeline stages where output is validated before moving forward.

Commanders and Agent Roles

  • Sung Jinwoo: SYNTHESIS colonel — owns intelligence pipelines, voice memo synthesis, and cross-commander signal merging.
  • Levi Ackerman: FORGE colonel — owns career, learning, coding, music, and productivity domains.
  • Saitama: FOUNDATION colonel — owns work, health, finances, home, and security domains.
  • Makima: FLOW colonel — owns networking, transport, side-hustle, creative strategy, and vows.
  • Goku: FIRE colonel — owns friendship, adventure, entertainment, and legacy domains.
  • Nujabes: SYNTHESIS commander for audio sourcing, curation, and content quality.
  • Heimerdinger: SYNTHESIS commander for NLP, ASR transcription, and extraction pipeline design.
  • Zepile: SYNTHESIS commander for marketplace discovery, ecosystem mapping, and arbitrage signals.
  • Madara: SYNTHESIS commander for market surveillance and macro signal detection.
  • Hashirama: SYNTHESIS commander for capital allocation, portfolio growth, and investment theses.
  • L (Lawliet): SYNTHESIS commander for research methods, novel approaches, and scientific rigour.
  • Deidara: SYNTHESIS commander for visual art, creative coding, and code-art.
  • Zuko: FORGE commander for career intelligence, job applications, and professional positioning.
  • Shikamaru: FORGE commander for learning, quiz-based study, SRS, and skill acquisition.
  • Trunks: FORGE commander for coding craft, software engineering, and technical builds.
  • Squidward: FORGE commander for music production, virtuosity, and audio creative work.
  • Ging: FORGE commander for reading, book synthesis, and knowledge extraction.
  • Sasuke: FORGE commander for productivity, focus, and deep work.
  • Jiraiya: FORGE commander for journaling, creative writing, and lyrics.
  • Hisoka: FORGE commander for mastery tracking and level-up progression.
  • Kakashi: FOUNDATION commander for work performance, VMO2, and professional engineering.
  • Allmight: FOUNDATION commander for mental health, wellbeing, and burnout recovery.
  • Zoro: FOUNDATION commander for fitness, training, and physical performance.
  • Bulma: FOUNDATION commander for finances, budgeting, banking, and ISA management.
  • Simba: FOUNDATION commander for home, property, and life infrastructure.
  • Snorlax: FOUNDATION commander for sleep, rest, and recovery routines.
  • Light Yagami: FOUNDATION commander for legal literacy, contracts, and rights.
  • Itachi: FOUNDATION commander for digital security, credentials, and operational security.
  • Thorfinn: FOUNDATION commander for philosophy, ethics, and values.
  • Melfi: FOUNDATION commander for clinical-frame psychology, sovereignty audits, and anthropology.
  • Lelouch: FLOW commander for networking, relationship management, and social signals.
  • Dragonite: FLOW commander for transport, commuting, and route optimisation.
  • Rock Lee: FLOW commander for side-hustle and income-opportunity signals.
  • Uncle Iroh: FLOW commander for creative strategy, long-term narrative, and contingency planning.
  • Erwin Smith: FLOW commander for opportunity-cost audits, project kill decisions, and sacrifice.
  • Cell: FLOW commander for end-state planning, destination coordination, and convergence.
  • Hange: FLOW commander for culture, intellectual curiosity, and exploration.
  • Kurapika: FLOW commander for personal vows, commitments, and accountability.
  • Killua: FIRE commander for friendship, social skills, and interpersonal dynamics.
  • Luffy: FIRE commander for adventure, spontaneity, and experience-seeking.
  • Brook: FIRE commander for entertainment, media consumption, and cultural intake.
  • Sanji: FIRE commander for nutrition, cooking, and food strategy.
  • Black Panther: FIRE commander for fashion, personal style, and visual identity.
  • Naruto: FIRE commander for long-term dreams, vision, and aspiration.
  • Mufasa: FIRE commander for legacy, estate planning, and 20-year wealth trajectory.
  • Roxanne: FIRE commander for romantic partnership and relationship growth.
  • Jimmy Neutron: Captain — system infrastructure, vault health, scripts, plugins, agent schemas. Owns all Bash execution.
  • Canary Organiser: Captain — inbox pipeline owner. Routes incoming items: BACKLOG / ASSIMILATE / ROUTE / HOLD / DISCARD.
  • Orochimaru Scout: Captain — pipeline health, daily/weekly/monthly audits, improvement loops.
  • Meruem Evolution: Captain — federation evolution tracking, quality arbiter, system maturity progression.
  • Minato: Canary lieutenant — morning briefing, proactive surface, scheduling.

Workflow and Pipeline Terms

  • voice memo pipeline: the staged workflow that processes voice memo transcripts from ingestion to synthesis and routing.
  • Audio-Career Workflow: the career-focused extension of the voice memo pipeline that produces career intelligence and networking content.
  • Ingestion: the stage that reads transcript files and applies ASR post-correction.
  • Routing: the stage that classifies a memo by domain and assigns primary and secondary commanders.
  • YAML extraction: the stage that extracts structured entities from transcript text into YAML.
  • Synthesis pass: cross-memo consolidation that produces combined career and networking intelligence.
  • career intelligence routing: the process that forwards career-focused output to the Zuko workflow.
  • networking signal routing: the process that forwards relationship and named-person signals to the Lelouch workflow.
  • Iroh strategy pass: the final synthesis stage that produces a long-term career narrative and contingency plan.
  • Auto-Extraction: deterministic extraction logic that uses regex and vocabulary lists rather than AI inference.
  • Domain Vocabulary: a curated list of technical vocabulary terms used to detect domain-specific language in transcripts.
  • content keyword scan: a fallback routing method based on keyword hits when path-based classification is unavailable.
  • path-based category detection: routing based on known directory or filename path segments.
  • tie-break: a deterministic rule used when multiple routing categories score equally.
  • raw_* fields: fields like raw_person_candidates, raw_technical_terms, and raw_action_candidates that store verbatim auto-extracted candidates.
  • scaffolded fields: placeholder YAML fields with PENDING or empty lists reserved for downstream commander enrichment.
  • priority memo set: the initial batch of high-value transcripts selected for early pipeline processing.
  • speaker sentiment arc: aggregated sentiment tracking across an audio memo or batch of memos.
  • named entities: extracted people, organizations, and other proper nouns from transcripts.
  • decision points: transcript moments where choices or commitments are explicitly made.
  • action items: tasks or follow-up work explicitly identified in transcripts.
  • language_detected: the detected language label applied during transcript processing.
  • transcript_source: the origin of a transcript, e.g. apple-openai or whisperx.
  • synthesis_output: the output vault path for consolidated synthesis documents.
  • whisperX detect_language=True: an ASR configuration option that enables automatic language detection.
  • wav2vec2 forced alignment: a technique that aligns a known transcript to the audio with a wav2vec2 model, producing word-level and phoneme-level timestamps.
  • SKIP_EMPTY log: a pre-flight check that skips transcripts smaller than a threshold.
  • ASR post-correction: rule-based correction of speech recognition errors after transcription.
  • REVIEW_REQUIRED: a flag used when a correction is ambiguous and should be verified by a human.
  • PENDING: a placeholder value inserted for downstream fills in YAML scaffolding.
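
The routing terms above (path-based category detection, content keyword scan, tie-break) can be read as a small fallback chain. The sketch below is illustrative only: the path segments, keyword lists, category names, and the alphabetical tie-break are all assumptions, not the pipeline's actual rules.

```python
from typing import Optional

# Hypothetical mappings; the real segments and vocabularies are not listed in this glossary.
PATH_SEGMENTS = {"career": "career", "networking": "networking"}
KEYWORDS = {
    "career": {"interview", "promotion", "salary"},
    "networking": {"introduced", "coffee", "mentor"},
}


def detect_by_path(path: str) -> Optional[str]:
    """Path-based category detection: match known directory or filename segments."""
    for segment in path.lower().split("/"):
        for known, category in PATH_SEGMENTS.items():
            if known in segment:
                return category
    return None


def detect_by_keywords(text: str) -> Optional[str]:
    """Content keyword scan: count keyword hits per category."""
    words = [w.strip(".,!?") for w in text.lower().split()]
    hits = {cat: sum(1 for w in words if w in vocab) for cat, vocab in KEYWORDS.items()}
    best = max(hits.values(), default=0)
    if best == 0:
        return None
    # Tie-break (assumed rule): pick the alphabetically first category so the result is deterministic.
    return min(cat for cat, count in hits.items() if count == best)


def route(path: str, text: str) -> Optional[str]:
    """Use the path when it is informative, otherwise fall back to the keyword scan."""
    return detect_by_path(path) or detect_by_keywords(text)
```

Because every step is a lookup or a count, the same transcript always routes the same way, which is the property the Auto-Extraction entry describes.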

AI, NLP, and Machine Learning Terms

  • AI: artificial intelligence, especially the generative and inference tooling referenced in project docs.
  • LLM: large language model, used for text generation and contextual inference.
  • ASR: automatic speech recognition, the technology used to transcribe voice memos.
  • NLP: natural language processing, the field of extracting meaning from text.
  • NER: named entity recognition, the extraction of entities such as names, dates, and organizations.
  • BGE-small: a BAAI sentence embedding model used for local semantic similarity in email intelligence.
  • BERTopic: a topic modeling algorithm used to discover clusters in email corpora.
  • LogReg: shorthand for the logistic regression classifier used in email recommendation and classification.
  • LogisticRegression: the scikit-learn (sklearn) classifier class used to predict email categories.
  • FAISS: a vector similarity search library used as a local vector store.
  • spaCy: an NLP library used for NER and text processing.
  • sentence embeddings: vector representations of text used for similarity and classification.
  • topic ID: a discrete cluster label produced by BERTopic.
  • calendar_signal: a boolean flag set when an email contains event or date/venue signals.
  • delete confidence score: a numeric estimate of how safe it is to delete an email.
  • priority scorer: a ranking mechanism that assigns email priority from 1 to 5.
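  • confidence calibration: adjusting a model's predicted probabilities so they match observed outcome frequencies, for example with Platt scaling.
  • uncertainty ensemble: a score that combines disagreement across several models to flag predictions the system is unsure about.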
  • active learning query ordering: the process of selecting high-uncertainty examples for review.
  • centroid: the geometric center of a cluster, used in deleted-email profile analysis.
  • reply-chain lookup: thread graph traversal used to evaluate email reply context.
  • multi-label classifier: a model that can assign multiple labels to a single example.
  • local ML: machine learning that runs locally without external API token usage.
  • zero LLM tokens per email: a design constraint for the email intelligence pipeline.
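
As a rough illustration of how an uncertainty ensemble could feed active learning query ordering, the sketch below scores each email by how much several models disagree and queues the most contested ones first. Using the population standard deviation as the disagreement score is an assumption, as are the function names, message IDs, and probabilities.

```python
from statistics import pstdev


def disagreement(probabilities: list[float]) -> float:
    """Uncertainty score: spread of the models' predicted probabilities for one email."""
    return pstdev(probabilities)


def query_order(predictions: dict[str, list[float]]) -> list[str]:
    """Active learning ordering: most-disputed emails first, ties broken by ID."""
    return sorted(
        predictions,
        key=lambda email_id: (-disagreement(predictions[email_id]), email_id),
    )


# Hypothetical "delete" probabilities from three local models.
predictions = {
    "msg-a": [0.90, 0.92, 0.91],  # models agree
    "msg-b": [0.10, 0.80, 0.50],  # models disagree strongly
    "msg-c": [0.40, 0.60, 0.50],  # mild disagreement
}
print(query_order(predictions))
```

Everything here runs on the standard library, in keeping with the local ML and zero-LLM-token constraints listed above.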

Email Intelligence Terms

  • Hinata Email Intelligence System: the multi-layered architecture for email classification, recommendation, and sweep.
  • Layer 1: structured classification using deterministic sender and header rules.
  • Layer 2: unstructured classification using embeddings, topic modeling, and local ML.
  • Layer 3: recommendation engine that weights signals and actions over time.
  • Layer 4: API endpoints that expose feed, digest, events, sweep, and alerts.
  • sender rules: deterministic patterns used for immediate email categorization.
  • domain allowlist: trusted domains exempt from deletion and special handling.
  • structured categories: deterministic categories such as career, finance, housing, and events.
  • marketing signals: textual patterns that indicate promotional or newsletter content.
  • feedback loop: the process of using user actions to improve recommendations.
  • FastAPI: the Python web framework used for email service endpoints.
  • Telegram bot: the messaging integration used to deliver alerts and summaries.
  • Studio: the workspace interface that consumes email intelligence outputs.
  • tasks.json: the task registry used for routing signals and capturing work assignments.
  • michael_id: the unified identity field used for cross-account email deduplication.
  • Message-ID fingerprint: a deduplication key derived from message metadata and subject.
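
Two of the entries above are deterministic enough to sketch: Layer 1 sender rules and the Message-ID fingerprint. The rule table, the `.example` domains, the normalisation steps, and the choice of SHA-256 are assumptions for illustration; the glossary only says the fingerprint is derived from message metadata and subject.

```python
import hashlib
from typing import Optional

# Hypothetical Layer 1 sender rules: sender-domain suffix -> structured category.
SENDER_RULES = {
    "jobs.example": "career",
    "lettings.example": "housing",
}


def classify_by_sender(sender: str) -> Optional[str]:
    """Layer 1: deterministic categorisation from the sender's domain."""
    domain = sender.rsplit("@", 1)[-1].lower()
    for suffix, category in SENDER_RULES.items():
        if domain == suffix or domain.endswith("." + suffix):
            return category
    return None  # no rule matched; Layer 2 would take over


def fingerprint(message_id: str, subject: str) -> str:
    """Deduplication key from the Message-ID and a normalised subject (assumed inputs)."""
    subject_key = " ".join(subject.lower().split())  # collapse case and whitespace
    raw = f"{message_id.strip()}|{subject_key}"
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()
```

Two copies of the same message delivered to different accounts would produce the same key, which is what cross-account deduplication needs.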

Execution Modes

  • runtime: execution during an active Claude Code session. Mac is on, harness is connected. Runtime services can use Mac LaunchAgents or in-session tools.
  • BAU (business as usual): execution when Mac is offline (closed or sleeping). Only Z2 systemd services run. All always-on services must be BAU-classified and live on Z2.
  • always-on service: a BAU service that must survive Mac sleep/shutdown. Examples: Telegram bot, chat-audit timer, bw session renew. Must be Z2 systemd, never Mac LaunchAgent.
  • runtime service: a service only needed during active sessions. Examples: wrangler dev server, Studio frontend. Can be Mac LaunchAgent.
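
The placement rule above (BAU services on Z2 systemd only, runtime services allowed on a Mac LaunchAgent) can be expressed as a simple check. This is a sketch: the host labels, the service records, and the `placement_errors` helper are hypothetical, not an existing vault script.

```python
# Allowed hosts per execution mode, following the glossary entries above.
# The label strings themselves are assumptions.
ALLOWED_HOSTS = {
    "bau": {"z2-systemd"},
    "runtime": {"mac-launchagent", "in-session"},
}


def placement_errors(services: list[dict]) -> list[str]:
    """Return one message per service whose host is not allowed for its mode."""
    errors = []
    for service in services:
        if service["host"] not in ALLOWED_HOSTS[service["mode"]]:
            errors.append(
                f"{service['name']}: {service['mode']} service cannot run on {service['host']}"
            )
    return errors


# Hypothetical inventory: the Telegram bot is always-on, so a LaunchAgent is flagged.
services = [
    {"name": "telegram-bot", "mode": "bau", "host": "mac-launchagent"},
    {"name": "studio-frontend", "mode": "runtime", "host": "mac-launchagent"},
]
print(placement_errors(services))
```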

Session Lifecycle and Harnesses

  • harness: a CLI tool that connects to the vault. Five harnesses exist: Claude Code, Cloud, Antigravity, Codex, Gemini.
  • Claude Code: sole writing authority for the vault. Structural, strategic, high-stakes.
  • Cloud: long-running/unattended ephemeral clone tasks. Never pushes to main.
  • Antigravity: vision and agentic build — active building, code modification, ingestion.
  • Codex: execution and mastery — high-speed scripting, DevOps. Does not write to vault.
  • Gemini: diagnostics and research support harness (deprecating).
  • /handover: unified session close — verifies transcript, reconciles state, writes re-attach point, git push.
  • token economy (3-band law): Commander work 40-70%, state management 20-40%, Hinata main thread 10-20%.
  • dont-halt: DEPRECATED (retired 2026-06-08). Was an autonomous execution mode. Now replaced by the permanent acceptEdits posture — halt only on canonical bottlenecks.
  • PostCompact: context compression event that fires when conversation approaches token limits.
  • crash-resilient capture: writing data to disk before operations complete, so crashes don't lose prior work.
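
Crash-resilient capture comes down to appending each exchange and forcing it to disk before the turn continues. A minimal sketch, assuming a Markdown transcript and a hypothetical `append_turn` helper (the actual file layout is not specified here):

```python
import os
from pathlib import Path


def append_turn(transcript: Path, speaker: str, text: str) -> None:
    """Append one exchange and force it to disk before returning."""
    transcript.parent.mkdir(parents=True, exist_ok=True)
    with transcript.open("a", encoding="utf-8") as handle:
        handle.write(f"**{speaker}:** {text}\n\n")
        handle.flush()             # push Python's buffer to the OS
        os.fsync(handle.fileno())  # ask the OS to commit the write to storage
```

Opening in append mode means a crash mid-session can lose at most the exchange being written, never the earlier ones.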

Tools, Formats, and Infrastructure

  • API: application programming interface.
  • CLI: command-line interface.
  • JSON: JavaScript Object Notation, a data interchange format.
  • JSONL: newline-delimited JSON format.
  • YAML: human-readable data serialization format used for extraction artifacts.
  • CSV: comma-separated values.
  • FastAPI: Python framework used to expose HTTP endpoints.
  • BigQuery: Google Cloud data warehouse used in project examples.
  • PostgreSQL: open-source relational database.
  • Cloudflare: cloud provider — tunnels, DNS, edge hosting for michael-engineer.dev.
  • Tailscale: zero-trust mesh VPN connecting Mac, Z2, and all infrastructure nodes.
  • Proxmox: virtualization platform running on Z2. Hosts LXC containers and VMs.
  • LXC: Linux Containers, the OS-level container technology Proxmox uses for lightweight guests.
  • KVM: Kernel-based Virtual Machine, the Linux hypervisor Proxmox uses to run full VMs.
  • NAS: network-attached storage.
  • NFS: network file system protocol.
  • SSH: secure shell protocol.
  • VM: virtual machine.
  • SSD: solid-state drive.
  • RAM: random-access memory.
  • GPU: graphics processing unit.
  • Mermaid: diagramming language rendered in markdown. Encouraged for visual output in the vault.

Language, Learning, and Content Terms

  • IgboMastery: the compound sprint project focused on Igbo language learning infrastructure.
  • Lingopie: the external language learning stack used as a design inspiration.
  • SRS: spaced repetition system used for quiz-based learning.
  • tokenisation: the process of splitting text into tokens or words.
  • WER: word error rate, a speech recognition accuracy metric.
  • Common Voice: Mozilla speech dataset used for speech recognition benchmarking.
  • word frequency corpus: a corpus used to rank tokens by frequency.
  • frequency floor: a threshold below which rare tokens are skipped.
  • Creative Commons: a licensing category for reusable content.
  • fair use: a legal doctrine for limited reuse of copyrighted material.
  • scraping ToS: terms of service constraints that govern automated content discovery.
  • source discovery map: a catalog of discovered content sources and channels.
  • licensing risk map: a risk assessment of sources by licensing posture.
  • curated content map: a selection of audio content chosen for pedagogical use.
  • listening notes: qualitative notes on audio quality, dialect, and ASR suitability.
  • accent register: dialect classification and pronunciation quality notes.
  • PostgreSQL schema: the database schema structure for storage and application data.
  • React player: audio playback UI component in the learning stack.
  • Z2 NLP container: the NLP container on Proxmox for ASR and language processing.
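
WER is conventionally defined as (substitutions + deletions + insertions) divided by the number of reference words, computed with a word-level edit distance. The sketch below follows that standard definition; splitting on whitespace is a simplifying assumption, and the function name is illustrative.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # previous[j] holds the distance between the reference words seen so far
    # and the first j hypothesis words.
    previous = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = previous[j - 1] + (ref_word != hyp_word)
            deletion = previous[j] + 1
            insertion = current[j - 1] + 1
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1] / len(ref)
```

A hypothesis that swaps one word and drops another against a five-word reference scores 2 / 5 = 0.4.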

Sourcing, Arbitrage, and Capital Terms

  • income-arbitrage: the practice of finding income opportunities across marketplaces and labour platforms.
  • resale arbitrage: identifying underpriced listings for resale profit.
  • marketplace discovery: the process of discovering new earning surfaces on platforms.
  • capital allocation: deciding how much money to deploy against an opportunity.
  • arbitrage window: a time-bound mispricing signal that can be acted on quickly.
  • capital signal: a macro market signal used to update allocation decisions.
  • dual signal: a signal that is both actionable sourcing and capital relevant.
  • verify: a research validation flag requiring sign-off from L (Lawliet).
  • sourcing brief: a structured recommendation for a specific marketplace move.
  • value thesis: the rationale behind an investment or sourcing opportunity.
  • time window: the urgency or expiry horizon for a flagged opportunity.
  • micro path: Zepile’s path from sourcing brief to Hashirama deployment.
  • macro path: Madara’s broader market signal route to Hashirama.
  • position: the amount and target structure of a deployed capital allocation.
  • target return: the expected return from a deployed position.
  • exit trigger: the condition that prompts a position exit.
  • stop condition: the condition that limits downside or pauses a decision.
  • marketplace surface: the channel or platform where a signal is observed.
  • Depop, Vinted, eBay, Discogs, Upwork, Contra, Toptal, Fiverr, StockX, GOAT, WatchCharts, Chrono24, GitHub Marketplace, RapidAPI, Musicbed, Artlist, Pond5: marketplace surfaces referenced in the pipeline.
  • Rightmove, Zoopla, SpareRoom, OpenRent: property-search surfaces owned by Simba.

Project-Specific Systems and Names

  • Audio-Career Workflow: the career intelligence extension of voice memo processing.
  • IgboMastery: the Igbo language learning infrastructure project.
  • Madara-Zepile-Hashirama Pipeline: the signal flow linking surveillance, marketplace discovery, and capital allocation.
  • Deterministic Local Excel Reconciliation: a protocol for local, deterministic spreadsheet reconciliation.
  • CLAUDE.md: root entry point for the vault. Contains profile, routing tables, standing rules, and governance.
  • michael-engineer.dev: the public portfolio site. Must always return HTTP 200 (Severity 1 if not).
  • studio.michael-engineer.dev: the authenticated workspace interface. Must be behind auth middleware.
  • hinata-sandpit: the Sandpit git repo at /Users/nnamdi/Sandpit/hinata-sandpit/.
  • hinata-brain: legacy name for the vault (now hinata-v2). The old iCloud path is blacklisted.

Communication and Metadata Terms

  • task #: a unique task identifier, e.g. z0b488, c0y917, 47w70z.
  • vault: the shared workspace where generated artifacts are stored (/Users/nnamdi/hinata-v2/).
  • cloud-writable: a path or artifact that can be written from Cloud sessions (branch-based, never main).
  • source tier: priority level of a transcript or dataset.
  • evidence chain: the research trail and validation record used for sign-off.
  • confidence band: a range used to express model or ASR confidence.
  • transcript: per-CLI .md file in the-government/information_reference/reference_transcripts/, verbatim, 7-day rolling.
  • handover file: inbox/claude-handover-[dd-mm-yy].md — cross-session state transfer, auto-processed by Canary.
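
The handover filename pattern and the transcript's 7-day rolling window are both date arithmetic. A small sketch, assuming `[dd-mm-yy]` means zero-padded day, month, and two-digit year, and that "7-day rolling" keeps entries less than seven days old; both helper names are hypothetical.

```python
from datetime import date, timedelta


def handover_path(day: date) -> str:
    """Build the handover filename using the dd-mm-yy pattern."""
    return f"inbox/claude-handover-{day.strftime('%d-%m-%y')}.md"


def within_rolling_window(entry_day: date, today: date, days: int = 7) -> bool:
    """True if an entry is still inside the rolling window (assumed: strictly under `days` old)."""
    return timedelta(0) <= today - entry_day < timedelta(days=days)
```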

Inbox Pipeline

  • Canary Inbox Pipeline: the raw capture-to-routing flow for inbox/ items. First stage of the brain's content processing.
  • Canary filter question: the core decision rule: does this item bring Michael an opportunity to grow?
  • BACKLOG: inbox outcome for actionable growth opportunities → task stubs.
  • ASSIMILATE: inbox outcome for knowledge that improves a Commander's context → appended to context or knowledge-base.
  • ROUTE: inbox outcome for reference documents with ongoing value → the-government/.
  • HOLD: inbox outcome for structural blockers requiring human input → inbox/pending-review/.
  • DISCARD: inbox outcome for items failing the enrichment-check → inbox/done/ with justification.
  • Enrichment-check: mandatory rubric before HOLD or DISCARD — asks whether the item exposes a gap, intent, or scoping question.
  • Plan-First Parallel-Execute Protocol: default Canary workflow — survey in parallel, present as decision table, execute concurrently.
  • Assimilate-not-route rule: prefer merging signal into canonical Commander files rather than creating duplicates.
  • Raw Telegram message: never stored directly in federation; must be assimilated or discarded after extraction.
  • Nami: routing lieutenant — scores items against Commander contexts.
  • Robin: intent extraction lieutenant — adds structured metadata before routing.
  • Vegapunk: processing lieutenant — maintains strategies for complex content types.
  • Jinbe: hold audit lieutenant — logs HELD/FLAGGED/DISCARDED items.
  • Rick Sanchez: pattern memory lieutenant — logs routing decisions, surfaces automation opportunities.
  • Toph: Meruem's lieutenant — grounding, quality arbiter.
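
The five inbox outcomes and their guard rails can be modelled as an enum plus one check. This is a sketch of the rules as the glossary states them (Enrichment-check before HOLD or DISCARD, justification on DISCARD); the `Outcome` class and `decide` function are illustrative names, not Canary's actual implementation.

```python
from enum import Enum


class Outcome(Enum):
    """Inbox outcomes mapped to the destinations named in the glossary."""
    BACKLOG = "task stub"
    ASSIMILATE = "Commander context or knowledge-base"
    ROUTE = "the-government/"
    HOLD = "inbox/pending-review/"
    DISCARD = "inbox/done/"


def decide(outcome: Outcome, enrichment_checked: bool = False, justification: str = "") -> str:
    """Return the destination for an inbox item, enforcing the guard rails."""
    if outcome in (Outcome.HOLD, Outcome.DISCARD) and not enrichment_checked:
        raise ValueError("Enrichment-check is mandatory before HOLD or DISCARD")
    if outcome is Outcome.DISCARD and not justification:
        raise ValueError("DISCARD requires a justification")
    return outcome.value
```

For example, `decide(Outcome.HOLD, enrichment_checked=True)` returns `inbox/pending-review/`, while calling `decide(Outcome.DISCARD)` without the check raises an error.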