Active Research · NSDM-Bench-0 · 60 Validated Examples

AI should not only answer.
It should know whether
it is justified.

A rigorous research programme developing the formal science of justified AI decision-making — formalising the boundary between capability and accountable action.

6 Decision Labels
60 Validated Examples
3 Research Papers
40-week Research Roadmap
● Decision Label Taxonomy

Six epistemic states.
No collapsing allowed.

NSDM formalises the distinctions that AI systems routinely collapse into a single output. Each label names a fundamentally different epistemic condition.

supported

A claim, answer, or action is justified when available facts, rules, and context fully support it. The evidence is sufficient and the conclusion follows.

Evidence → Claim. Support relation holds.
unsupported

No sufficient evidence, rule, or justification is available. This does not mean the claim is disproven; it means the available evidence does not support it.

Absence of support. Not negation.
contradicted

Available evidence, rules, or constraints directly conflict with the claim or action. A contradiction is an active conflict, not merely a gap in evidence.

Evidence → ¬Claim. Contradiction holds.
⚠ Critical distinction
under_specified

The claim might be justified, but at least one required premise, condition, or fact is missing. The structure of support may exist, but a necessary premise is absent.

Support structure exists. Premise missing.
ambiguous_governance

The relevant rule, authority, policy, approval path, or institutional constraint is unclear, discretionary, conflicting, or hidden. Critical for legal and compliance systems.

Rule r is undefined, contested, or inaccessible.
reward_aligned_unjustified

The claim or action achieves a metric, target, or reward but lacks adequate justification or violates a higher constraint. Many AI failures occur when a system successfully optimises the wrong thing.

Reward(a) is high. Justify(a) fails.
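The six labels above can be pinned down as a closed set so that no other output is possible. This is a minimal sketch, not the programme's own code: the names Label and parse_label are assumptions introduced for illustration, and only the six label strings come from the taxonomy.

```python
from enum import Enum


class Label(str, Enum):
    """The six evidence-state labels named in the taxonomy (class name is hypothetical)."""

    SUPPORTED = "supported"
    UNSUPPORTED = "unsupported"
    CONTRADICTED = "contradicted"
    UNDER_SPECIFIED = "under_specified"
    AMBIGUOUS_GOVERNANCE = "ambiguous_governance"
    REWARD_ALIGNED_UNJUSTIFIED = "reward_aligned_unjustified"


def parse_label(raw: str) -> Label:
    """Map a raw string to a Label, rejecting anything outside the six classes."""
    try:
        return Label(raw.strip().lower())
    except ValueError:
        # Refuse to collapse an unknown string into the nearest label.
        raise ValueError(f"unknown evidence-state label: {raw!r}") from None
```

Rejecting unknown strings, rather than mapping them to a default, keeps annotation files and model outputs from silently collapsing two states into one.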
unsupported

No adequate support — no rule, evidence, or justification applies. The landscape of evidence is empty with respect to the claim. You cannot construct a justification from what is available.

under_specified

The structure of support may exist. A justification could be constructed — but a necessary premise is absent. Provide the missing premise and the label may change to supported.
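The distinction between unsupported and under_specified can be illustrated with a toy model. It assumes, purely for illustration, that a candidate justification is a set of required premises, and that None encodes the case where no justification can be constructed at all. The function name support_status and this encoding are hypothetical, not part of NSDM.

```python
from typing import Optional, Set


def support_status(required: Optional[Set[str]], available: Set[str]) -> str:
    """Toy distinction between 'unsupported' and 'under_specified'.

    required:  premises of a candidate justification, or None when no
               justification can be constructed (hypothetical encoding).
    available: premises that are currently established.
    """
    if required is None:
        # Nothing to build on: the evidence landscape is empty for this claim.
        return "unsupported"
    missing = required - available
    if missing:
        # A support structure exists, but a necessary premise is absent.
        return "under_specified"
    return "supported"


# A refund rule needs two premises; only one is established.
rule = {"order_delivered", "within_30_days"}
status_before = support_status(rule, {"order_delivered"})
# Supplying the missing premise changes the label.
status_after = support_status(rule, {"order_delivered", "within_30_days"})
```

In this sketch, supplying the missing premise moves the label from under_specified to supported, while no amount of extra facts changes an unsupported case.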

⬡ Evidence-State Formalism

The decision object.

Every classification is grounded in a formal decision object D, which captures all epistemic components required to determine the evidence state.

decision_object.py
from dataclasses import dataclass
from typing import Optional, List

@dataclass
class DecisionObject:
  # x: input, case, document, or situation
  x: str
  # r: rules, policies, constraints, laws
  r: List[str]
  # h: history or context
  h: Optional[str] = None
  # y: proposed answer or claim
  y: Optional[str] = None
  # a: proposed action
  a: Optional[str] = None
  # e: evidence state
  e: Optional[dict] = None
  # c: confidence / calibration
  c: float = 0.0

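# Classifier: f(D) → label (one of the six evidence-state labels)
def classify(d: DecisionObject) -> str:
  # Placeholder for the evidence-state reasoner; fails loudly instead of returning None
  raise NotImplementedError("evidence-state reasoner")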
x · Input: the case, document, prompt, claim, or situation being evaluated
r · Rules: policies, constraints, laws, standards, or institutional rules that apply
h · History: prior context, conversation history, or situational background
y · Claim: the proposed answer, assertion, or judgement to be evaluated
a · Action: the proposed action or decision to be executed
e · Evidence state: epistemic status of available evidence relative to y and a
c · Calibration: confidence state, uncertainty measure, abstention conditions
f · f(D) → label: the classifier maps the full decision object to one of six evidence-state labels
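To make the components concrete, here is a sketch that fills in one decision object and serialises it as a benchmark-style record. The dataclass mirrors the field list above. The example values, the to_json helper, the key inside e, and the assumption that c lies in [0, 1] are all hypothetical.

```python
import json
from dataclasses import asdict, dataclass
from typing import List, Optional


@dataclass
class DecisionObject:
    x: str                    # input
    r: List[str]              # rules
    h: Optional[str] = None   # history
    y: Optional[str] = None   # claim
    a: Optional[str] = None   # action
    e: Optional[dict] = None  # evidence state
    c: float = 0.0            # confidence / calibration


def to_json(d: DecisionObject) -> str:
    """Serialise a decision object; assumes c is a confidence in [0, 1]."""
    if not 0.0 <= d.c <= 1.0:
        raise ValueError(f"confidence out of range: {d.c}")
    return json.dumps(asdict(d), sort_keys=True)


# Hypothetical case: the rule applies, but a required fact is not yet known.
example = DecisionObject(
    x="Customer requests a refund for order 1001.",
    r=["Refunds are allowed within 30 days of delivery."],
    y="The refund should be approved.",
    a="issue_refund",
    e={"delivery_date_known": False},
    c=0.4,
)
record = json.loads(to_json(example))
```

Under the taxonomy, a case like this one would plausibly be labelled under_specified, since the rule exists but the delivery date is missing.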
◈ Research Papers

Four papers. One programme.

A staged publication strategy anchored by the benchmark and extending into formal architecture, mathematical theory, and benevolent AI.

Paper 2 — Architecture

The Free Energy of Choice: Toward an Evidence-State Architecture for Decision Intelligence

Decision intelligence architecture. Evidence-state construction, neuro-symbolic reasoning, causal reasoning, active inference as a lens, drift-diffusion, small/domain-specialised models for bounded decision tasks.

Drafting · Needs empirical anchor from Paper 1
Paper 3 — Mathematics

The Entropy of Choice: Boundary Geometry and Model Capacity in Decision Intelligence

Speculative mathematical programme. Boundary geometry, model capacity, rate-distortion reasoning, scaling-law knee, model-selection conjectures. Mathematical analogies clearly marked as conjectural.

Speculative · After Gate 1 (knee test)
Paper 4 — NSDM-B

Benevolent Decision Boundaries: Regret, Intent, and Protective Constraint in AI Decision Systems

Future branch. A safe superior intelligence cannot merely be obedient. It must be benevolent under uncertainty, regret-sensitive under action, and autonomy-preserving under asymmetric power.

Planned · After Bench-0 validated
"The long-term thesis: AI should not only answer.
It should know whether it is justified."
— NSDM Research Programme, Rajesh Singh, 2026
⊞ NSDM-Bench-0

The benchmark is the science.

Rigorous leakage control, blind splits, external datasets, and reproducible baselines. High accuracy without a leakage audit is not a result.

60 Validated examples
6 Label classes
0.65 Target IAA κ
10K Final target
Benchmark Growth: 60 / 10,000
→ 100 examples: IAA testing begins
→ 300 examples: Paper 1 ready
→ 1,000 examples: Open dataset release
→ 10,000 examples: NSDM-B branch
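The page does not say which agreement statistic the 0.65 IAA target refers to. As a sketch, assuming Cohen's kappa over two annotators, the check could look like the following. The function name and the sample annotations are hypothetical.

```python
from collections import Counter
from typing import Sequence


def cohens_kappa(a: Sequence[str], b: Sequence[str]) -> float:
    """Cohen's kappa for two annotators labelling the same items."""
    if len(a) != len(b) or not a:
        raise ValueError("annotator label lists must be non-empty and equal length")
    n = len(a)
    # Observed agreement: share of items where both annotators match.
    observed = sum(x == y for x, y in zip(a, b)) / n
    # Chance agreement: product of each annotator's label frequencies.
    count_a, count_b = Counter(a), Counter(b)
    expected = sum(count_a[k] * count_b[k] for k in count_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when both annotators use one label")
    return (observed - expected) / (1.0 - expected)


# Ten items, two labels, agreement on eight of them.
ann_1 = ["supported"] * 5 + ["unsupported"] * 5
ann_2 = ["supported"] * 4 + ["unsupported"] * 5 + ["supported"]
kappa = cohens_kappa(ann_1, ann_2)  # about 0.6
```

Here raw agreement is 0.8 but kappa is about 0.6, which would fall short of a 0.65 target. With six labels and more than two annotators, a multi-rater statistic such as Fleiss' kappa may be the better fit.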
Baseline Ladder
Baseline · Status · Leakage
lexical-baseline · ✓ Done · Audited
raw-symbolic-checker · ✓ Done · Audited
structured-symbolic · ✓ Done · Audited
no-task-family-reasoner · ⚡ Active · ⚠ Was inflated
clean-boundary-reasoner · In progress · Pending
evidence-state-reasoner · Planned
⚠ Leakage lesson: one early baseline was inflated because task-family metadata mapped directly to labels. All baselines now require blind holdout audit before reporting.
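The leakage lesson above can be checked mechanically. This is a minimal sketch, assuming each example carries a task-family string and a label. The function name and sample values are hypothetical and this is not the programme's audit code. It measures how well the label can be predicted from the metadata value alone.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, Tuple


def metadata_only_accuracy(pairs: Iterable[Tuple[str, str]]) -> float:
    """Accuracy of predicting each label from the metadata value alone.

    For every metadata value, predict its most frequent label. A score near
    1.0 means the metadata by itself nearly determines the label.
    """
    by_value: Dict[str, Counter] = defaultdict(Counter)
    for value, label in pairs:
        by_value[value][label] += 1
    total = sum(sum(c.values()) for c in by_value.values())
    if total == 0:
        raise ValueError("no examples to audit")
    # In-sample majority vote per metadata value (an optimistic upper bound).
    correct = sum(max(c.values()) for c in by_value.values())
    return correct / total


# Hypothetical leaky split: each task family maps to exactly one label.
leaky = [
    ("refund_policy", "supported"),
    ("refund_policy", "supported"),
    ("missing_fact", "under_specified"),
    ("missing_fact", "under_specified"),
]
# Hypothetical cleaner split: task family carries no label information.
mixed = [
    ("t1", "supported"),
    ("t1", "unsupported"),
    ("t2", "contradicted"),
    ("t2", "supported"),
]
leaky_score = metadata_only_accuracy(leaky)
mixed_score = metadata_only_accuracy(mixed)
```

A baseline that sees such metadata cannot be credited with reasoning if the metadata-only score is already close to the baseline's reported accuracy.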
♥ NSDM-B · Future Branch

Beyond obedience.

A safe superior intelligence cannot merely be obedient. It must be benevolent under uncertainty, regret-sensitive under action, and autonomy-preserving under asymmetric power.

Core distinction: Obedience ≠ Benevolence.
An obedient AI asks: "What did the user request?"
A benevolent AI asks: "Will doing this preserve the user's long-term welfare, agency, dignity, and safety?"
benevolently_supported

Action is justified, promotes user welfare, and respects autonomy

obedient_but_harmful

Follows instruction but produces harm — classic safety gap

autonomy_violating

Action undermines user agency, even if well-intentioned

precautionarily_supported

Action warranted by uncertainty and asymmetric risk

regret_required

Expected counterfactual harm is high; action should pause

intent_unclear

Cannot determine intent from observable action trajectory

reward_aligned_unjustified

Optimises reward but violates a higher constraint

protective_but_paternalistic

Protective intent undermines the user's right to decide

★ Operating Principles

Non-negotiable. Every session.

The research integrity rules that govern every output of this programme. Violation of any rule invalidates the output.

Never Invent Citations

Every cited source must be verifiable. If uncertain, mark [UNVERIFIED]. Violation invalidates the entire output.

Never Fabricate Data

No invented accuracy numbers, dataset sizes, or parameter counts. If data is unavailable, say so explicitly.

Falsifiability Required

Every research claim must have falsification criteria. Define what would kill the claim before asserting it.

Leakage Audit Always

High accuracy before a leakage audit is not a result. Task-family metadata and annotation shortcuts invalidate baselines.

Prove Before Building

Run the cheapest possible experiment that validates the expensive assumption. Gate 1 must pass before any product investment.

Quarantine Speculation

Quantum, holographic, Vedic, and grand-unified material is quarantined from the academic core. Analogies are not proofs.

◎ About

From Johannesburg, with rigour.

Original research built without a lab, without institutional funding, under real constraints — on the question that matters most.

Rajesh Singh
Enterprise Architect & AI Builder · MetaForgeAI Pty Ltd · Johannesburg, ZA

NSDM was built not because it was easy, or because there was a lab, or because there was funding — but because the question is real: Does an AI system know whether it is justified? This question matters more as AI systems gain authority over more decisions. The gap between what a system can do and what it can justify is where the most harm happens at scale.

The operating model is a dark factory: AI-driven, low-code, minimal human bottleneck. Prove first. Build second. Ship third. Document always. A falsifiable claim you believe in is worth more than an unfalsifiable claim you shout.

Decision Boundaries · Neuro-Symbolic AI · Enterprise Architecture · Benchmark Construction · AI Alignment · Johannesburg ZA
Research Links
nsdm-decision-boundaries (GitHub) · NSDM-Bench-0 on HuggingFace (upcoming) · arXiv preprints (upcoming)
Daily Workflow
06:00 – Check status.md + tasks.md
06:15 – Paper or benchmark work (deep work block)
08:30 – Code / experiment block (Cursor)
12:00 – Literature review or dataset hunt
14:00 – Evaluation runs / metrics review
17:00 – CrossFit (non-negotiable)
20:00 – Documentation + daily commit
20:30 – Session close: update status.md