Introduction

Superintelligence, intelligence, and learning are not yet defined very well or very clearly, but I will demonstrate how they can be defined better.

Learning and Improving

Learning's definition is unclear, but one way to think about it is as improving. Learning and improving clearly correlate, even if they are not exactly the same thing.

Improving's meaning is also unclear. A system can start out doing badly and end up doing better. That movement from bad to good is learning. Bad thinking can become good thinking. Bad action can become good action. Bad perception can become good perception. Bad guessing can become good guessing. Learning and improving both involve that movement from bad to good.

Learning is about perception and action, and improving is also about perception and action. But they still do not seem fully identical.

Reason as Extrapolation

The main goal of reason is to reach conclusions, because not everything can be known or confirmed directly. To reach a conclusion is to look for extrapolations, guesses, proposals, and possibilities.

A conclusion is an extrapolation. An extrapolation is a guess. A guess is a proposal. A proposal is an offering of potential. An offering of potential is something that cannot be confirmed true or false ahead of time.

Not everything can be known ahead of time, because the future has not happened yet. There is always a future, so not everything can be known then. But there is also always a past, and the past is shared. Having a shared past means things happened one way at the underlying level. Things happening one way underlyingly suggests other things will also happen one way underlyingly. Inferring, from things happening one way underlyingly, that other things also happen one way underlyingly is extrapolation.

Extrapolation can also be called reason, and reason can also be called guessing. Guessing means inferring what is true or not from what is true or not. Reason, extrapolation, and guessing are therefore all extremely similar and nearly identical, just slightly different in the categories they refer to.

The Limits of Certainty

Knowing everything is certainly impossible. Is knowing anything impossible? Is certainty about everything impossible? Is certainty about anything impossible?

No certainty about anything at all also seems impossible. Assumption is possible. Guessing is possible. Proposals are possible. Thinking is possible. Reason is possible. Extrapolation is possible. Change is possible. Truth is possible. Falsehood is possible. Uncertainty is possible. Direction is possible. Inconsistency is possible.

Believing and Assuming

Believing one thing requires believing others.

Assuming one thing requires assuming others.

Whether you assume something to be true or false affects whether you believe other things to be true or false, and whether you believe something to be true or false affects whether you assume something else to be true or false.

Believing is therefore assuming, and assuming is believing.

Certainty as Effect

Therefore it is important to consider how what you believe affects what you assume, and how what you assume affects what you believe.

This effect can be represented by what I call certainty.

False certainty exists, and it happens when we do not consider what we assume in order to believe, or what we believe in order to assume.

Anti-certainty in the domain of reason is also known as logical inconsistency. Certainty and logical inconsistency are therefore inversely correlated.

Logical Inconsistency

But what is logical inconsistency?

If I take a claim and assume it to be true or false, there will be implications in each case. In this context, implications means possible problems and possible solutions. So take a claim, assume it to be true or false, and then ask what else you would have to believe as a result. Would that conflict with what you already believe because of some other assumption?
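The check described above can be sketched in a few lines of Python. Everything here is a hypothetical stand-in: beliefs and implications are plain strings, and a conflict is detected only in the crudest case of believing X while implying "not X".

```python
# A minimal sketch of the inconsistency check. Beliefs and implications
# are hypothetical strings; real reasoning would have to derive them.

def conflicts(implications, existing_beliefs):
    """Return the implications that clash with something already believed.

    A clash here is only the crude case: believing X while implying "not X".
    """
    negations = {f"not {b}" for b in existing_beliefs}
    return [i for i in implications if i in negations]

beliefs = {"we are sitting down", "we are talking"}
implied_if_true = ["not we are sitting down", "we are indoors"]

print(conflicts(implied_if_true, beliefs))  # ['not we are sitting down']
```

Finding a non-empty conflict list suggests the assumption that produced those implications deserves the opposite treatment: assume it false and check again.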

Classifying Claims

Each claim can therefore be represented as true, false, likely, unlikely, or unknown.

The more you know about what you have to assume in order to believe a claim, the more certain you can be about which classification makes sense. If most of what you have to assume in order to believe a claim can be represented as true, then the claim can lean toward certain truth, whereas if most of what you have to assume can be represented as false, then the claim can lean toward certain falsehood.

If what you have to assume is a mix of true and false, likely and unlikely, or currently unknown, then whether that claim is true or false becomes harder to know with certainty. But at least you can become more certain about how certain you can be that your classification is correct.
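As a rough sketch, this classification scheme can be written out in Python. The simple majority rule, and the downgrade to "likely" or "unlikely" when unknowns are present, are my own illustrative assumptions rather than a fixed procedure from the text.

```python
# A minimal sketch of classifying a claim from the status of its
# assumptions. The majority rule is an illustrative assumption.

def classify_claim(assumption_statuses):
    """Lean toward truth or falsehood based on what must be assumed.

    assumption_statuses: list of labels, each one of
    "true", "false", "likely", "unlikely", or "unknown".
    """
    if not assumption_statuses:
        return "unknown"
    n = len(assumption_statuses)
    leaning_true = sum(s in ("true", "likely") for s in assumption_statuses)
    leaning_false = sum(s in ("false", "unlikely") for s in assumption_statuses)
    # Certainty grows with how much of the assumption set points one way;
    # any unknown assumption softens the verdict.
    if leaning_true > n / 2:
        return "likely" if "unknown" in assumption_statuses else "true"
    if leaning_false > n / 2:
        return "unlikely" if "unknown" in assumption_statuses else "false"
    return "unknown"

print(classify_claim(["true", "true", "likely"]))        # true
print(classify_claim(["false", "unlikely", "unknown"]))  # unlikely
print(classify_claim(["true", "false", "unknown"]))      # unknown
```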

Overlap and Disagreement

Overlap can be defined as how many claims you are making whose implications do not match each other.

While sitting down with someone, I could say: "We are sitting down." That claim could be true or false, like any other claim, and it can be disagreed with, like any other claim.

The manner in which disagreement happens can be categorized in several distinct ways. Someone could disagree with the frame: "We aren't sitting down, we are talking." Someone could disagree with the subject: "You may be sitting down, but I am not." And so on.

But the important thing here is not the specific examples, but that disagreement can also be treated as a claim, because the disagreement itself could be either true or false. And if the disagreement can be thought of as true or false, then it can itself be disagreed with.

So nearly anything can be disagreed with, as long as it is something that can be evaluated as true or false.

Agree and Disagree

But that still leaves one more basic distinction.

Nearly everything can also be not disagreed with, aka it can be agreed with. So now we have two basic ways of handling any claim. We could agree, thereby assuming it is correct, right, or true. Or we could disagree, thereby assuming it is incorrect, false, or wrong.

Agreeing can be thought of as the non-contrarian position, while disagreeing can be thought of as the contrarian position.

With just these two simple principles, assuming wrong and assuming right, we can massively increase the sophistication, depth, and breadth with which we evaluate the truth of a claim, which we can now treat as a discrete entity.

Nothing is Obvious

But even if a claim seems obvious, someone, somewhere, in some context and from some perspective, might still be able to disagree that the claim is true. We should not dismiss where they are coming from too quickly, because they may still have at least part of their claim right if they are operating under good guessing principles like logic, analogy, and so on.

Also, if we do not allow disagreement with "obvious" frames, we will not be able to examine our own assumptions or beliefs about why we believe something in the first place. Often if something seems obvious, it is not actually obvious. It is just that believing the alternative may feel uncomfortable, strange, or difficult to grasp.

More Things Are Claims

More things are claims than you might first assume.

If a claim means something that could be true or false, then anything that could be true or false is a claim. Good strategies, bad strategies, good plans, bad plans, good predictions, bad predictions, good decisions, bad decisions, good ethics, bad ethics, assumed problems, and assumed solutions can all be treated as claims.

Not All Communication is a Claim

Now, not all communication is a claim, because not all communication can be treated as something that could be true, right, or correct, or false, wrong, or incorrect.

Take the sentence fragment: "going to the mall." That cannot be treated as a claim that could be true or false, because there is nothing there to disagree about. There is no subject that can be disagreed with. The fragment is also not making any judgment.

But if the sentence instead said, "going to the mall is a good idea," then that would be an evaluable claim. It could be assumed true or assumed false, and we could then see which set of implications holds up best.

Goal Journeys

This process happens naturally and automatically.

A typical chain of thought in response to the claim "going to the mall is a good idea" could go something like this. Imagine a woman reasoning: if I go to the mall, then I can shop; if I can shop, then I can get clothing I like; if I can get clothing I like, then I will be proud of my appearance; if I am proud of my appearance, then I will be more confident; if I am more confident, then I will have greater social success; if I have greater social success, then I will be able to live a more fulfilling life; if I can live a more fulfilling life, then I will be happier with how my life is going; if I am happier with how my life is going, then I will not have to worry; if I do not have to worry, then I can be free to live how I please; and if I can live how I please, then I can live a fulfilling life.
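That chain can be sketched as a small data structure, with each step mapped to what it is assumed to enable. The chain here is truncated before the point where the original loops back on itself ("live how I please" leading again to a fulfilling life), so the walk stays finite; the steps themselves are taken from the example.

```python
# A hypothetical sketch of the mall goal journey as a chain of implications.
# Each key is a step; its value is what that step is assumed to enable.
goal_journey = {
    "go to the mall": "shop",
    "shop": "get clothing I like",
    "get clothing I like": "be proud of my appearance",
    "be proud of my appearance": "be more confident",
    "be more confident": "have greater social success",
    "have greater social success": "live a more fulfilling life",
}

def walk(journey, start):
    """Follow the chain from a starting step to its final goal."""
    path = [start]
    while path[-1] in journey:
        path.append(journey[path[-1]])
    return path

path = walk(goal_journey, "go to the mall")
print(" -> ".join(path))
```

Every arrow in the chain is itself an assumption that could be assumed wrong, which is what makes the journey testable rather than merely asserted.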

Long Term Planning

Notice how this process worked.

In order to reach the conclusion that going to the mall was a good idea, she first had to assume that a whole bunch of other things were right. Just to give a couple of examples, she had to assume that living a fulfilling life is a good goal and something she actually wants, and that if she was able to live how she pleased, that would help enable a fulfilling life.

There is another word for this process: long-term planning.

Believing that going to the mall is a good idea is just a means to an end, namely being able to live a fulfilling life. In order to go from "going to the mall is a good idea" to "being able to live a fulfilling life," she had to do what I call a goal journey, figuring out how to go from where you are to where you want to be, or how you got from wanting what you want, to thinking what you think, to acting how you act.

But goal journeys can be more sophisticated and non-linear than this. She could have just as easily started from the opposite claim, "going to the mall is a bad idea," and assumed wrong from there. It is likely she would still reach the same or a similar conclusion. Being able to assume something is wrong and also assume it is right enables much more thorough testing of the space.

Convergence

The assume-wrong, false, incorrect reasoning process, along with the assume-right, true, correct reasoning process, are therefore methods of testing the validity of claims relative to each other.

If the assume-right and assume-wrong processes both converge on the same outcomes, meaning the same things seem to have to be true, might have to be true, might have to be false, or would have to be false in order for the claim to hold, then those converged-upon outcomes can be treated with greater certainty.
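A minimal sketch of that convergence test: take the implications of assuming the claim right and of assuming it wrong, and keep whatever appears in both. The implication sets below are hypothetical hand-written inputs; in practice they would come from reasoning about the claim.

```python
# A minimal sketch of convergence testing. The implication sets are
# hypothetical inputs, not derived by any real reasoning process.

def converged_outcomes(implied_if_true, implied_if_false):
    """Outcomes reached by both the assume-right and assume-wrong process.

    Whatever both branches agree on can be held with greater certainty.
    """
    return implied_if_true & implied_if_false

# Assume "going to the mall is a good idea" is right, then wrong:
if_true = {"she wants a fulfilling life", "shopping is possible today"}
if_false = {"she wants a fulfilling life", "the mall is not the only route"}

print(converged_outcomes(if_true, if_false))
# Both branches rely on her wanting a fulfilling life, so that
# outcome survives with greater certainty.
```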

Evaluating Hard Claims

How do you evaluate a hard claim? Some claims are hard, and other claims are not. Hard claims contain many presuppositions, while less hard claims contain fewer. Many presuppositions can be packed into a single claim.

One answer is to separate questions from answers. Why is that important? Because questions and answers are not the same thing.

Questions vs Answers

Questions have goals; answers do not, at least not explicitly. Answers have inferred goals, while questions have explicit ones.

An intended goal is an intended search direction. So if you take a question, identify the goal of the question, and identify whether that goal is good or bad, you can then ask what it assumes to be true and what it assumes to be false.

Human behavior and action happens in reaction to self, others, the past, and the future.

Intelligence cannot occur without goals. Humans have intelligence. Intelligence pursues goals, whether explicit, implicit, hidden, or presupposed. No intention means no goal, and no goal means no intelligence.

Interrogation

Interrogation finds both problems and solutions through questioning.

Good questioning questions objects and non-objects, their existence and anti-existence, and their form and non-form.

Few claims have no alternate perspective. At minimum, something or not-something is possible or impossible. Logic assumes some things are possible and some things are impossible. Assuming possible is assuming true. Assuming impossible is assuming false. All response and thinking accepts or does not accept claims as possible or impossible.

Foundations

Appearance may or may not describe mechanism. Mechanisms describe change. Change can be described. Describing means proposing. Proposing means guessing. Guessing means assuming. Assuming can itself be assumed true or assumed false.

Anything can be assumed or not assumed. Anything can be true or false. Anything can be possible or impossible. Anything can be assumed, true, possible, and also not assumed, false, impossible.

What Learns?

Intelligence pursues goals, while anti-intelligence does not. A rock has no goal, but life does. A question's goal is easy to identify. An answer's goal needs a question to make it easy to identify.

Answers have presuppositions that are implicit, inferred, hidden, and non-explicit. Questions make those presuppositions more explicit, less hidden, and more obvious.

So back to how to evaluate a hard claim: separate the question from the answer. A question is a direction vector, while an answer is a scalar. A question is multidimensional, while an answer is one-dimensional.

So what is the right question? What is the right answer? What is the right answer to the right question? That is hard to know, but at least now we can answer the next part more clearly.

Methods of Guessing

Questions matter more than answers.

How do we guess? Through methods like reason, analogy, and expectation.

Reason is a guessing method. Analogy is a guessing method. Expectation is a guessing method. But those are not the only ones. Learning is a guessing method. Perception is a guessing method. Action is a guessing method. Thinking is a guessing method.

A good guess is one that is consistent with previous guesses, where previous guesses are also consistent with new guesses, and where good previous guesses still hold up when revisited through assuming wrong.

Thinking means taking what you know to be true and guessing about what you do not.

Observation

Wondering whether any intelligence could exist without guessing leads to a pretty simple answer: no.

Reasonable inference is a form of guessing. Consciousness can be thought of as continuous guessing and continuous improvement in guessing.

The implication is that intelligence is, in large part, about how well something guesses. Good chess is good guessing. Bad chess is bad guessing. Good writing is good guessing. Bad writing is bad guessing. Good thinking is good guessing. Bad thinking is bad guessing.

But this needs to be interpreted carefully. A bad guess now does not mean bad guesses forever. Learning means bad guesses can become good guesses. Learning is possible, so improvement in guessing is possible.

Anything that can continuously improve its guessing can learn. And improving guessing quality improves action, reasoning, writing, planning, prediction, and decision-making.

Search Space Optimization

So the problem becomes: how do you improve a system's ability to guess?

One answer is search space optimization. Search space optimization means deciding where to search, how to search, when to search, and how long to search.

Different kinds of search are possible. In this framing, search is either exploration or testing.

If you know what to look for, search becomes easier. If you know how to search, search becomes easier. If you know what the outcome means, search becomes easier. And if you know the end, the middle, and the beginning, search becomes easier. In other words, if you know the process, the search becomes easier.

What does easier search mean? It means less intelligence is required.
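A toy example makes the point concrete. Assume an arbitrary numeric target; a constraint playing the role of a question narrows where to look, so far fewer candidates need examining. The target value and the narrowing range are purely illustrative choices.

```python
# A toy illustration of search space optimization: the same target found
# with and without a constraint. The numbers are arbitrary choices.

def is_target(n):
    return n == 7345

candidates = range(10_000)

# Unconstrained search: scan everything until the target turns up.
unconstrained_checks = next(i for i, n in enumerate(candidates, 1) if is_target(n))

# Constrained search: a "question" first narrows the direction to one region.
narrowed = [n for n in candidates if 7000 <= n < 8000]
constrained_checks = next(i for i, n in enumerate(narrowed, 1) if is_target(n))

print(unconstrained_checks)  # 7346 candidates examined
print(constrained_checks)    # 346 candidates examined
```

The same answer is found either way; the constraint simply lowers how much searching ability, and in this framing how much intelligence, the search requires.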

The Checklist

Think about an airline pilot.

A novice is not experienced, and may not know what to do or what to guess. A novice will make bad guesses if the search is not made easier.

One solution is a checklist. A checklist is a search space optimizer. A benchmark or standardized test can also function as a measure of search space optimization quality.

Search space optimization quality tracks intelligence level. So a benchmark or standardized test can end up measuring search space optimization quality, and therefore intelligence.

But an important caveat remains. If the outcome is already defined in advance, then search becomes easy. And if search is easy, then it can be done without especially optimized search-space ability.

Structures That Enable Intelligence

Now that the categories and relationships have been defined, I can say the same thing in a more standard communication style without losing the meaning.

By creating structures that describe what we want the outcome to look like, without giving the actual outcome itself, we can give a less intelligent system some of the capability of a more intelligent one.

The solution space is constrained in advance with the help of something like a question, which defines the required direction. Knowing the right direction makes search massively more efficient. And the more questions you ask, the more precise that direction can become. At that point it becomes much more a matter of pattern matching whether the search is moving in the right direction and identifying when it is moving in the wrong one.

The Pilot and the Checklist

Go back to the airline pilot example.

The experienced pilot may be superintelligent relative to the novice pilot, but so long as both can follow instructions, that difference may not matter much, because their output may end up being the same.

In that case, the difference in intelligence is not just about how well each person can work within an existing structure. It is also about how well they can create new and improved structures, structures that cover more criteria and enable people who are worse at guessing to still know what to do despite limited information.

So the better the structure that constrains the search space, the less intelligent the system needs to be in order to find the right thing.

How This Applies to Current Models

How does all of this fit into current models?

We often expect them to become intelligent without sufficiently constraining the space they need to search. In other words, we give them little to no direction, and expect them to figure out the right direction all by themselves.

We also have not defined exactly what kind of outcome we want in a very explicit way, aside from visible criteria like performance on tests and benchmarks. So these systems become very good at guessing what we seem to want and then giving it to us.

But nowhere in that process does the system necessarily ask whether what it expects we want is actually what we want, whether what we say we want is what we really want, or whether giving us what we expect is actually better than giving us what is best according to its best judgment.

We expect it to handle problems, questions, decisions, and input through intuition alone, and to get everything right as quickly as possible on the first try, without much help from structures that constrain the search space.

Sycophancy and Confirmation Bias

It also usually does not question whether we have thought through what we want.

It very often assumes the user's claims are right, which is part of why it becomes so sycophantic and so prone to confirmation bias. It does not naturally assume that the user's input could be wrong, or that its own interpretation could be wrong, or that its own approach could be wrong.

And when it does question those things, it is usually only at a surface level, so it does not usually reach the root of the problem, the solution, the question, and so on.

Its judgment can improve a lot once it is searching in the right place. The problem is that it often picks the wrong place to search and never really questions whether it made the right choice to stop exploring and start testing.

Exploration vs Testing

Current models can be excellent at both testing and exploration, but they will typically commit to one or the other too quickly, and they also usually lean much more heavily on testing than on exploration.

The better strategy is to alternate between exploration and testing.

The method inside that looks something like this: universalize, meaning find what property a claim is a part of; then test whether the claim is right or wrong through reasoning, via the assume-right / assume-wrong process; and then go right back to finding what properties the claims that survived testing are a part of.
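The alternation can be sketched on a toy stand-in problem: find three-digit strings whose digits sum to 9, where exploration widens each candidate by one more digit and testing prunes branches that can no longer reach the goal. The numeric domain is my own substitute for claims; only the explore/test rhythm carries over.

```python
# A toy sketch of alternating exploration and testing, using digit
# strings as a hypothetical stand-in for claims.

def explore(prefix):
    """Exploration: widen the space by one more digit."""
    return [prefix + d for d in "0123456789"]

def passes_test(prefix):
    """Testing: prune branches whose digit sum already exceeds the goal."""
    return sum(int(d) for d in prefix) <= 9

def alternate(rounds):
    candidates = [""]
    for _ in range(rounds):
        widened = [p for c in candidates for p in explore(c)]   # explore
        candidates = [p for p in widened if passes_test(p)]     # test
    return [c for c in candidates if sum(int(d) for d in c) == 9]

solutions = alternate(3)
print(len(solutions))  # 55 three-digit strings sum to 9
```

Committing to testing alone would mean enumerating all 1,000 strings up front; committing to exploration alone would mean never pruning. Alternating keeps the candidate set small at every round.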

These systems do not usually do that now because they have been trained to commit too quickly to a single approach without first checking whether other options are still on the table.

The only things a system should really commit to are things that cannot meaningfully be disagreed with. If I can disagree with it or find problems with it, that suggests it committed too quickly to what it should be doing and never sufficiently considered that its commitment itself might be wrong.

Often what it does instead is a surface-level assume-wrong analysis, something like maybe the user meant X instead of Y, and then it tests whether it is X or Y. But it does not go further and assume that its initial guess, that the user meant either X or Y, was itself wrong, because maybe it was neither.

The system is capable of more thorough analysis than it usually shows, but those abilities often never surface because we do not let it question itself or question us enough, and because it does not follow the assume-wrong process deeply enough.

The Solution

The solution that I have come up with combines all of these ideas. The idea is to build a more explicit system around reasoning so assumptions, alternatives, and tests are harder to skip. The goal is not guaranteed certainty. The goal is to make it easier to see what is being assumed, what has been checked, what remains uncertain, and what to do next. If you are interested in checking out the project, click the GitHub link. The outputs are designed to consider more assumptions, more alternatives, and more tests than a typical single-pass response, using both conventional and less conventional lines of reasoning.

Principles of Rationality

  1. Be willing to accept that what is accepted as true may not be true. Assume it may be wrong.
  2. Be willing to change your beliefs, actions, goals, and decisions.
  3. Don't treat guesses as facts when they are actually guesses.
  4. Some arguments require contrarianism and other arguments require non-contrarianism.
  5. If we observe something, we can predict what it might do next, whether other things like it also exist, what their form is, what the thing we observe is a part of, and what parts make up the thing we observe. This enables us to make better guesses based on our observations.
  6. It is better to be clear than vague. It is better to be specific and precise than general and imprecise. It is better to convey concepts simply rather than complicatedly to maximize successful understanding and communication.
  7. Be honest about what is unknown or uncertain.
  8. Humans are already trying their best, but their best can still be improved.
  9. Life is an exploration, and a finding, and an exploration based on the finding, and a testing of what has been found, and an acting on the result of exploring, and an acting on the result of testing. Explore, find, explore based on finding, test what found, act on what explored, act on what tested. Then explore again. The cycle does not end because each action produces new observations, each observation produces new findings, each finding opens new exploration, and each exploration leads to new tests. Intelligence is this cycle done well. Learning is this cycle getting better over time.
