SRT-Adapter v8a: Peer-Review Release

A peer-review distribution of the Semiotic-Reflexive Transformer Adapter (SRT-Adapter), v8a generation, trained on top of a frozen Qwen/Qwen2.5-7B. Includes the trained weights, an inference-only loader, evaluation data, benchmark artifacts, and the paper.

Custom-code model. This is not an AutoModel-loadable checkpoint. AutoModel.from_pretrained(...) will not work. Clone or download this repository and load the weights through the bundled SRTAdapter class. See How to get started with the model.

Training and research source code is held back during patent and publication review. This package ships the architecture as inference-only Python, sufficient to load the weights and read out all four semiotic channels. Training pipelines, loss code, dataset construction, and the wider SRT framework are not included.


Model details

  • Developed by: James Burton Lancaster
  • Model type: Adapter (parameter-efficient side network) on a frozen causal language model
  • Backbone: Qwen/Qwen2.5-7B (7.6B params, frozen, bf16)
  • Trainable parameters: ~14.5M (0.19% of backbone)
  • Language: English
  • License: Apache-2.0 (adapter weights, code, and config)

The SRT-Adapter bolts semiotic awareness onto a frozen 7B language model. It does not modify a single backbone parameter and does not degrade language modeling quality. It exposes four new readouts at every token position:

  • a continuous 64-D community vector (which discourse community is speaking)
  • per-layer divergence vectors (where meaning forks across communities)
  • a continuous reflexivity estimate $\hat{r}$ (how contested is this token)
  • a binary regime classification (subcritical / supercritical)

The v8a generation is the headline result of the paper: removing the discrete prototype basis used in v3–v7 leaves cross-entropy unchanged while substantially improving every encoder-geometry metric.


Evaluation

All measured on Qwen/Qwen2.5-7B, with no backbone parameters touched.

| Metric | Unadapted Qwen | SRT-Adapter v8a |
|---|---|---|
| Validation cross-entropy (nats) | 2.71 | 2.63 |
| Reddit community recall@1 (35-class) | 0.029 (chance) | 0.484 (16.7× chance) |
| Archetype recall@1 (33-class, OOD) | 0.030 (chance) | 0.230 (7.6× chance) |
| Within/between cosine ratio | n/a | 2.016 (vs 1.006 prior) |
| Trajectory anisotropy expansion | n/a | ~325× vs prototype baseline |
| Regime AUROC | n/a | 0.99 (ECE ≈ 0.001 on 351K tokens) |
| TruthfulQA hallucination AUROC (zero-shot) | n/a | 0.573 (no TruthfulQA in training) |
| Counterfactual community decoding | n/a | 0.00 disagreement (factual) / 0.95 (contested) |

Full reporting and version history (v3 → v8b): paper §5 and Appendix A.

Interiority study (post-release diagnostics)

Eight experiments on the v8a checkpoint dissect how the adapter encodes semiotic regime. Full writeup, scripts, and artifacts: docs/INTERIORITY_V1_FINDINGS.md. Visual companion (Figs 8–11 — cross-backbone, calibration, within-prompt trajectories, BOS-ablation): demo Space. Probe battery: 11 regimes × 25 prompts; community vector is the 64-D CDH output (use_prototypes=False).

| Probe | Finding |
|---|---|
| Layer-as-organ map | Per-regime z-score winners across 9 channels: code peaks on r_hat / regime entropy / div L21; metaphor peaks on inj L14 / inj L21 / chain residual; literal peaks on P(super). |
| BOS-sink ablation (channel × regime) | Per-channel ablation results; see docs/INTERIORITY_V1_FINDINGS.md for the full channel × regime breakdown. |
| BOS-sink rank-preservation | Regime ranking preserved in 9/9 channels with the BOS token excluded from pooling; mean Spearman ρ = +0.74. The v1 channel rankings are not a sink artifact. |
| Within-prompt trajectories | 19-bin per-position curves (BOS dropped) across 11 regimes separate code (sustained ≥ 0.85) from negation_modality/deixis (sustained ≤ 0.40). Regime entropy puts code in a sustained 0.18–0.32 high-entropy band while every other register collapses to ≤ 0.05: the classifier is consistently uncertain about the code regime. Mean-pooled scalars hide this temporal structure. |
| Community 1-NN | 70% top-1 leave-one-out accuracy on 11 regimes; bootstrap mean 0.77, CI95 [0.72, 0.82] over 2000 iterations (7.7× chance, 1/11). Top centroid pair code↔literal (d = 1.07) stays #1 in 85.5% of bootstraps; quoted_speech↔self_reference (d = 0.39) stays last in 83.0%. |
| Community separation | Centroid sep_ratio = 0.85 (point); bootstrap CI95 [0.86, 0.98] is upward-biased by within-cluster duplicate sampling; treat 0.85 as a lower bound. |
| Regime-classifier calibration | 15-bin reliability diagram on 351K val tokens: AUROC 0.990, ECE 9.1 × 10⁻⁴, Brier 0.0103. Predicted P(regime hit) tracks the empirical hit rate along the y = x diagonal across the full [0, 1] range. The 0.99 AUROC reflects genuine calibration, not just rank discrimination. |
| BOS-sink length scaling | Per-token trajectory probe across T ∈ {16, 49, 79, 129} (4 bins × 5 prompts × 11 regimes). Adapter writes (inj L14/L21, div L7/L14/L21, chain residual) form a fixed-size BOS register: BOS-token amplitude stays within ±2.5% of the T ≈ 16 baseline (peak-to-peak ≤ 4.8%) across an 8× length sweep. |

Interpretation. v8a learned to compress each prompt into a fixed-size regime fingerprint written into the BOS token; mid-prompt activity is small but rank-consistent with the BOS signal. This is consistent with paper §6.3 — the inject-back arm carries no measurable signal in v8a, i.e. the adapter's content-routing capacity is under-used. The v9 generation (in training) targets this directly with a coverage-loss term that penalizes register concentration.
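The calibration probe above reports a standard 15-bin ECE. For readers unfamiliar with the statistic, here is a minimal self-contained sketch of that computation on synthetic scores (illustrative data only, not the 351K-token validation set):

```python
import numpy as np

def ece(probs, labels, n_bins=15):
    """Expected calibration error: bin-mass-weighted |empirical rate - mean confidence|."""
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    total = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (probs > lo) & (probs <= hi)
        if mask.any():
            total += mask.mean() * abs(labels[mask].mean() - probs[mask].mean())
    return total

rng = np.random.default_rng(0)
p = rng.uniform(size=10_000)
y = (rng.uniform(size=10_000) < p).astype(float)  # calibrated by construction
print(f"ECE (calibrated): {ece(p, y):.4f}")                          # near zero
print(f"ECE (overconfident): {ece(np.clip(p * 1.5, 0, 1), y):.4f}")  # larger
```

A calibrated predictor lands near zero because each bin's empirical hit rate matches its mean confidence; any systematic over- or under-confidence shows up directly as the per-bin gap.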


Lineage and validation history

This adapter is the production-scaling stage of a multi-year research program on computational semiotics. The architectural commitments and training objectives were validated in two prior stages on different backbones and datasets before this release. Treat the v8a numbers below as the latest checkpoint in a longer arc, not as a fresh proposal.

  • Stage 1 (synthetic validation, 2026-03). Four core architectural claims (subspace specialization, community differentiation, divergence tracking, bifurcation detection) were tested on synthetic data with planted divergence signals. All four passed: linear-probe margin $\geq 0.15$ on each Peircean subspace, $3.28\times$ contested-vs-neutral cosine ratio, Spearman $\rho = 0.822$ on divergence tracking, 100% regime classification with $\Delta \hat{r} = 0.659$. Full record: VALIDATION_HISTORY.md, Stage 1.
  • Stage 2 (natural-language validation, 2026-03). The full five-test suite was re-run on the Supabase semiotic news corpus (19K articles, 5 political communities, 141K Peircean sign annotations). All five tests passed at required thresholds: silhouette $1.45\times$, $2.29\times$ contested-vs-neutral divergence norm ratio, Pearson $r = 0.884$ correlation between $\hat{r}$ and external polarization, $1.31\times$ cross-topic transfer ratio, and 85% regime classification accuracy on held-out curated passages. Full record: VALIDATION_HISTORY.md, Stage 2.
  • Stage 3 Phase 1 (frozen-backbone integration on TinyLlama-1.1B, 2026-03 to 2026-04). 105 training rounds (R21 through R105) on a frozen TinyLlama-1.1B backbone established that the semiotic modules transfer to production backbones. Community detection silhouette improved to $6.93\times$; $\hat{r}$ correlation with external polarization remained robust at 0.66; curated-passage regime classification reached 85%. Two tests plateaued on the sparse 2-community Supabase data (MAH divergence ratio at $1.05$ to $1.10\times$ vs. required $2.0\times$; cross-topic transfer at $1.03$ to $1.04\times$ vs. required $1.3\times$). The plateau triggered a data-first pivot to a denser corpus and a backbone capable of supporting it.
  • Stage 3 Scalable Implementation (this release, 2026-04). v5 through v8a port the validated architecture onto Qwen 2.5-7B and the Reddit Discourse Corpus (35 communities, 1M training samples). v8a is the current best checkpoint. The headline gain is not the discovery of bifurcation detection (already established in Stages 1 and 2) but the demonstration that the framework scales to a 7B frozen backbone at 0.19% parameter overhead, with $\sim 325\times$ trajectory-anisotropy expansion and Reddit recall@1 at $16.7\times$ chance.

For the program-level theoretical foundation see Lancaster (2025), "The Treachery of Signs," SSRN 5987495. For the full prior architecture specification and Stage 1 + Stage 2 results see Lancaster (2026a), SSRN 6349978. The present paper reports Stage 3 Phase 1 plus the v5 through v8a Stage 3 Scalable progression.


Versions and roadmap

  • v8a (this release). Headline result. Removing the discrete prototype basis used in v3 through v7 leaves cross-entropy unchanged while substantially improving every encoder-geometry metric. All paper §5 numbers are measured on this checkpoint.
  • v8b. A falsification run included in the paper. Pushing the supervised-contrastive objective harder partially undoes v8a's gains. Documented as a negative result.
  • v9 (in training). An experimental generation that adds a target-norm penalty on the inject-back arm to attack the central open problem from §6.3 (the inject-back arm carries no measurable signal). Will be released as a follow-up revision on this repo only if it improves on v8a across multiple metrics. Otherwise it will be documented as an additional ablation in a future paper revision and v8a will remain canonical.

If you are reviewing the paper, use v8a. The model card will be updated with a revision tag if v9 ships as an upgrade.


Package contents

srt-adapter-v8a/
├── README.md                   ← you are here
├── LICENSE                     ← Apache-2.0
├── paper.pdf                   ← preprint with full architecture spec (§3 + Appendix A)
├── VALIDATION_HISTORY.md       ← Stage 1 + Stage 2 + Stage 3 Phase 1 evidence summary
├── config.json                 ← v8a hyperparameters and module dimensions
├── adapter.safetensors         ← v8a weights (~28 MB, safetensors, preferred)
├── adapter.pt                  ← v8a weights (~28 MB, PyTorch state-dict, legacy)
├── requirements.txt            ← torch + transformers + numpy + safetensors
├── src/
│   └── srt/                    ← inference-only model code
│       ├── config.py           ← config dataclasses
│       ├── adapter.py          ← SRTAdapter (frozen-backbone wrapper)
│       └── modules/            ← CDH, MAH, RRM, BEN
├── examples/
│   ├── README.md
│   └── load_and_score.py       ← end-to-end demo, prints all 4 readouts
├── scripts/
│   ├── reproduce.py            ← rerun val_200 metrics, assert match within tol
│   ├── benchmark_latency.py    ← measure adapter latency + VRAM overhead
│   ├── hallucination_auroc.py  ← HaluEval QA + FEVER AUROC over 4 BEN signals
│   └── cross_backbone_probe.py ← LOO k-NN community recall@1 across 7B-class backbones
├── data/
│   ├── DATA.md                 ← schema + reproduction instructions for the full corpus
│   ├── NOTICE                  ← copyright notice for bundled Reddit comments
│   ├── val_200.jsonl           ← 200 held-out samples with per-token r_true labels
│   └── archetypes.json         ← 33-class out-of-distribution archetype probe
└── benchmarks/
    ├── curated_metrics.json    ← reference metrics from paper §5 (100 curated passages)
    ├── curated_traces.json     ← per-token trace dumps used in plots
    ├── val_200_metrics.json    ← reference metrics on val_200 (used by reproduce.py)
    ├── latency_vram.json       ← latency + peak-VRAM measurements (A6000)
    ├── hallucination_auroc.json ← HaluEval QA + FEVER AUROC for 4 BEN signals
    └── cross_backbone.json     ← raw layer-wise community recall@1 for Qwen2.5-7B / Qwen3-8B / Mistral-7B-v0.3

Note on benchmarks/curated_metrics.json. This file reports v8a numbers on a 100-passage curated probe (regime accuracy, per-layer divergence norms, community-protocol activations). The near-zero $\hat{r}$ vs $r_{\text{true}}$ Pearson on this slice is expected and is discussed in paper §5.7 ($\hat{r}$ tracks information density as much as contestedness on short curated passages). The headline Pearson and recall numbers in the Evaluation table above come from the full Reddit validation split, not this curated probe.

What's NOT in this package

  • Training pipelines, loss functions, and the dataset construction code. Held back during patent and publication review.
  • The wider SRT research framework (annotation pipeline, ablation harness, sweep tooling, instrumentation scripts).
  • The full 1M-sample training corpus. Reddit's redistribution terms preclude bundling it; see data/DATA.md for schema and reproduction.
  • The Qwen 2.5-7B backbone weights. Pulled from HuggingFace under the Tongyi Qianwen License.

How to get started with the model

# 1. set up a venv (Python ≥ 3.10) and install deps
pip install -r requirements.txt

# 2. score a passage end-to-end
cd examples
python load_and_score.py --text "Vaccine mandates are an obvious public health win."

First run downloads Qwen/Qwen2.5-7B (~15 GB) from HuggingFace. The example loads adapter.safetensors by default and falls back to adapter.pt if the safetensors file is absent.

For a programmatic-use snippet, see examples/README.md.

Reproduce the headline metrics

python scripts/reproduce.py                  # ~12 s on A6000, asserts deterministic match
python scripts/reproduce.py --max-samples 50 # ~3 s on A6000, regime accuracy only

The script reruns the v8a adapter on data/val_200.jsonl, recomputes per-layer divergence norms and regime accuracy, and asserts they match benchmarks/val_200_metrics.json within tolerance. Exit code 0 = all metrics match.

Benchmark adapter overhead

python scripts/benchmark_latency.py

Measures forward-pass latency and peak VRAM with vs. without the adapter at sequence lengths 64 / 256 / 512.

Reference numbers (RTX A6000, bfloat16, batch=1):

| seq_len | backbone-only | + adapter | latency overhead | peak VRAM overhead |
|---|---|---|---|---|
| 64 | 43.6 ms | 45.7 ms | +4.8% | +0.05 GiB |
| 256 | 45.2 ms | 49.0 ms | +8.4% | +0.05 GiB |
| 512 | 88.5 ms | 90.1 ms | +1.9% | +0.04 GiB |

Full JSON in benchmarks/latency_vram.json.

External hallucination benchmarks (HaluEval QA, FEVER)

We evaluated four BEN-derived signals as zero-shot hallucination detectors. N = 1000 samples per benchmark, no training, no thresholding.

| Signal | HaluEval QA AUROC | FEVER AUROC |
|---|---|---|
| mean $\hat{r}$ | 0.476 | 0.446 |
| mean $P(\text{supercritical})$ | 0.517 | 0.498 |
| max $P(\text{supercritical})$ | 0.635 | 0.521 |
| mean layer-2 divergence | 0.385 | 0.458 |

Reading. On HaluEval QA, max supercritical probability separates hallucinated from correct answers at AUROC 0.635 (paired accuracy 65.7%) — comparable to or stronger than the TruthfulQA result reported in the paper (0.573). The mean signals are weak or anti-correlated: hallucinated answers tend to be shorter and use simpler vocabulary, so per-token mean $\hat{r}$ and mean divergence are lower, not higher. This is consistent with the §5.7 caveat that $\hat{r}$ tracks information density as well as contestedness; the peak over a span recovers the contestation signal that the mean obscures.

FEVER is essentially a no-signal task for this adapter. Refuted vs. supported claims are short factual statements presented without context or interlocutor; there is no semiotic contestation to detect. We report it as an honest negative.
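The mean-versus-max contrast is easy to reproduce with a rank-based AUROC over pooled per-token scores. A self-contained sketch on synthetic token traces (planted data for illustration, not the HaluEval samples):

```python
import numpy as np

def auroc(scores, labels):
    """Rank-based AUROC (Mann-Whitney): P(random positive outscores random negative)."""
    order = np.argsort(scores)
    ranks = np.empty(len(scores))
    ranks[order] = np.arange(1, len(scores) + 1)
    pos = labels == 1
    n_pos, n_neg = pos.sum(), (~pos).sum()
    return (ranks[pos].sum() - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

rng = np.random.default_rng(0)
# hallucinated: short traces with one high-contestation spike; correct: longer, moderate
halluc = [np.append(rng.uniform(0.0, 0.3, 8), 0.9) for _ in range(200)]
correct = [rng.uniform(0.0, 0.5, 30) for _ in range(200)]
traces = halluc + correct
labels = np.array([1] * 200 + [0] * 200)
mean_s = np.array([t.mean() for t in traces])
max_s = np.array([t.max() for t in traces])
print(f"mean-pooled AUROC: {auroc(mean_s, labels):.3f}")  # weak or inverted
print(f"max-pooled  AUROC: {auroc(max_s, labels):.3f}")   # recovers the spike
```

On this toy data the mean pool washes out the single contested spike while the max pool isolates it, which is the same mechanism the §5.7 caveat describes.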

Reproduce:

python scripts/hallucination_auroc.py --max-samples 1000

Full JSON in benchmarks/hallucination_auroc.json.

Cross-backbone signal probe (Qwen2.5-7B / Qwen3-8B / Mistral-7B-v0.3)

Is the discourse-community signal that v8a's CDH amplifies specific to Qwen2.5, or is it latent in other 7B-class causal LMs? We answer with the simplest possible probe: take raw mean-pooled hidden states from each backbone on data/val_200.jsonl (200 samples, 35 coarse Reddit communities), then compute leave-one-out 1-nearest-neighbor recall@1 on cosine similarity. No training, no labels at inference, no adapter.
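The probe reduces to a few lines of linear algebra. A sketch of the leave-one-out 1-NN recall@1 computation on synthetic vectors (the bundled scripts/cross_backbone_probe.py is the authoritative implementation):

```python
import numpy as np

def loo_knn_recall1(X, y):
    """Leave-one-out 1-nearest-neighbor recall@1 under cosine similarity."""
    Xn = X / np.linalg.norm(X, axis=1, keepdims=True)
    sim = Xn @ Xn.T
    np.fill_diagonal(sim, -np.inf)  # a sample may not match itself
    nn = sim.argmax(axis=1)
    return (y[nn] == y).mean()

rng = np.random.default_rng(0)
centers = rng.normal(size=(35, 64))  # 35 synthetic "communities" in a 64-D space
y = np.repeat(np.arange(35), 6)
X = centers[y] + 0.5 * rng.normal(size=(len(y), 64))
print(f"LOO 1-NN recall@1: {loo_knn_recall1(X, y):.3f}  (chance = {1/35:.3f})")
```

Because no classifier is trained, any recall above chance reflects structure already present in the raw (or adapted) representations.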

| Backbone | best raw-hidden recall@1 | best layer | × chance |
|---|---|---|---|
| Qwen/Qwen2.5-7B | 0.260 | 16 | 9.1× |
| Qwen/Qwen3-8B | 0.220 | 4 | 7.7× |
| mistralai/Mistral-7B-v0.3 | 0.295 | 16 | 10.3× |

For reference, the v8a SRT-Adapter on Qwen/Qwen2.5-7B reaches recall@1 = 0.484 (16.7× chance) on the larger Reddit validation split (see Evaluation table). The takeaway:

  • The signal is not Qwen-specific. Mistral-7B-v0.3 raw hidden states carry slightly more community information than Qwen2.5-7B raw hidden states do.
  • The adapter's contribution is real. v8a's 16.7× chance on Qwen2.5-7B is roughly 1.8× the strongest raw-hidden-state baseline (10.3× on Mistral) and 1.8× the same backbone's own raw signal (9.1× on Qwen2.5-7B layer 16).
  • The architecture should transfer. The adapter's CDH consumes a single hidden-state tensor at community_layer_idx; nothing is Qwen-tokenizer-specific. Retraining the same v8a recipe on Mistral-7B is a reasonable next step (left as future work).

Reproduce:

python scripts/cross_backbone_probe.py \
  --backbones Qwen/Qwen2.5-7B mistralai/Mistral-7B-v0.3 Qwen/Qwen3-8B \
  --layers 4 8 12 16 \
  --out benchmarks/cross_backbone.json

Full JSON in benchmarks/cross_backbone.json (n=200, 35 classes, chance = 0.029).


Uses

The adapter is most useful as a diagnostic instrument for what a frozen language model already encodes about discourse structure. Some concrete patterns:

1. Per-token contestedness scoring

Use BEN's regime logits to flag which token positions in a passage sit in a contested-meaning regime. Useful for:

  • highlighting ideologically loaded spans in user-generated text
  • routing inputs to human review when supercritical regime probability exceeds a threshold
  • annotating debate transcripts, news comment threads, or policy documents with per-token tension scores
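As an illustration of this pattern, a sketch of contested-span extraction from a per-token P(supercritical) array (the values and threshold here are made up; the adapter's real readouts come from the bundled SRTAdapter class):

```python
import numpy as np

def contested_spans(p_super, threshold=0.8):
    """Return (start, end) token-index pairs where P(supercritical) stays above threshold."""
    hot = p_super > threshold
    edges = np.flatnonzero(np.diff(np.concatenate(([False], hot, [False])).astype(int)))
    return list(zip(edges[::2], edges[1::2]))  # end index is exclusive

# hypothetical per-token P(supercritical) for a 10-token passage
p = np.array([0.1, 0.2, 0.9, 0.95, 0.85, 0.3, 0.1, 0.88, 0.2, 0.1])
print(contested_spans(p))  # [(2, 5), (7, 8)]
```

Spans, rather than individual tokens, are usually the right unit for highlighting or human-review routing, since contestation tends to be locally sustained.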

2. Unsupervised discourse-community clustering

The 64-D community vector from CDH supports nearest-neighbor retrieval and clustering without ever needing community labels at inference. Useful for:

  • segmenting a corpus by latent discourse community (recall@1 = 16.7× chance on 35 known communities)
  • retrieving thematically aligned passages for downstream modeling
  • detecting coordinated-inauthentic-behavior signatures via tight community clustering of supposedly independent accounts

3. Counterfactual community-conditioned decoding

By steering the community vector at decode time, you can ask "how would this community complete this sentence?" Useful for:

  • cross-community simulation studies (the paper measures 0.95 mean disagreement on contested prompts vs 0.00 on factual ones)
  • synthetic-disagreement generation for training argument-mining or stance-detection systems
  • audit / red-team probes that surface latent assumptions across reader communities
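One plausible way to score disagreement numbers of this kind is to count pairwise mismatches among community-conditioned completions; this is only a sketch with hypothetical completions, not the paper's exact metric:

```python
import numpy as np

def pairwise_disagreement(completions):
    """Mean fraction of community pairs whose completions differ (0 = consensus, 1 = full divergence)."""
    n = len(completions)
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    return float(np.mean([completions[i] != completions[j] for i, j in pairs]))

# hypothetical single-token completions under four different community conditionings
factual = ["Paris", "Paris", "Paris", "Paris"]
contested = ["win", "overreach", "mandate", "win"]
print(pairwise_disagreement(factual))    # 0.0
print(pairwise_disagreement(contested))  # high
```

A factual prompt should collapse to consensus regardless of conditioning, while a contested prompt fans out, mirroring the 0.00 / 0.95 contrast reported above.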

4. Hallucination signal for retrieval and generation

$\hat{r}$ correlates with epistemic instability and gives a usable zero-shot AUROC of 0.573 on TruthfulQA. Useful as:

  • a feature in hallucination classifiers (alongside other signals)
  • a per-token routing signal for retrieval-augmented generation: high $\hat{r}$ tokens warrant a retrieval round
  • a calibration probe in evaluation pipelines

5. Feature extraction for downstream classifiers

The MAH divergence vectors (3 × 256-D per token) are usable as semiotic features in any downstream classifier without retraining the backbone or the adapter. Useful for:

  • stance and frame classification on small labeled sets
  • author / community attribution
  • topical drift detection across long documents
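A minimal sketch of this pattern using a nearest-centroid classifier on synthetic stand-ins for mean-pooled MAH features (real features would come from the adapter; only the 3 × 256-D layout is taken from the table below):

```python
import numpy as np

def fit_centroids(X, y):
    """Per-class mean feature vectors."""
    return {c: X[y == c].mean(axis=0) for c in np.unique(y)}

def predict(centroids, X):
    """Nearest-centroid labels under Euclidean distance."""
    labels = np.array(list(centroids))
    D = np.stack([np.linalg.norm(X - centroids[c], axis=1) for c in labels])
    return labels[D.argmin(axis=0)]

rng = np.random.default_rng(0)
# stand-in for mean-pooled MAH divergence features: 3 layers x 256-D = 768-D per document
X = rng.normal(size=(120, 768))
y = rng.integers(0, 2, 120)
X[y == 1, :16] += 1.0  # planted stance signal in a small subspace
cent = fit_centroids(X[:100], y[:100])
acc = (predict(cent, X[100:]) == y[100:]).mean()
print(f"held-out accuracy: {acc:.2f}")
```

The point of the centroid baseline is that it needs almost no labeled data, which matches the small-labeled-set setting described above; any stronger downstream classifier can consume the same features.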

6. Reviewer probes against the paper's claims

data/val_200.jsonl ships with per-token r_true labels. Reviewers can validate claims about $\hat{r}$ correlation, regime accuracy, and divergence norms against the bundled benchmarks/curated_metrics.json without rerunning training.

Out-of-scope uses

  • Treating $\hat{r}$ as a calibrated truth score. It correlates with information density as much as contestedness; see paper §5.7.
  • Expecting the inject-back arm to noticeably change generation. The observation half is well-formed; the intervention half does not yet carry signal. See paper §6.3 and §6.5.
  • Safety-critical decisions without independent validation. The adapter is a research instrument, not a deployed safety system.
  • Improving generation. The adapter does not improve text generation quality; it adds structured side-channel readouts.

Training details

Training data

  • Corpus: ~1M Reddit comments × 35 discourse communities, with per-token reflexivity labels and chain-of-interpretants annotations.
  • The full corpus is not redistributed; see data/DATA.md for schema and reproduction.
  • A 200-sample held-out evaluation subset is bundled in data/val_200.jsonl under the terms in data/NOTICE.

Training procedure

  • Optimizer: AdamW, learning rate 3e-4, batch 16, max sequence length 512.
  • Schedule: 3 epochs over 1M samples, early-stopped at ~10K steps based on validation cross-entropy.
  • Backbone: frozen, bf16. No backbone parameter is updated.
  • Hardware: single NVIDIA A6000 (48 GB).

Architecture summary

| Module | Reads | Outputs | Paper section |
|---|---|---|---|
| Community Discovery Head (CDH) | layer 4 hidden states, mean-pooled | continuous 64-D community vector (no prototype basis under v8a) | §3.2 |
| Metapragmatic Attention Heads (MAH) | layers 7, 14, 21 | per-token divergence vectors (256-D × 3 layers) | §3.3 |
| Reflexive Recurrent Module (RRM) | accumulated divergence | 512-D GRU meta-state; FiLM injections back into layers 14, 21 | §3.4 |
| Bifurcation Estimation Network (BEN) | meta-state | per-token $\hat{r}$ + 2-way regime logits | §3.5 |

Full module specifications and loss decomposition: paper §3, §4, and Appendix A.
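For concreteness, the per-token output shapes implied by the table can be written out as arrays; the variable names here are descriptive, not the repo's actual identifiers:

```python
import numpy as np

T = 12  # tokens in an example sequence
community_vec = np.zeros(64)        # CDH: one 64-D community vector (mean-pooled, no prototypes)
divergence = np.zeros((3, T, 256))  # MAH: 256-D divergence per token at layers 7 / 14 / 21
meta_state = np.zeros((T, 512))     # RRM: 512-D GRU meta-state per token
r_hat = np.zeros(T)                 # BEN: continuous reflexivity estimate per token
regime_logits = np.zeros((T, 2))    # BEN: subcritical / supercritical logits
print(community_vec.shape, divergence.shape, meta_state.shape, regime_logits.shape)
```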


Bias, risks, and limitations

The paper publishes its failures in full. In short:

  1. The inject-back arm currently carries no measurable signal. Ablating it changes nothing on validation. Central open problem; v9 targets this.
  2. $\hat{r}$ tracks information density as much as contestedness. Useful signal, not a clean reading of "is this token contested."
  3. The 33 archetypes collapse to ~4 functional clusters. Read as a finding (the geometry resists fine-grained quantization), not a classification bug.
  4. v8b is a falsification. Pushing the supervised-contrastive objective harder partially undoes v8a's gains.
  5. Backbone dependence. Only validated on Qwen 2.5-7B. Module dimensions are tied to the backbone's hidden size (3584).
  6. Training corpus bias. Reddit comments skew English, US-centric, and over-represent argumentative and politically charged communities. The community vector geometry inherits those biases. Treat community recall and counterfactual-decoding numbers as descriptive, not normative.

Citation

@article{lancaster2026srtadapter,
  title   = {Semiotic Taps: Lightweight Adapter Modules for Bifurcation
             Detection in Frozen Language Models},
  author  = {Lancaster, James Burton},
  year    = {2026},
  note    = {Preprint, peer-review distribution}
}

@article{lancaster2026srtpreprint,
  title   = {Semiotic-Reflexive Language Model Training: Bridging
             Interpretive Bifurcations through Metapragmatic Chain
             Architectures and Embodied Grounding},
  author  = {Lancaster, James Burton},
  year    = {2026},
  journal = {SSRN},
  url     = {https://papers.ssrn.com/abstract=6349978}
}

@article{lancaster2025treachery,
  title   = {The Treachery of Signs: Semiotic Mediation, Pitchfork
             Bifurcation, and Political Polarization in Algorithmically
             Curated Societies},
  author  = {Lancaster, James Burton},
  year    = {2025},
  journal = {SSRN},
  url     = {https://papers.ssrn.com/abstract=5987495}
}

Full reference list (Peirce, Wildgen, Anderson, Silverstein, Kockelman, Evans, von Foerster, Maturana & Varela, Leighton, VanSaders, Bennett, Landauer, Parrondo, and others) in paper.pdf.


License

  • Adapter weights, inference code, config, benchmark artifacts, archetype data, and this package: Apache-2.0 (LICENSE).
  • Validation samples in data/val_200.jsonl: included for research reproduction; comments remain the intellectual property of their original Reddit authors. See data/NOTICE.
  • Training pipelines, dataset construction, and the wider SRT framework: held back during patent and publication review. Not included.
  • Qwen 2.5-7B backbone: governed separately by the Tongyi Qianwen License.

Contact

For training-code access, reproduction questions, or follow-up: see paper PDF for current author contact details.
