Chemistry Research · Paper I live on ChemRxiv + OSF · Paper II v12 final, OSF x29fv live, ChemRxiv submission queued

hERG QSAR Research Program

A research program that tests whether different computational chemistry architectures fail predictably on out-of-distribution compound classes. The working hypothesis: descriptor paradigm and scaffold class jointly determine the direction of a QSAR tool's failure, not just the magnitude.

The thesis

Computational hERG QSAR tools are widely used in pre-IND cardiac safety triage. They are also widely treated as black boxes, where users consult one tool, accept its verdict, and move on. The Comprehensive in vitro Proarrhythmia Assay (CiPA) initiative established a decade ago that static-descriptor IC50-only QSAR approaches have fundamental limitations when binding affinity dissociates from functional blockade. What the literature did not establish: whether different QSAR architectures fail in different directions on the same compound class, and whether that asymmetry is predictable from the scaffold structure plus the architecture's descriptor paradigm.

The v2 paper tests this on the iboga alkaloid family. Three architecturally divergent tools (LightGBM on ECFP4 fingerprints, Chemprop graph neural network, support vector machine on hand-crafted descriptors) are evaluated on canonical SMILES for eight compounds spanning natural products and rationally designed safer-scaffold analogs. The finding: the GNN and SVM systematically over-call Blocker on designed analogs (false positives that would kill genuinely safer compounds in discovery pipelines). The fingerprint approach systematically under-calls Blocker on natural potent blockers, including the active pharmaceutical ingredient of DMX-1001 (the first FDA-authorized US ibogaine clinical trial). Both failure modes trace to the same root cause, which manifests in mirror-image directions.

The interesting claim isn't just "QSAR fails on iboga." It's that architecture-failure direction is predictable from architecture × scaffold class. That means computational triage cannot replace experimental electrophysiology, but it also means specific architectures can be recommended for specific compound classes if the asymmetry is understood.

The methodology stack

The pre-commit + falsification + audit-trail discipline is imported wholesale from the Substrate Geometry research program. Predictions are committed before reruns. Honest priors are logged with SHA-256 anchors. Falsification dispositions specify in advance what experimental outcome would falsify each claim, with pre-committed reframe paths so reframings cannot drift into confirmation bias post hoc. Every phase of the revision arc is hash-anchored in a JSON audit trail that chains back to the SHA-256 of the scoping document.
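
To make the hash-anchoring idea concrete, here is a minimal sketch of a chained JSON audit trail. It is not the program's actual revision_audit_trail.json schema: the field names (phase, payload, prev_sha256, entry_sha256), the helper names, and the sample text are all hypothetical, chosen only to show how each entry can commit to the one before it and ultimately to the scoping document hash.

```python
import hashlib
import json


def sha256_hex(data: bytes) -> str:
    """Hex digest of the SHA-256 hash of raw bytes."""
    return hashlib.sha256(data).hexdigest()


def _seal(body: dict) -> dict:
    """Return a copy of `body` with a hash over its canonical JSON form."""
    # Sorted keys and fixed separators make the serialization reproducible.
    encoded = json.dumps(body, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return dict(body, entry_sha256=sha256_hex(encoded))


def start_trail(scoping_text: str) -> list:
    """Root entry: anchors the trail to the scoping document's hash."""
    root = {
        "phase": "scoping",
        "payload": {"scoping_sha256": sha256_hex(scoping_text.encode("utf-8"))},
        "prev_sha256": None,
    }
    return [_seal(root)]


def append_entry(trail: list, phase: str, payload: dict) -> dict:
    """Append an entry that commits to the previous entry's hash."""
    body = {"phase": phase, "payload": payload,
            "prev_sha256": trail[-1]["entry_sha256"]}
    entry = _seal(body)
    trail.append(entry)
    return entry


def verify_trail(trail: list) -> bool:
    """True if every entry hash and every back-link is intact."""
    for i, entry in enumerate(trail):
        body = {k: v for k, v in entry.items() if k != "entry_sha256"}
        if _seal(body)["entry_sha256"] != entry["entry_sha256"]:
            return False
        if i > 0 and entry["prev_sha256"] != trail[i - 1]["entry_sha256"]:
            return False
    return True


# Hypothetical usage: commit a prior before a rerun, then log the outcome.
trail = start_trail("scoping document text (placeholder)")
append_entry(trail, "pre-commit", {"claim": "example prior", "p_range": [0.6, 0.8]})
append_entry(trail, "rerun", {"outcome": "example result"})
trail_is_valid = verify_trail(trail)
```

Because each entry hashes the previous entry's hash, editing an earlier prior after the fact changes every downstream hash, which is what makes a quiet post hoc revision detectable.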

The methodology produced empirical dividends. v1 of the paper made a three-model architectural-invariance claim that turned out to rest on a structurally wrong 18-MC SMILES (methoxy on the wrong carbon). The canonical-SMILES rerun in Phase 1 falsified the v1 claim and surfaced the architecture-specific asymmetric pattern that became the v2 paper's central finding. The pre-commit discipline made the falsification visible rather than letting it disappear into a quiet revision.

The Pred-hERG noribogaine false-negative is a white-box mechanistic finding: the binary classifier correctly calls Blocker at 69.7% confidence; the deployed consensus rule combines this with multiclass and regression sub-models to output Non-blocker as the user-facing verdict. The failure is in the consensus aggregation, not in any single sub-model. This kind of white-box localization is rare in QSAR critique, and it gives the Pred-hERG developers concrete information about their own deployed service.
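
The localization logic can be sketched as follows. This is an illustration only: the actual Pred-hERG consensus rule is not reproduced in this text, so the sketch assumes a simple majority vote over three sub-models. The multiclass label, the regression value, and the pIC50 cutoff are hypothetical placeholders; only the 69.7% binary confidence comes from the finding above.

```python
from dataclasses import dataclass


@dataclass
class SubModelOutputs:
    binary_p_blocker: float   # probability of Blocker from the binary classifier
    multiclass_label: str     # hypothetical label set, e.g. "non-blocker", "weak", "strong"
    regression_pic50: float   # hypothetical predicted pIC50


def sub_model_votes(out: SubModelOutputs, pic50_cutoff: float = 5.0) -> dict:
    """Translate each sub-model output into a Blocker (True) / Non-blocker (False) vote.

    The 0.5 probability threshold and the pIC50 cutoff are assumptions.
    """
    return {
        "binary": out.binary_p_blocker >= 0.5,
        "multiclass": out.multiclass_label != "non-blocker",
        "regression": out.regression_pic50 >= pic50_cutoff,
    }


def majority_consensus(votes: dict) -> str:
    """Assumed aggregation rule: strict majority of sub-model votes."""
    return "Blocker" if sum(votes.values()) * 2 > len(votes) else "Non-blocker"


def dissenting_sub_models(votes: dict, verdict: str) -> list:
    """Names of sub-models whose own call disagrees with the consensus verdict."""
    consensus_is_blocker = verdict == "Blocker"
    return [name for name, vote in votes.items() if vote != consensus_is_blocker]


# Binary confidence from the text; the other two values are placeholders.
example = SubModelOutputs(binary_p_blocker=0.697,
                          multiclass_label="non-blocker",
                          regression_pic50=4.6)
votes = sub_model_votes(example)
verdict = majority_consensus(votes)
dissent = dissenting_sub_models(votes, verdict)
```

Under these assumed inputs the binary sub-model votes Blocker, the other two outvote it, and the user-facing verdict is Non-blocker. Recording the per-sub-model votes next to the verdict is what allows the failure to be attributed to the aggregation step rather than to any single sub-model.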

Current state

The v2 manuscript is in moderation on ChemRxiv. The OSF pre-registration is live (DOI 10.17605/OSF.IO/UWVX4) with falsification dispositions and SHA-256 hash anchors for the two iboga compounds without published experimental data (coronaridine and oxa-noribogaine). All eight pre-registered honest priors landed within their registered probability ranges when the models were rerun on canonical SMILES, which is the empirical validation that the pre-commit discipline is doing what it is supposed to do.

The revision arc went through 9 numbered phases plus a verification pass, integrating 24 items from two rounds of cross-AI adversarial review. Every substantive change has an audit-trail entry with rationale. The full revision_audit_trail.json is published as supplementary material on OSF.

Next paper in the program: a pre-registered multi-family extension testing whether the asymmetric-failure pattern generalizes from iboga to other psychedelic-class scaffolds (tryptamines, phenethylamines, lysergamides, cathinones, ergolines). All families share the structural property of being underrepresented in mainstream pharmaceutical ChEMBL training data, so the working hypothesis is that the architecture-failure direction will be predictable per family per architecture.

What this substrate teaches

Chemistry is a particularly clean test of the methodology because the medium is the actual chemical structure rather than a simulation. You cannot tell an iboga alkaloid to bind hERG kinetically differently than its scaffold dictates. The QSAR tools build specifications around expected pharmaceutical chemistry, then fail predictably when the medium presents psychedelic-class scaffolds that violate the implicit assumptions of the training distribution.

The lesson that transfers back to the other Deep Synthesis projects: every tool implicitly assumes the medium it was trained against. When the actual medium differs structurally, the tool's failure direction is a property of the descriptor paradigm, not the medium's fault. The disciplined move is to make that assumption visible (pre-commit + falsification + audit-trail surfaces it explicitly) and let the medium arbitrate, rather than insisting the medium conform to the tool.

Open invitations

The chemistry paper repository is public on OSF (project tnpqv, registration uwvx4) under CC0 1.0 license. The compound library with canonical SMILES + per-compound audit trail, model prediction JSONs, ECFP4 + Murcko similarity analyses, and the full revision audit trail are all included.

The pre-registered predictions for coronaridine and oxa-noribogaine are falsifiable by anyone who commissions experimental electrophysiology on either compound. When that data eventually exists, the per-prediction outcome can be added to the cumulative evidence on which architectural approach is reliable for which regime of the iboga compound class. Replication of any model run is welcomed and acknowledged.

Coauthorship is on the table for the cross-family generalization paper if a collaborator can contribute any of the following: a tryptamine, phenethylamine, lysergamide, or cathinone compound library with experimental hERG data; access to a fourth QSAR architecture not in the current paper (CardioTox, DeepHIT in a reproducible environment, or comparable); or clinical patch-clamp expertise and a willingness to commission electrophysiology on the pre-registered compounds.

Citation: Couey, V.W. (2026). Architecture-Specific Failure Modes in hERG QSAR Predictions for Iboga Alkaloids. ChemRxiv preprint (link forthcoming once moderation completes). DOI placeholder. Related OSF pre-registration: 10.17605/OSF.IO/UWVX4.

Papers

The program in print

Paper I (iboga, within-family architecture-failure) is live on ChemRxiv with the full audit trail mirrored on OSF. Paper II (cross-family data-landscape) is the multi-family extension: v12 markdown final, OSF pre-registration x29fv live and admin-cleared since 2026-05-21, ChemRxiv submission gated on the render-layer styling pass before Gate G. Both audit trails chain back to scoping SHAs published at OSF project tnpqv.

Preprint · ChemRxiv · 2026 · DOI: 10.17605/OSF.IO/UWVX4

Architecture-Specific Failure of hERG QSAR Models on Iboga Alkaloids: An Applicability-Domain Analysis Across Three Production Models

Three production hERG QSAR models (Pred-hERG LightGBM/ECFP4, ADMET-AI GNN, admetSAR SVM) are stress-tested on an iboga-alkaloid evaluation set with a pre-registered applicability-domain protocol. Two architecture-specific failure modes emerge: GNN + SVM over-call Blocker on designed safer-scaffold analogs (18-MC, tabernanthalog); LightGBM/ECFP4 under-calls Blocker on natural potent blockers (voacangine, noribogaine = DMX-1001 API). Both failures trace to a shared root cause (binding-vs-blockade dissociation) expressing in opposite directions per architecture. The Pred-hERG noribogaine false-negative is shown to be a white-box consensus-rule failure: the binary sub-model correctly calls Blocker; the deployed consensus rule overrides it via the multiclass + regression sub-models. 8 of 8 pre-registered priors landed within their registered probability ranges on the canonical-SMILES rerun.

Preprint · OSF preprint · ChemRxiv submission in moderation (2026-05-24) · 2026 · DOI: 10.17605/OSF.IO/X29FV

Systematic Under-Characterization of Psychedelic Compound Classes in Published hERG QSAR Training Data: A Pre-Registered Multi-Family Analysis

Across three structurally independent psychedelic compound families (22 tryptamines, 20 phenethylamines, 17 cathinones; 59 compounds spanning natural products to FDA-approved medications), the published primary hERG patch-clamp literature with same-preparation Ki + IC50 pairing yields one substantive datapoint (bupropion, 69 µM). Six of 11 pre-registered priors anticipating training data on marketed pharmaceuticals were falsified in a tightly clustered pattern. A pre-registered targeted retrieval protocol (PubMed bath-salt toxicology, Drugs@FDA bupropion NDA, ergoline boundary spot-check) returned zero additional pairs. Cross-architecture positive-call divergence (CAPD) across the three architectures on all 59 compounds clears the pre-registered effect-size threshold in every family; cathinones show a 10x spread between the ADMET-AI Blocker rate (59%) and the Pred-hERG consensus Blocker rate (6%). Two suggestive cross-family cases of the Paper-I consensus-rule override are observed (psilocin and a 4-OH-MET tryptamine analog). The scaffold-vs-bit-level hypothesis resolved as PARTIAL CONFIRMATION on the scaffold-aware half. Methodologically, the paper adapts the PRISMA-S practice of treating the search protocol as load-bearing methodology to chemoinformatics QSAR-validity work and offers a worked data-landscape-primary template. The empirical finding implies that pre-IND cardiac safety triage of psychedelic-class compounds operates against a data infrastructure with a training-to-query distribution gap that QSAR architectures cannot bridge.
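
The cathinone spread quoted in the abstract can be checked with a few lines of arithmetic. The sketch below is not the paper's CAPD definition, which is not given here; it assumes a simple stand-in (the ratio and difference between the highest and lowest per-model Blocker rates). The per-model counts (10 of 17 and 1 of 17) are also an assumption, chosen because they reproduce the stated 59% and 6% for a 17-compound family. The function names are hypothetical.

```python
def positive_call_rate(calls: list) -> float:
    """Fraction of compounds a model calls Blocker (True)."""
    return sum(calls) / len(calls)


def call_rate_spread(calls_by_model: dict) -> dict:
    """Per-model Blocker rates plus the max/min ratio and absolute difference.

    This is an assumed stand-in for a divergence metric, not the
    pre-registered CAPD formula.
    """
    rates = {model: positive_call_rate(calls)
             for model, calls in calls_by_model.items()}
    highest, lowest = max(rates.values()), min(rates.values())
    return {
        "rates": rates,
        "abs_diff": highest - lowest,
        "ratio": highest / lowest if lowest > 0 else float("inf"),
    }


# Assumed counts for the 17 cathinones: 10 Blocker calls vs 1 Blocker call.
n_cathinones = 17
cathinone_calls = {
    "ADMET-AI": [True] * 10 + [False] * (n_cathinones - 10),
    "Pred-hERG consensus": [True] * 1 + [False] * (n_cathinones - 1),
}
spread = call_rate_spread(cathinone_calls)
rates_percent = {m: round(r * 100) for m, r in spread["rates"].items()}
```

With these assumed counts the rounded rates come out at 59% and 6% and the ratio at about 10, matching the figures in the abstract. A ratio is undefined when one model makes no Blocker calls, which is why the sketch also reports the absolute difference.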