The thesis
Computational hERG QSAR tools are widely used in pre-IND cardiac safety triage. They are also widely treated as black boxes: users consult one tool, accept its verdict, and move on. The Comprehensive in vitro Proarrhythmia Assay (CiPA) initiative established a decade ago that static-descriptor, IC50-only QSAR approaches have fundamental limitations when binding affinity dissociates from functional blockade. What the literature has not established is whether different QSAR architectures fail in different directions on the same compound class, and whether that asymmetry is predictable from the scaffold structure plus the architecture's descriptor paradigm.
The v2 paper tests this on the iboga alkaloid family. Three architecturally divergent tools (LightGBM on ECFP4 fingerprints, a Chemprop graph neural network, and a support vector machine on hand-crafted descriptors) are evaluated on canonical SMILES for eight compounds spanning natural products and rationally designed safer-scaffold analogs. The finding: the GNN and SVM systematically over-call Blocker on designed analogs, producing false positives that would kill genuinely safer compounds in discovery pipelines. The fingerprint approach systematically under-calls Blocker on potent natural blockers, including the active pharmaceutical ingredient of DMX-1001 (the first FDA-authorized US ibogaine clinical trial). Both failure modes trace to the same root cause but manifest in mirror-image directions.
The interesting claim isn't just "QSAR fails on iboga." It's that the direction of failure is predictable from architecture × scaffold class. That means computational triage cannot replace experimental electrophysiology, but it also means specific architectures can be recommended for specific compound classes once the asymmetry is understood.
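One way to make the architecture × scaffold-class asymmetry concrete is to tally error direction per cell rather than reporting a single accuracy figure. The following is a minimal sketch of that bookkeeping, not the paper's code. The record layout, the class names `designed_analog` and `natural_product`, the function names, and every value in `RECORDS` are hypothetical placeholders invented for illustration; they are not results from the paper.

```python
from collections import Counter

# Hypothetical records: (architecture, compound_class, predicted, reference).
# Every value below is a placeholder for illustration, NOT data from the paper.
RECORDS = [
    ("gnn", "designed_analog", "Blocker", "Non-blocker"),
    ("gnn", "natural_product", "Blocker", "Blocker"),
    ("fingerprint", "designed_analog", "Non-blocker", "Non-blocker"),
    ("fingerprint", "natural_product", "Non-blocker", "Blocker"),
]


def error_direction(predicted, reference):
    """Classify one call as 'over_call', 'under_call', or 'correct'."""
    if predicted == reference:
        return "correct"
    # A wrong 'Blocker' call is a false positive; a wrong 'Non-blocker' is a miss.
    return "over_call" if predicted == "Blocker" else "under_call"


def tally_directions(records):
    """Count error directions per (architecture, compound_class) cell."""
    tally = {}
    for arch, cls, pred, ref in records:
        cell = tally.setdefault((arch, cls), Counter())
        cell[error_direction(pred, ref)] += 1
    return tally


def net_bias(counter):
    """Positive when over-calls dominate, negative when under-calls dominate."""
    return counter["over_call"] - counter["under_call"]


if __name__ == "__main__":
    for cell, counts in sorted(tally_directions(RECORDS).items()):
        print(cell, dict(counts), "net bias:", net_bias(counts))
```

Keeping the sign of the error per cell, instead of collapsing to accuracy, is what lets mirror-image failure modes show up as opposite-signed biases rather than cancelling out in an aggregate score.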