NMR Interpretation Guide

Use this guide when reviewing NMR evidence and deciding whether a SpectraCheck assignment is ready for report export.

Outline

1. Upload the raw FID archive and confirm instrument metadata.
2. Review ¹H regions, integration, multiplicity, and solvent references.
3. Review ¹³C assignments, solvent peaks, and carbonyl regions.
4. Use COSY, HSQC, and HMBC evidence to validate connectivity.
5. Resolve contradiction flags before approving an interpretation.

Every numbered step needs a product screenshot before publication.

Worked example requirement

Use one accessible example, such as caffeine or ibuprofen, and run it through the current production workflow. Capture every state a reviewer sees:

Upload accepted with raw source file and parameter file visible.
Processed spectrum with regions labeled.
Peak table with shift, multiplicity, integration, assignment, and confidence.
Evidence card for at least one assigned peak.
Contradiction or warning state, even if the example uses a seeded issue.
Final accepted interpretation ready for export.

2D evidence in plain language

Experiment	What it helps validate
COSY	Which protons are coupled to nearby protons.
HSQC	Which proton is attached to which carbon.
HMBC	Longer-range proton-carbon connections used to support structure fragments.

Keep the explanation short enough for an analytical chemist who understands NMR but has not used MolTrace before.

Contradiction review

A contradiction is not a failure; it is an item in the review queue. Show what evidence disagrees, which assignment is affected, and what action the scientist can take:

Accept with rationale.
Reassign the peak.
Mark as impurity, solvent, reference, or unknown.
Request additional evidence before export.

The NMR scientist owns this page and must verify the example, screenshots, and language before release.

Backend capabilities

This section catalogs the analysis capabilities of the NMR interpretation backend that are in production today, grouped by area. The release timeline at the end gives chronological context.

Global Spectral Deconvolution (GSD) — opt-in analysis backend

The opt-in POST /spectrum/analyze/gsd endpoint runs industry-standard Global Spectral Deconvolution on a processed spectrum and returns peaks auto-classified as compound | solvent | impurity | artifact | 13C_satellite. It ships behind a per-request experimental: true flag while the soak loop runs, and graduates per-tenant or platform-wide on a measured verdict — see the GSD experimental rollout section in the deployment guide.
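
As a rough illustration of the opt-in flag, the sketch below builds (but does not send) a request to the endpoint. Only the path and the experimental flag come from this guide; the base URL, the "spectrum" payload shape, and the helper name build_gsd_request are hypothetical placeholders, so check the API reference for the real request schema.

```python
import json
from urllib import request


def build_gsd_request(base_url, ppm, intensity, experimental=True):
    """Build a POST request for the GSD endpoint without sending it.

    The "spectrum" object and its field names are assumptions made for
    illustration; only the "experimental" flag is documented in this guide.
    """
    payload = {
        "experimental": experimental,  # per-request opt-in flag
        "spectrum": {"ppm": list(ppm), "intensity": list(intensity)},
    }
    return request.Request(
        url=base_url.rstrip("/") + "/spectrum/analyze/gsd",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical host; nothing is sent over the network here.
req = build_gsd_request("https://moltrace.example", [1.0, 1.1], [0.2, 0.9])
print(req.get_method(), req.full_url)
```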

Detection algorithm — single-pass detection via scipy.signal.find_peaks; per-peak fitting via lmfit Lorentzian / pseudo-Voigt; level-aware overlap resolution at levels 4–5; classification using the Fulmer / Gottlieb residual-solvent tables. (v0.4.0, 2026-05-27)
Algorithm semantics + envelope unification — cluster_into_environments groups adjacent same-category peaks within a nucleus-aware J-coupling window into one chemical-environment entry. Legacy raw-FID surfaces (/nmr/raw-fid/preview and /nmr/raw-fid/process) gain environments / environment_count / environment_counts so the FE renders both detectors with one component. A vectorized _pseudo_voigt_sum plus analytical jacobian gives an 8.5× speedup on dense ¹³C (60000006_13c fixture: 5.5 min → 39 s), bit-exact-equivalent. (v0.5.0, 2026-05-27)
Strict promotion gate cleared — on the NMRShiftDB2 corpus the sidecar cleared its strict production promotion gate (95 % solvent auto-detect plus median compound-environment-count delta ≤ 2). The HMDB-style validation framework forward-models a noisy Lorentzian spectrum from a published peak list and gates against environment-count and multiplet-line deltas on a 20-fixture mini-corpus. Default ¹H clustering window widened 20 Hz → 30 Hz to accommodate strong-coupling AB systems and constrained-ring geminal H-H couplings up to 25–30 Hz. (v0.6.0, 2026-05-28)
Per-peak QC metrics for legacy raw-FID — LegacyEnrichedPeak.fit_redchi / fit_rmse / fwhm_ppm / signal_to_noise / baseline_noise_sigma — the same regulatory-tier QC quintuple already published by the GSD endpoint via Peak.metadata. Both /nmr/raw-fid/preview and /nmr/raw-fid/process populate the quintuple before returning. (v0.6.1, 2026-05-28)
Real HMDB validation corpus — a 100-fixture real-instrument HMDB corpus (60 × ¹H + 40 × ¹³C; Bruker 59 / Varian 41; solvent mix Water/D₂O 85, CD₃OD 6, CDCl₃ 5, DMSO-d₆ 4). Result: 95/100 parse cleanly; 53/57 = 93 % solvent auto-detect on the subset with a known solvent reference. The literal Prompt 3 spec is satisfied across NMRShiftDB2 (19 fixtures; 100 % solvent), HMDB synthetic (20), and HMDB real-instrument (100; 95 % parseable, 93 % solvent). (v0.6.2, 2026-05-28)
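
The detection and environment-grouping steps listed above can be sketched on a synthetic spectrum. This is a minimal illustration, not the shipped implementation: it uses scipy.signal.find_peaks for detection, skips the lmfit fitting and classification stages, and groups peaks by the gap to their nearest neighbour, which is an assumption about how the window is applied. The helper names, the 400 MHz field, and the height threshold are all hypothetical.

```python
import numpy as np
from scipy.signal import find_peaks


def lorentzian(x, center, fwhm, height):
    """Lorentzian line with the given centre, full width at half maximum, and height."""
    hwhm = fwhm / 2.0
    return height * hwhm**2 / ((x - center) ** 2 + hwhm**2)


def detect_peaks(ppm, intensity, min_height):
    """Single-pass peak detection; returns peak positions in ppm."""
    idx, _ = find_peaks(intensity, height=min_height)
    return [float(ppm[i]) for i in idx]


def cluster_by_window(peak_ppm, window_hz, spectrometer_mhz):
    """Group peaks whose gap to the previous peak is within window_hz.

    Assumption: the window applies to neighbour gaps. The production
    cluster_into_environments also requires peaks to share a category.
    """
    window_ppm = window_hz / spectrometer_mhz  # Hz to ppm at this field
    clusters = []
    for p in sorted(peak_ppm):
        if clusters and p - clusters[-1][-1] <= window_ppm:
            clusters[-1].append(p)
        else:
            clusters.append([p])
    return clusters


# Noiseless synthetic 1H spectrum at a hypothetical 400 MHz:
# one doublet (J = 8 Hz, lines at 1.19 and 1.21 ppm) and one singlet at 3.50 ppm.
ppm = np.linspace(0.0, 10.0, 20001)
fwhm_ppm = 1.0 / 400.0  # 1 Hz line width
signal = (
    lorentzian(ppm, 1.19, fwhm_ppm, 1.0)
    + lorentzian(ppm, 1.21, fwhm_ppm, 1.0)
    + lorentzian(ppm, 3.50, fwhm_ppm, 2.0)
)

peaks = detect_peaks(ppm, signal, min_height=0.1)
environments = cluster_by_window(peaks, window_hz=30.0, spectrometer_mhz=400.0)
print(len(peaks), "peaks in", len(environments), "environments")
```

With the 30 Hz default window, the two doublet lines collapse into one environment while the singlet stays separate, which is the behaviour the environment count is meant to capture.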

Multiplet analysis and J-coupling

The multiplet capability groups GSD-resolved peaks into multiplets, recognises multiplicity (s / d / t / q / p / sext / sept / dd / dt / td / ddd / m), and recovers the underlying J couplings.
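
For the simplest labels (s through sept), first-order recognition amounts to checking that the lines are evenly spaced and that their intensities follow a row of Pascal's triangle. The sketch below shows that check under stated assumptions: the function names and tolerances are hypothetical, and compound patterns such as dd or dt, which the production pipeline handles separately, simply fall through to "m" here.

```python
from math import comb

# Labels for first-order multiplets, keyed by number of lines.
LABELS = {1: "s", 2: "d", 3: "t", 4: "q", 5: "p", 6: "sext", 7: "sept"}


def pascal_intensities(n_lines):
    """Normalised binomial intensities for a first-order multiplet with n_lines lines."""
    row = [comb(n_lines - 1, k) for k in range(n_lines)]
    total = sum(row)
    return [v / total for v in row]


def match_first_order(positions_hz, intensities, spacing_tol_hz=0.5, intensity_tol=0.1):
    """Return (label, J in Hz), or ("m", None) when no first-order pattern fits.

    Tolerances are illustrative assumptions, not production values.
    """
    n = len(positions_hz)
    if n not in LABELS:
        return "m", None
    if n == 1:
        return "s", None
    pairs = sorted(zip(positions_hz, intensities))
    pos = [p for p, _ in pairs]
    inten = [i for _, i in pairs]
    gaps = [b - a for a, b in zip(pos, pos[1:])]
    j_hz = sum(gaps) / len(gaps)
    if any(abs(g - j_hz) > spacing_tol_hz for g in gaps):
        return "m", None  # uneven spacing: not a single-J pattern
    total = sum(inten)
    expected = pascal_intensities(n)
    if any(abs(i / total - e) > intensity_tol for i, e in zip(inten, expected)):
        return "m", None  # intensities do not follow Pascal's triangle
    return LABELS[n], j_hz


print(match_first_order([0.0, 7.0, 14.0], [1.0, 2.0, 1.0]))
print(match_first_order([0.0, 3.0, 10.0, 13.0], [1.0, 1.0, 1.0, 1.0]))
```

The second call is a doublet of doublets (J = 10 and 3 Hz); a single-J matcher rejects it, which is why a dedicated dd inversion step is needed downstream.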

Multiplet detection plus synthetic overlay — POST /spectrum/analyze/multiplets takes a GSD peak list and returns recognised multiplets with recovered J couplings. The forward modeller generate_synthetic_multiplet is publicly exposed so the FE can overlay predicted-vs-observed peaks (light red) on the spectrum view. Algorithm: spatial cluster at 30 Hz → first-order Pascal-triangle match → dd analytical inversion / dt-td-ddd J-set enumeration with scipy.optimize.least_squares refinement → “m” fallback for unstructured clusters. Validation: 8 quinine multiplets resolved with J within 0.3 Hz of literature; a known hidden 11.4 Hz coupling benchmark recovered where standard peak picking misses it. (v0.7.0, 2026-05-28)
Multiplet J-coupling → unified confidence layer — the recovered J-couplings feed the unified candidate-confidence engine as the 40th evidence layer (multiplet_jcoupling). POST /candidates/compare/jcoupling returns per-candidate labels (strong | partial | weak | poor_j_agreement plus j_coupling_contradiction) so the FE can render a J-agreement badge per candidate. A contradiction (observed J above a threshold the candidate topology cannot produce) caps the score at 0.25. Purely additive: existing callers unchanged when no multiplet input is supplied. (v0.7.1, 2026-05-28)
Opt-in Karplus ³J refinement — Layer 40’s topological J-predictor gains an opt-in, conformer-averaged Karplus refinement for sp³ vicinal (³J) couplings (RDKit ETKDGv3 plus MMFF). When enabled (use_karplus=True), the flat 7.0 Hz aliphatic_vicinal placeholder is replaced by a geometry-aware estimate. The refinement is off by default, and output is byte-for-byte identical when the flag is omitted. (v0.7.2, 2026-05-28)
Karplus validation corpus — an 8-molecule curated literature validation corpus (karplus_jcoupling_corpus_v1.json) and a pytest accuracy gate: mean absolute error 0.44 Hz (median 0.26, max 1.41), with clean separation between conformationally locked diaxial systems (mean 9.5 Hz, all ≥ 8.49 Hz) and mobile/averaged systems (mean 6.9 Hz, all ≤ 7.14 Hz) with no overlap. (v0.7.3, 2026-05-28)
Opt-in Haasnoot–Altona generalized Karplus plus honest negative result — a second selectable relation (karplus_method=haasnoot_altona), the Haasnoot–de Leeuw–Altona (HLA) equation. Per individual conformer it is more literature-faithful (it recovers trans-decalin diaxial at 11.64 Hz, above the generic 10.26 Hz ceiling), but the corpus study, shipped as a regression gate, shows that HLA does not improve averaged discrimination under the unweighted conformer model. This negative result is documented rather than hidden. (v0.7.4, 2026-05-30)
Boltzmann conformer-population weighting (sugar blind-spot fix) — opt-in karplus_conformer_weighting field (uniform | boltzmann, default uniform) weights each conformer by its MMFF-energy Boltzmann population at 298.15 K. Measured corpus effect: β-D-galactose recovers from 8.49 → ~10.1 Hz onto its literature value; locked-vs-mobile separation widens (generic: +1.35 → +2.28 Hz). Once conformers are population-weighted, the generic relation discriminates better than HLA — the sugar gap was a weighting problem, not an equation one. (v0.7.5, 2026-05-30)
Karplus corpus scaled to 18 molecules — a new 18-molecule v2 corpus (9 locked diaxial plus 9 mobile/averaged, including five new pyranosides) graded across the {generic, haasnoot_altona} × {uniform, boltzmann} grid shows generic/boltzmann is the only one of the four that cleanly separates locked from mobile at scale. Within-tolerance 1.00, mean abs error 0.57 Hz, locked-vs-mobile separation +1.84 Hz. (v0.7.6, 2026-05-31)
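
To make the uniform-versus-Boltzmann distinction concrete, the sketch below averages a generic Karplus relation over conformers with both weighting schemes. It is an illustration under stated assumptions: the coefficients (A = 7.76, B = -1.10, C = 1.40 Hz) are a commonly quoted generic parameter set chosen because it reproduces the 10.26 Hz ceiling mentioned above, not values confirmed from the codebase; energies are assumed to be in kcal/mol; the function names and the two-conformer input are hypothetical.

```python
import math

# Assumed generic Karplus coefficients in Hz; J(180 deg) = 7.76 + 1.10 + 1.40 = 10.26 Hz.
A_HZ, B_HZ, C_HZ = 7.76, -1.10, 1.40
R_KCAL_PER_MOL_K = 0.0019872041  # gas constant in kcal/(mol*K)


def karplus_generic(phi_deg):
    """Generic Karplus relation: J = A*cos^2(phi) + B*cos(phi) + C."""
    c = math.cos(math.radians(phi_deg))
    return A_HZ * c * c + B_HZ * c + C_HZ


def conformer_weights(energies_kcal, weighting="uniform", temperature_k=298.15):
    """Normalised conformer weights, either uniform or Boltzmann populations."""
    n = len(energies_kcal)
    if weighting == "uniform":
        return [1.0 / n] * n
    if weighting != "boltzmann":
        raise ValueError(f"unknown weighting: {weighting}")
    e_min = min(energies_kcal)  # shift by the minimum for numerical stability
    rt = R_KCAL_PER_MOL_K * temperature_k
    factors = [math.exp(-(e - e_min) / rt) for e in energies_kcal]
    total = sum(factors)
    return [f / total for f in factors]


def averaged_j(dihedrals_deg, energies_kcal, weighting="uniform"):
    """Population-weighted average of the per-conformer Karplus estimate."""
    weights = conformer_weights(energies_kcal, weighting)
    return sum(w * karplus_generic(phi) for w, phi in zip(weights, dihedrals_deg))


# Hypothetical two-conformer ensemble: a low-energy anti (180 deg) conformer
# and a gauche (60 deg) conformer lying 2 kcal/mol higher.
dihedrals = [180.0, 60.0]
energies = [0.0, 2.0]
j_uniform = averaged_j(dihedrals, energies, "uniform")
j_boltzmann = averaged_j(dihedrals, energies, "boltzmann")
print(round(j_uniform, 2), round(j_boltzmann, 2))
```

Uniform weighting drags the estimate toward the sparsely populated gauche conformer, while Boltzmann weighting keeps it near the diaxial value, which is the qualitative effect described for the locked systems above.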

Chemical-shift prediction

NMRNet wrapper plus HOSE-code fallback — predict_shifts(smiles, nuclei) returns predicted ¹H / ¹³C shifts (ppm) with per-atom uncertainty. Two backends: the NMRNet SE(3)-equivariant model (Xu et al., Nat. Comput. Sci. 5, 292, 2025) as an optional, lazily-loaded backend (in-process or remote GPU microservice), and a HOSE-code / NMRShiftDB2 topological fallback (spheres 6 → 1) as the default. NMRNet never fabricates a prediction — it activates only when configured. Exposed via POST /spectrum/predict/shifts. (v0.7.8, 2026-06-01)
NMRNet wrapper rework: local-first device strategy — reworked from microservice-first to local-first (Apple-Silicon dev): device resolution CUDA → MPS → CPU (CPU baseline, MPS best-effort with a clean CPU fallback), lazy torch, per-atom uncertainty from the conformer ensemble (std across n_conformers; null at n=1), Zenodo/HF-mirror weights acquisition (cached, SHA-256). HOSE fallback now requires ≥ 3 references per matched sphere. The QM9-NMR gate targets the paper’s QM9NMR MAE (0.020 / 0.262 ppm). NMRNet is never vendored. (v0.7.9, 2026-06-01)
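
Two of the rules above lend themselves to a short sketch: ensemble uncertainty that is null for a single conformer, and a HOSE fallback that walks spheres 6 down to 1 and accepts the first sphere with at least 3 references. Everything named here is hypothetical: the function names, the dictionary layout, the placeholder HOSE code strings, and the choice of sample standard deviation for the uncertainty are assumptions for illustration only.

```python
from statistics import fmean, stdev


def ensemble_uncertainty(shifts_by_conformer):
    """Mean shift per atom plus the std across conformers (None when n = 1)."""
    per_atom = list(zip(*shifts_by_conformer))  # transpose to atom-major order
    means = [fmean(values) for values in per_atom]
    if len(shifts_by_conformer) < 2:
        return means, [None] * len(means)  # spread is undefined for one conformer
    return means, [stdev(values) for values in per_atom]


def hose_fallback(codes_by_sphere, reference_table, min_refs=3):
    """Try the most specific sphere first; return None rather than guess."""
    for sphere in range(6, 0, -1):
        refs = reference_table.get(codes_by_sphere.get(sphere), [])
        if len(refs) >= min_refs:
            return {
                "sphere": sphere,
                "shift_ppm": fmean(refs),
                "uncertainty_ppm": stdev(refs),  # assumed: spread of the references
            }
    return None  # no sphere has enough references, so no prediction is made


# Placeholder codes and reference shifts, not real NMRShiftDB2 entries.
table = {"demo-sphere-6": [128.1, 128.4], "demo-sphere-3": [127.9, 128.3, 128.6, 129.0]}
codes = {6: "demo-sphere-6", 3: "demo-sphere-3"}
match = hose_fallback(codes, table)
means, stds = ensemble_uncertainty([[7.20, 2.10], [7.30, 2.10]])
print(match["sphere"], round(match["shift_ppm"], 2), stds)
```

In this example sphere 6 has only two references and is skipped, so the lookup settles on sphere 3.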

Automated structure verification (ASV)

Multi-test ASV scorer — verify_structure(spectrum, proposed_smiles, prior_confidence=0.5, tests=None, options=None) scores how well a proposed structure explains an experimental 1-D NMR spectrum and combines several independent tests into one auditable posterior confidence. Four tests ship: PredictionBoundsTest, AssignmentsTest, HSQC2DRangesTest, MSMoleculeMatchTest, each returning a TestResult (score, significance, quality = score · tanh(significance/3), diagnostic). Bayesian log-odds combination (logit(p_post) = logit(prior) + Σ quality_i · ln10); verdict thresholds 0.80 (consistent) / 0.20 (inconsistent). Tests with no data abstain rather than fabricate evidence; a per-test error degrades to an abstain. Grounded in published ASV / CASE literature (Golotvin & Williams; Elyashberg et al.); no vendor scoring scheme is reproduced. (v0.8.0, 2026-06-03)
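
The combination rule quoted above can be checked by hand with a small sketch. It implements quality = score · tanh(significance/3) and logit(p_post) = logit(prior) + Σ quality_i · ln10 as written; the signed score range (-1 to +1), the inclusive threshold comparisons, the "inconclusive" label for the middle band, and the function names are assumptions that are not confirmed by this guide.

```python
import math
from dataclasses import dataclass
from typing import Optional


@dataclass
class TestResult:
    """Minimal stand-in for the scorer's per-test result."""

    __test__ = False  # keeps pytest from collecting this dataclass

    name: str
    score: Optional[float]  # assumed signed in [-1, 1]; None means the test abstained
    significance: float = 0.0

    @property
    def quality(self):
        if self.score is None:
            return 0.0  # an abstaining test contributes no evidence
        return self.score * math.tanh(self.significance / 3.0)


def combine_posterior(prior, results):
    """Bayesian log-odds combination of independent test qualities."""
    logit = math.log(prior / (1.0 - prior))
    logit += sum(r.quality for r in results) * math.log(10.0)
    return 1.0 / (1.0 + math.exp(-logit))


def verdict(posterior):
    if posterior >= 0.80:
        return "consistent"
    if posterior <= 0.20:
        return "inconsistent"
    return "inconclusive"  # hypothetical label for the middle band


results = [
    TestResult("PredictionBoundsTest", score=0.9, significance=6.0),
    TestResult("AssignmentsTest", score=0.4, significance=2.0),
    TestResult("MSMoleculeMatchTest", score=None),  # no MS data supplied
]
posterior = combine_posterior(0.5, results)
print(round(posterior, 3), verdict(posterior))
```

Each unit of quality shifts the odds by a factor of ten, so one fully supportive, highly significant test moves a 0.5 prior to roughly 0.91.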

Spectrum retrieval — vector plus set similarity

FAISS HNSW similarity layer — moltrace.spectroscopy.similarity provides a Gaussian-smoothed 256-D spectral encoding [v_1H(128); v_13C(128)] with FAISS HNSW L2 retrieval, plus a Kuhn-Munkres set-similarity score (scipy.optimize.linear_sum_assignment; unmatched peaks allowed → robust to insertion/deletion). Performance: top-100 from 45 k in ≈ 2 ms (target was < 1 s). Implements the NMR-Solver methodology (Jin et al., arXiv:2509.00640, 2025) from the published equations. (v0.8.1, 2026-06-03)
POST /spectrum/retrieve endpoint — the similarity layer becomes a typed API. The endpoint matches a query spectrum (¹H/¹³C shift lists or a SMILES) against the server-configured FAISS index (MOLTRACE_SIMILARITY_INDEX) and returns the top-k nearest reference spectra by L2 distance. Graceful index_available=false when unset; one spectrum.retrieve audit event per call. (v0.8.2, 2026-06-03)
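
The two similarity ideas above can be sketched without a FAISS index. The block below builds a 256-D Gaussian-smoothed encoding and a Kuhn-Munkres matching cost that tolerates unmatched peaks. It is an illustration only: the ppm ranges, Gaussian widths, per-nucleus normalisation, and the flat unmatched-peak cost are assumptions, and the cost function is a generic padded assignment rather than the published NMR-Solver equations.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment


def encode(shifts_ppm, lo, hi, sigma_ppm, n_bins=128):
    """Sum of Gaussians centred on each shift, sampled on n_bins points, L2-normalised."""
    centers = np.linspace(lo, hi, n_bins)
    vec = np.zeros(n_bins)
    for shift in shifts_ppm:
        vec += np.exp(-0.5 * ((centers - shift) / sigma_ppm) ** 2)
    norm = np.linalg.norm(vec)
    return vec / norm if norm > 0 else vec


def encode_spectrum(h_shifts, c_shifts):
    """Concatenate the 1H and 13C halves into one 256-D vector (assumed ranges and widths)."""
    v_h = encode(h_shifts, 0.0, 12.0, sigma_ppm=0.1)
    v_c = encode(c_shifts, 0.0, 220.0, sigma_ppm=2.0)
    return np.concatenate([v_h, v_c])


def set_distance(query, reference, unmatched_cost=1.0):
    """Minimum-cost peak matching; any peak may stay unmatched at a flat cost."""
    q = np.asarray(query, dtype=float)
    r = np.asarray(reference, dtype=float)
    n, m = len(q), len(r)
    if n + m == 0:
        return 0.0
    # Pad to a square matrix: extra columns/rows act as "unmatched" slots.
    cost = np.full((n + m, n + m), unmatched_cost)
    cost[:n, :m] = np.abs(q[:, None] - r[None, :])  # real-to-real shift difference
    cost[n:, m:] = 0.0  # pairing two padding slots is free
    rows, cols = linear_sum_assignment(cost)
    return float(cost[rows, cols].sum())


vector = encode_spectrum([7.26, 3.50], [77.2, 55.0])
print(vector.shape, set_distance([1.0, 2.0, 7.0], [1.0, 2.0]))
```

Lower values mean closer spectra. The extra 7.0 ppm peak costs one unmatched penalty instead of forcing a bad pairing, which is what makes set matching robust to insertions and deletions.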

Release timeline

A chronological summary; see each subsection above for substantive detail.

Version	Date	Headline
v0.8.2	2026-06-03	`POST /spectrum/retrieve` endpoint (similarity retrieval contract)
v0.8.1	2026-06-03	FAISS HNSW spectrum retrieval (vector + set similarity)
v0.8.0	2026-06-03	Multi-test ASV verification scorer
v0.7.9	2026-06-01	NMRNet wrapper reworked (local-first, conformer-ensemble uncertainty)
v0.7.8	2026-06-01	NMRNet chemical-shift prediction wrapper + HOSE-code fallback
v0.7.6	2026-05-31	Karplus validation corpus scaled to 18 molecules
v0.7.5	2026-05-30	Boltzmann conformer-population weighting (sugar blind-spot fix)
v0.7.4	2026-05-30	Opt-in Haasnoot–Altona Karplus + honest negative result
v0.7.3	2026-05-28	Karplus vicinal-³J validation corpus + accuracy gate
v0.7.2	2026-05-28	Opt-in Karplus ³J refinement for Layer 40 vicinal couplings
v0.7.1	2026-05-28	Multiplet J-coupling → unified-confidence evidence layer
v0.7.0	2026-05-28	Multiplet analysis with GSD-enhanced J-coupling
v0.6.2	2026-05-28	100-fixture real-instrument HMDB corpus
v0.6.1	2026-05-28	Per-peak QC metrics + legacy parity
v0.6.0	2026-05-28	Validation framework + strict promotion gate cleared
v0.5.0	2026-05-27	Algorithm semantics + envelope unification
v0.4.0	2026-05-27	Prompt 3 GSD backend launch