Preclinical Study Rescue: 3 Re-evaluation Strategies
For researchers facing false negatives in preclinical studies. Analyze failure causes and use 3 re-evaluation strategies: STAM, AI pathology, PRO-C3.
In drug development, one of the most regrettable outcomes is a false negative: concluding that a drug has "no effect" when the real cause is an inadequate study system rather than the drug itself.
Especially in complex disease areas like NASH (MASH), IPF, and renal fibrosis, lack of reproducibility and unexpected negative data remain a continuing concern. The broader preclinical reproducibility problem has been documented from multiple angles: 89% (47/53) non-reproducibility in preclinical cancer findings[ref-begley], ~65% in Bayer's internal target-validation audit[ref-prinz], and an estimated US$28B/year of irreproducible US preclinical research spend[ref-freedman]. Sentiments like "Results do not come out as in the paper" or "No significant difference was found in another company's study" can be viewed as concrete manifestations of these structural challenges.
In many cases, re-verifying data reveals that the problem lies in the "Method" rather than the "Molecule". In this article, we explain perspectives and strategies to scientifically "re-evaluate" failed studies and rediscover the value of buried promising compounds.
1. Why Do Studies Fail? The Trap of "Model Selection Mistakes"
Many cases of "no reproducibility" share a common cause: a mismatch between the drug's Mechanism of Action (MoA) and the pathophysiology of the animal model.
Lack of Metabolic Background (MASH/NASH Example)
For example, are you using the MCD (Methionine Choline Deficient) diet model to evaluate a drug that is expected to reduce inflammation and fibrosis by improving insulin resistance? The MCD model produces severe fibrosis in a short period, but it is accompanied by significant weight loss and hypoglycemia. Because the obesity and insulin resistance underlying human NASH are largely absent, MCD is generally considered poorly suited for evaluating metabolic drugs such as GLP-1 receptor agonists or SGLT2 inhibitors, where false-negative outcomes have been frequently reported[ref-stam-nash][ref-thr-nash]. For MoAs acting through metabolic improvement, models that carry the obesity/metabolic background (e.g., STAM, GAN diet) are typically preferable.
Influence of Spontaneous Healing (IPF Example)
Are you troubled by variability in the control group in the Bleomycin pulmonary fibrosis model using young mice? Bleomycin-induced fibrosis is known to undergo time-dependent spontaneous resolution / remodeling, and young animals in particular can show measurable fibrosis improvement even without drug administration[ref-bleo-time][ref-bleo-young-old]. When this "spontaneous resolution" noise is large, an add-on therapeutic effect can be masked and statistical significance becomes harder to detect. The ATS workshop report on pulmonary fibrosis animal models also flags timing (prophylactic vs therapeutic), route, and age-dependence as critical design factors[ref-ats-bleo].
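To make the masking effect concrete, the following is a minimal sketch of how statistical power falls when group scatter increases while the true drug effect stays the same. It uses a normal-approximation (z-test) formula and treats spontaneous resolution simply as extra standard deviation, which is a simplifying assumption. The function name `approx_power` and all numbers are hypothetical and are not taken from any study cited here.

```python
from statistics import NormalDist

def approx_power(delta, sd, n_per_group, alpha=0.05):
    """Approximate power of a two-sided, two-sample z-test with equal group sizes."""
    z = NormalDist()
    se = sd * (2 / n_per_group) ** 0.5   # standard error of the difference in means
    z_crit = z.inv_cdf(1 - alpha / 2)    # two-sided critical value
    shift = abs(delta) / se              # standardized true effect
    # Probability of rejecting the null in either tail
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)

# Hypothetical scenario: identical true effect, two levels of scatter
low_noise = approx_power(delta=1.0, sd=1.0, n_per_group=10)
high_noise = approx_power(delta=1.0, sd=2.0, n_per_group=10)
print(f"power with low scatter:  {low_noise:.2f}")
print(f"power with high scatter: {high_noise:.2f}")
```

In practice a t-test based calculation with pilot or historical variance estimates would be more appropriate; the sketch only illustrates the direction of the effect.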
Comparison of Clinical Correlation
| Feature | MCD Diet Model (Traditional) | STAM™ Model (Metabolic background + fibrosis progression) | Re-evaluation Perspective |
|---|---|---|---|
| Metabolic Background | Weight loss, Hypoglycemia | STZ + HFD-derived diabetes / dyslipidemia (obesity is limited due to the STZ component)[ref-stam-nash] | MCD can be poorly suited for evaluating metabolic drugs; consider STAM or related models for MoAs requiring a metabolic background. |
| Fibrosis Progression | Rapid but artificial | Progresses over several weeks (reported timing varies by institution and strain) | Choose based on whether human-like progression is required by your MoA. |
| Clinical Correlation | Low (Similar pathology only) | Combined histology + metabolic readouts are feasible; head-to-head comparison data are limited and partly rely on in-house data from the providing institution | When prioritising clinical predictability, also request historical data disclosure from the CRO when selecting a model. |
2. Limits of "Analog Eyes": Insufficient Sensitivity of Evaluation Systems
Even if you choose an appropriate model, you cannot scoop up gold dust (efficacy) if the evaluation tool is a "colander."
Conventional semi-quantitative scoring (stage classification like 0-4) by pathologists is the standard histological method, but it benefits substantially from being complemented with quantitative image analysis and biomarker readouts; when used alone it has the following recognised limitations[ref-ai-pathology][ref-qfibrosis]:
- Inter-observer variability: Scores change depending on the evaluator and on when the slide is read.
- Low sensitivity (Discontinuity): Even if fibrosis area decreases from 15% to 10%, both values can fall into the same "Stage 3", so the result risks being reported as "No change."
A small but real drug effect is therefore missed. This is very often what actually lies behind negative data.
*Concept diagram: the bars represent score-based evaluation by humans (differences are hard to see); the line represents quantitative analysis by AI (differences are detected sensitively). Not actual data.
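The discontinuity problem described above can be sketched in a few lines. The stage cut-offs below are hypothetical, chosen only so that 15% and 10% fibrosis area land in the same stage as in the example; real staging systems are based on architectural pattern rather than area alone.

```python
from bisect import bisect_right

# Hypothetical cut-offs (percent fibrosis area) separating stages 0-4.
# These are illustrative values, not a published staging scheme.
STAGE_CUTOFFS = [1.0, 5.0, 9.0, 20.0]

def stage_from_area(area_pct):
    """Map a continuous fibrosis area (%) to an ordinal stage 0-4."""
    return bisect_right(STAGE_CUTOFFS, area_pct)

vehicle_area, treated_area = 15.0, 10.0   # values from the example in the text

vehicle_stage = stage_from_area(vehicle_area)
treated_stage = stage_from_area(treated_area)
relative_reduction = (vehicle_area - treated_area) / vehicle_area

print("stage (vehicle, treated):", vehicle_stage, treated_stage)
print(f"relative reduction in area: {relative_reduction:.0%}")
```

The ordinal readout reports no difference between the two groups, while the continuous readout retains the reduction in area.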
3. 3 Re-evaluation Strategies: Picking Up Signals with High-Precision Technology
To rebuild a failed program, you need to review the pathology with a higher resolution lens.
Strategy A: Switching to a Clinically Correlated Model (STAM™)
If MCD or similar models led to false-negative outcomes for a metabolic-background MoA, re-testing in the STAM™ model or GAN diet model is a reasonable option. These models have been reported to support fibrosis progression under a metabolic background (STZ + HFD in the case of STAM) and may be better suited to evaluating metabolic-improver MoAs[ref-stam-nash][ref-thr-nash]. Individual variability and head-to-head comparability between models depend on institutional data and operational conditions, so request historical CV / Vehicle-stability data from the CRO when selecting a model.
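When historical vehicle data are obtained, a first check is the coefficient of variation (CV) of the vehicle group in each past study. The sketch below shows that calculation; the study names and fibrosis-area values are hypothetical placeholders, and which CV is acceptable is a judgment that depends on the endpoint and the expected effect size.

```python
from statistics import mean, stdev

def cv_percent(values):
    """Coefficient of variation (%) using the sample standard deviation."""
    return 100 * stdev(values) / mean(values)

# Hypothetical vehicle-group fibrosis area (%) from three past studies
historical_vehicle = {
    "study_1": [1.9, 2.3, 2.1, 2.6, 2.0, 2.4],
    "study_2": [2.2, 2.8, 1.7, 2.5, 2.1, 2.9],
    "study_3": [2.0, 2.2, 2.4, 2.1, 2.3, 2.5],
}

for study, values in historical_vehicle.items():
    print(f"{study}: mean={mean(values):.2f}  CV={cv_percent(values):.1f}%")
```

Comparing both the CV within each study and the drift of the vehicle mean across studies gives a rough picture of how stable the model is at a given site.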
Strategy B: Slide Re-evaluation by AI Pathology Analysis
If "trend was seen but no significant difference," AI image analysis of existing slides can be a useful re-analysis approach before launching a new animal study. Continuous-value quantification of fibrosis has been reported to capture subtle changes that semi-quantitative scoring may miss[ref-ai-pathology][ref-qfibrosis][ref-auto-fibrosis]. AI re-analysis is not guaranteed to yield statistical significance — outcomes depend on the source-data signal-to-noise, sample size, and analysis pipeline.
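Once continuous fibrosis-area values are available from image analysis, they can be compared between groups directly. As one possible approach, the sketch below runs an exact two-sided permutation test on the difference in means, which makes few distributional assumptions and is feasible for the small group sizes typical of animal studies. The function name and the data are hypothetical; the appropriate test for a real study should be fixed in the analysis plan.

```python
from itertools import combinations
from statistics import mean

def exact_permutation_p(group_a, group_b):
    """Two-sided exact permutation p-value for a difference in group means."""
    pooled = list(group_a) + list(group_b)
    n_a, n_total = len(group_a), len(group_a) + len(group_b)
    observed = abs(mean(group_a) - mean(group_b))
    total = sum(pooled)
    hits = count = 0
    # Enumerate every way of assigning n_a animals to group A
    for idx in combinations(range(n_total), n_a):
        sum_a = sum(pooled[i] for i in idx)
        diff = abs(sum_a / n_a - (total - sum_a) / (n_total - n_a))
        count += 1
        if diff >= observed - 1e-12:   # small tolerance for float rounding
            hits += 1
    return hits / count

# Hypothetical fibrosis area (%) per animal, quantified from existing slides
vehicle = [14.8, 16.1, 13.9, 15.5, 14.2]
treated = [10.9, 12.3, 9.8, 11.4, 14.0]

p_value = exact_permutation_p(vehicle, treated)
print(f"exact permutation p-value: {p_value:.4f}")
```

With five animals per group there are only 252 possible assignments, so full enumeration is fast; for larger groups a random-sampling permutation test would be needed instead.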
Strategy C: Utilization of High-Sensitivity Biomarkers
Fibrogenic activity may already be changing before the histological appearance changes. Dynamic biomarkers such as PRO-C3 (Type III collagen formation marker) may provide supportive / exploratory evidence of altered fibrogenesis at early time points where histology scores remain unchanged[ref-proc3-adapt][ref-proc3-collagen][ref-proc3-mre]. PRO-C3 should be positioned as a complementary biomarker to histology, not as standalone proof of antifibrotic activity.
4. Decision Tree for Re-evaluation
We organized what action to take according to the type of "failure" you are facing.
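As a rough sketch, the mapping from failure type to first action can be written as a small lookup. The branches below are assembled only from Strategies A to C in Section 3; the category names are hypothetical labels introduced for illustration, and this is not a reproduction of the original decision-tree figure.

```python
from enum import Enum

class FailureType(Enum):
    # Hypothetical labels summarizing the situations described in Section 3
    MODEL_MOA_MISMATCH = "metabolic MoA tested in a model lacking metabolic background"
    TREND_NOT_SIGNIFICANT = "trend was seen but no significant difference"
    HISTOLOGY_UNCHANGED_EARLY = "histology score unchanged at an early time point"

SUGGESTED_STRATEGY = {
    FailureType.MODEL_MOA_MISMATCH:
        "Strategy A: re-test in a model with metabolic background (e.g., STAM, GAN diet)",
    FailureType.TREND_NOT_SIGNIFICANT:
        "Strategy B: re-analyze existing slides with quantitative AI image analysis",
    FailureType.HISTOLOGY_UNCHANGED_EARLY:
        "Strategy C: add dynamic biomarkers such as PRO-C3 as supportive evidence",
}

def suggest_strategy(failure):
    """Return the first re-evaluation step suggested for a given failure type."""
    return SUGGESTED_STRATEGY[failure]

for failure in FailureType:
    print(f"{failure.value} -> {suggest_strategy(failure)}")
```

The strategies are not mutually exclusive; a program may reasonably combine slide re-analysis with biomarker readouts before committing to a new animal study.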
Conclusion: Negative Data is Not the "End"
The fact that "reproducibility was not obtained" is not a death sentence for the compound. Often it only means that the effect "could not be measured with that study system."
What matters is to break down why the study failed in terms of pathological and metabolic mechanisms, and then rebuild it with appropriate models and up-to-date measurement technologies.
Further Reading
- Basics of Model Selection
- About AI Pathology Analysis
- Biomarker Utilization
Next Step: When reviewing your study design, please refer to the technical explanation pages above. Appropriate selection of evaluation systems is the first step in project regeneration.
References
Reproducibility Crisis
- Begley CG, Ellis LM. Drug development: Raise standards for preclinical cancer research. Nature. 2012;483(7391):531-533. PubMed PMID 22460880
- Prinz F, Schlange T, Asadullah K. Believe it or not: how much can we rely on published data on potential drug targets? Nat Rev Drug Discov. 2011;10(9):712. PubMed PMID 21892149
- Freedman LP, Cockburn IM, Simcoe TS. The Economics of Reproducibility in Preclinical Research. PLoS Biol. 2015;13(6):e1002165. PubMed PMID 26057340
NASH / MASH Models
- Lee S, et al. The Role of the Histone Methyltransferase EZH2 in Liver Inflammation and Fibrosis in STAM NASH Mice. Biology (Basel). 2020;9(5):93. PubMed PMID 32370249 / DOI: 10.3390/biology9050093
- THR-β activation in mouse NASH/fibrosis model. Br J Pharmacol. 2021. PubMed PMID 33655500 / DOI: 10.1111/bph.15427
- Translational mouse model for NASH with advanced fibrosis and atherosclerosis. Cells. 2020;9(9):2014. PubMed PMID 32883049 / DOI: 10.3390/cells9092014
IPF / Bleomycin Models
- Bleomycin model as active IPF-like disease model. PLoS ONE. 2013. PubMed PMID 23565148 / DOI: 10.1371/journal.pone.0059348
- Klee S, et al. Transcriptomic and proteomic profiling of young and old mice in the bleomycin model reveals high similarity. Am J Physiol Lung Cell Mol Physiol. 2023;324(3):L245-L258. PubMed PMID 36625483 / DOI: 10.1152/ajplung.00253.2021
- Young mouse bleomycin model lipid/metabolite study. BMC Pulm Med. 2022. PubMed PMID 35509094 / DOI: 10.1186/s12890-022-01972-6
- Jenkins RG, Moore BB, Chambers RC, et al. An Official American Thoracic Society Workshop Report: Use of Animal Models for the Preclinical Assessment of Potential Therapies for Pulmonary Fibrosis. Am J Respir Cell Mol Biol. 2017;56(5):667-679. PubMed PMID 28459387 / DOI: 10.1165/rcmb.2017-0096ST
AI / Quantitative Pathology
- Abdurrachim D, et al. Utility of AI digital pathology as an aid for pathologists scoring fibrosis in MASH. J Hepatol. 2025;82(5):898-908. PubMed PMID 39612947 / DOI: 10.1016/j.jhep.2024.11.032
- qFibrosis quantitative fibrosis scoring. J Hepatol. 2014. PubMed PMID 24583249 / DOI: 10.1016/j.jhep.2014.02.015
- Automated fibrosis quantification in NAFLD. PubMed PMID 32531442
PRO-C3 / Biomarkers
- PRO-C3 / ADAPT in NASH screening (CENTAUR population). JHEP Rep. 2021. PubMed PMID 34454994 / DOI: 10.1016/j.jhepr.2021.100330
- PRO-C3 and collagen fragments in NASH. Dig Dis Sci. 2018. PubMed PMID 30120271 / DOI: 10.1007/s10620-018-5219-9
- PRO-C3 and MRE-assessed fibrosis in NAFLD. Hepatology. 2019. PubMed PMID 30859582 / DOI: 10.1002/hep.30455