Preclinical Study Rescue: 3 Re-evaluation Strategies

In drug development, one of the most regrettable outcomes is a false negative: concluding "no effect" because the study system was inadequate, not because the drug itself lacks efficacy.

Especially in complex disease areas like NASH (MASH), IPF, and renal fibrosis, poor reproducibility and unexpected negative data remain a continuing concern. The broader preclinical reproducibility problem has been documented from multiple angles: 47 of 53 (89%) preclinical cancer findings could not be reproduced^{[ref-begley]}, roughly 65% of projects in Bayer's internal target-validation audit were inconsistent with the published data^{[ref-prinz]}, and an estimated US$28B/year is spent in the US on irreproducible preclinical research^{[ref-freedman]}. Sentiments like "Results do not come out as in the paper" or "No significant difference was found in another company's study" can be viewed as concrete manifestations of these structural challenges.

In many cases, re-examining the data reveals that the problem lies in the "Method" rather than the "Molecule". This article explains perspectives and strategies for scientifically re-evaluating failed studies and rediscovering the value of promising compounds that were shelved.

1. Why Do Studies Fail? The Trap of "Model Selection Mistakes"

Many cases of "no reproducibility" share a common cause: a mismatch between the drug's Mechanism of Action (MoA) and the pathophysiology of the animal model.

Lack of Metabolic Background (MASH/NASH Example)

For example, are you using the MCD (methionine- and choline-deficient) diet model to evaluate a drug that aims to reduce inflammation and fibrosis by improving insulin resistance? The MCD model develops severe fibrosis in a short period, but this is accompanied by significant weight loss and hypoglycemia. Because the obesity and insulin resistance underlying human NASH are largely absent, MCD is generally considered poorly suited for evaluating metabolic drugs such as GLP-1 receptor agonists or SGLT2 inhibitors, where false-negative outcomes have been frequently reported^{[ref-stam-nash]}^{[ref-thr-nash]}. For MoAs acting through metabolic improvement, models that carry a metabolic background (e.g., STAM, GAN diet) are typically preferable.

Influence of Spontaneous Healing (IPF Example)

Are you troubled by variability in the control group in the Bleomycin pulmonary fibrosis model using young mice? Bleomycin-induced fibrosis is known to undergo time-dependent spontaneous resolution / remodeling, and young animals in particular can show measurable fibrosis improvement even without drug administration^{[ref-bleo-time]}^{[ref-bleo-young-old]}. When this "spontaneous resolution" noise is large, an add-on therapeutic effect can be masked and statistical significance becomes harder to detect. The ATS workshop report on pulmonary fibrosis animal models also flags timing (prophylactic vs therapeutic), route, and age-dependence as critical design factors^{[ref-ats-bleo]}.
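To make the masking effect concrete, here is a minimal sketch of how statistical power falls as between-animal variability rises. It uses a normal approximation to a two-sided, two-sample comparison; the function name `approx_power`, the effect size, the standard deviations, and the group size are all hypothetical values chosen for illustration, not data from any study.

```python
from statistics import NormalDist


def approx_power(effect, sd, n_per_group, alpha=0.05):
    """Approximate power of a two-sided, two-sample z-test.

    effect: true difference in group means (same units as sd)
    sd: common within-group standard deviation
    n_per_group: animals per group
    """
    z = NormalDist()
    se = sd * (2.0 / n_per_group) ** 0.5      # standard error of the mean difference
    z_crit = z.inv_cdf(1.0 - alpha / 2.0)     # two-sided critical value
    shift = abs(effect) / se                  # standardized true difference
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)


# Hypothetical scenario: the same drug effect, with low vs. high variability
# (for example, when spontaneous resolution inflates the spread).
for sd in (1.0, 2.0):
    power = approx_power(effect=1.5, sd=sd, n_per_group=8)
    print(f"sd={sd}: approximate power = {power:.2f}")
```

With small groups a t-based calculation would be more accurate; the approximation is only meant to show the direction of the effect, namely that doubling the spread sharply lowers the chance of detecting the same drug effect.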

Comparison of Clinical Correlation

Feature	MCD Diet Model (Traditional)	STAM™ Model (Metabolic background + fibrosis progression)	Re-evaluation Perspective
Metabolic Background	Weight loss, Hypoglycemia	STZ + HFD-derived diabetes / dyslipidemia (obesity is limited due to the STZ component)^{[ref-stam-nash]}	MCD can be poorly suited for evaluating metabolic drugs; consider STAM or related models for MoAs requiring a metabolic background.
Fibrosis Progression	Rapid but artificial	Progresses over several weeks (reported timing varies by institution and strain)	Choose based on whether human-like progression is required by your MoA.
Clinical Correlation	Low (Similar pathology only)	Combined histology + metabolic readouts are feasible; head-to-head comparison data are limited and partly rely on in-house data from the providing institution	When prioritising clinical predictability, also request historical data disclosure from the CRO when selecting a model.

2. Limits of "Analog Eyes": Insufficient Sensitivity of Evaluation Systems

Even if you choose an appropriate model, you cannot scoop up gold dust (efficacy) if the evaluation tool is a "colander."

Conventional semi-quantitative scoring (stage classification like 0-4) by pathologists is the standard histological method, but it benefits substantially from being complemented with quantitative image analysis and biomarker readouts; when used alone it has the following recognised limitations^{[ref-ai-pathology]}^{[ref-qfibrosis]}:

Inter-observer variability: Score changes depending on the evaluator or timing.
Low sensitivity (discontinuity): Even if fibrosis area decreases from 15% to 10%, both values can fall within the same "Stage 3," so the score reads "no change."

A slight but real drug effect is therefore missed. This is very often the hidden cause behind negative data.
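The discontinuity can be illustrated with a small sketch that maps a continuous fibrosis area onto stages. The cut-offs in `STAGE_UPPER_BOUNDS` are hypothetical and chosen only so that 10% and 15% land in the same stage, as in the example above; they are not a validated scoring system.

```python
from bisect import bisect_right

# Hypothetical stage boundaries in % fibrosis area (illustration only):
# stage 0: <1, stage 1: 1 to <5, stage 2: 5 to <10, stage 3: 10 to <20, stage 4: >=20
STAGE_UPPER_BOUNDS = [1.0, 5.0, 10.0, 20.0]


def stage_from_area(area_pct):
    """Convert a continuous fibrosis area (%) into a 0-4 stage."""
    return bisect_right(STAGE_UPPER_BOUNDS, area_pct)


vehicle_area, treated_area = 15.0, 10.0
relative_reduction = 100.0 * (vehicle_area - treated_area) / vehicle_area

print("vehicle stage:", stage_from_area(vehicle_area))
print("treated stage:", stage_from_area(treated_area))
print(f"relative reduction in area: {relative_reduction:.1f}%")
```

Under these assumed cut-offs, a one-third relative reduction in fibrosis area produces no change in stage, which is the information loss that continuous readouts are meant to avoid.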

*Concept diagram: the bar graph represents semi-quantitative scoring by human readers (differences are hard to see); the line graph represents quantitative analysis by AI (differences are detected more sensitively). Not actual data.

3. 3 Re-evaluation Strategies: Picking Up Signals with High-Precision Technology

To rebuild a failed program, you need to re-examine the pathology through a higher-resolution lens.

Strategy A: Switching to Clinical Correlation Model (STAM™)

If MCD or similar models led to false-negative outcomes for a metabolic-background MoA, re-testing in the STAM™ model or GAN diet model is a reasonable option. These models have been reported to support fibrosis progression under a metabolic background (STZ + HFD in the case of STAM™) and may be better suited to evaluating metabolic-improver MoAs^{[ref-stam-nash]}^{[ref-thr-nash]}. Individual variability and head-to-head comparability between models depend on institutional data and operational conditions, so request historical CV / vehicle-stability data from the CRO when selecting a model.
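When historical vehicle-group data arrive from a CRO, the coefficient of variation (CV) is a quick first check. The sketch below assumes the data are supplied as per-animal fibrosis area values grouped by study; the study names and numbers in `historical_vehicle` are made up for illustration.

```python
from statistics import mean, stdev


def cv_percent(values):
    """Coefficient of variation (%) = sample SD / mean * 100."""
    return 100.0 * stdev(values) / mean(values)


# Hypothetical vehicle-group fibrosis area (%) from three past studies.
historical_vehicle = {
    "study_1": [8.0, 10.0, 12.0, 9.0, 11.0],
    "study_2": [6.0, 14.0, 9.0, 12.0, 5.0],
    "study_3": [9.5, 10.5, 10.0, 11.0, 9.0],
}

for study, values in historical_vehicle.items():
    print(f"{study}: mean = {mean(values):.1f}%, CV = {cv_percent(values):.1f}%")
```

A vehicle group whose mean or CV drifts widely between studies suggests that a modest treatment effect may be hard to detect in that system without larger groups.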

Strategy B: Slide Re-evaluation by AI Pathology Analysis

If a trend was seen but no significant difference was found, AI image analysis of the existing slides can be a useful re-analysis approach before launching a new animal study. Continuous-value quantification of fibrosis has been reported to capture subtle changes that semi-quantitative scoring may miss^{[ref-ai-pathology]}^{[ref-qfibrosis]}^{[ref-auto-fibrosis]}. AI re-analysis is not guaranteed to yield statistical significance; outcomes depend on the signal-to-noise ratio of the source data, the sample size, and the analysis pipeline.
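As a rough idea of what a continuous readout looks like, the sketch below computes the percentage of tissue pixels classified as stained. It assumes that segmentation has already produced two binary masks (stained pixels and tissue pixels); real pipelines involve stain separation, artifact exclusion, and validation that are not shown here, and the function name `percent_positive_area` is hypothetical.

```python
def percent_positive_area(stain_mask, tissue_mask):
    """Percentage of tissue pixels that are marked as stained.

    Both masks are 2D lists of 0/1 with identical shape.
    Pixels outside the tissue mask (background) are ignored.
    """
    tissue_pixels = 0
    stained_pixels = 0
    for stain_row, tissue_row in zip(stain_mask, tissue_mask):
        for stained, in_tissue in zip(stain_row, tissue_row):
            if in_tissue:
                tissue_pixels += 1
                if stained:
                    stained_pixels += 1
    if tissue_pixels == 0:
        raise ValueError("tissue mask contains no tissue pixels")
    return 100.0 * stained_pixels / tissue_pixels


# Toy 2 x 4 example (hypothetical masks, not real image data).
stain = [[1, 0, 0, 0],
         [1, 1, 0, 0]]
tissue = [[1, 1, 1, 0],
          [1, 1, 1, 1]]
print(f"positive area: {percent_positive_area(stain, tissue):.1f}% of tissue")
```

Because the result is a continuous value per slide, it can be compared between groups with standard statistics rather than being collapsed into a stage first.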

Strategy C: Utilization of High-Sensitivity Biomarkers

Fibrogenic activity may change before the histological appearance does. Dynamic biomarkers such as PRO-C3 (a type III collagen formation marker) may provide supportive / exploratory evidence of altered fibrogenesis at early time points where histology scores remain unchanged^{[ref-proc3-adapt]}^{[ref-proc3-collagen]}^{[ref-proc3-mre]}. PRO-C3 should be positioned as a complementary biomarker to histology, not as standalone proof of antifibrotic activity.

4. Decision Tree for Re-evaluation

The action to take depends on the type of "failure" you are facing.
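One way to express this mapping is as a small rule-based function. This is a sketch assembled from Strategies A to C as described in this article, not a reproduction of a published decision tree; the function name `suggest_strategies` and its yes/no inputs are assumptions made for illustration.

```python
def suggest_strategies(moa_needs_metabolic_background,
                       model_has_metabolic_background,
                       trend_without_significance,
                       histology_unchanged_at_early_timepoint):
    """Map the observed type of failure to candidate re-evaluation strategies."""
    suggestions = []
    if moa_needs_metabolic_background and not model_has_metabolic_background:
        # MoA-model mismatch, e.g. a metabolic drug tested in the MCD model.
        suggestions.append("Strategy A: re-test in a model with a metabolic background")
    if trend_without_significance:
        # Existing slides may hold signal that stage scoring flattened.
        suggestions.append("Strategy B: re-analyze existing slides quantitatively")
    if histology_unchanged_at_early_timepoint:
        # Fibrogenesis may have shifted before histology changed.
        suggestions.append("Strategy C: add dynamic biomarkers as supportive evidence")
    if not suggestions:
        suggestions.append("No pattern matched: review dose, exposure, and study conduct")
    return suggestions


# Hypothetical case: a metabolic drug tested in MCD with a non-significant trend.
for line in suggest_strategies(True, False, True, False):
    print(line)
```

The strategies are not mutually exclusive, so the function returns every one that applies rather than a single answer.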

Conclusion: Negative Data is Not the "End"

The fact that "reproducibility was not obtained" is not a death sentence for the compound. Often it means only that the effect could not be measured with that study system.

The important thing is to break down why the study failed in terms of pathological and metabolic mechanisms, and then rebuild with appropriate models and current measurement technologies.

References

Reproducibility Crisis

Begley CG, Ellis LM. Drug development: Raise standards for preclinical cancer research. Nature. 2012;483(7391):531-533. PubMed PMID 22460880
Prinz F, Schlange T, Asadullah K. Believe it or not: how much can we rely on published data on potential drug targets? Nat Rev Drug Discov. 2011;10(9):712. PubMed PMID 21892149
Freedman LP, Cockburn IM, Simcoe TS. The Economics of Reproducibility in Preclinical Research. PLoS Biol. 2015;13(6):e1002165. PubMed PMID 26057340

NASH / MASH Models

Lee S, et al. The Role of the Histone Methyltransferase EZH2 in Liver Inflammation and Fibrosis in STAM NASH Mice. Biology (Basel). 2020;9(5):93. PubMed PMID 32370249 / DOI: 10.3390/biology9050093
THR-β activation in mouse NASH/fibrosis model. Br J Pharmacol. 2021. PubMed PMID 33655500 / DOI: 10.1111/bph.15427
Translational mouse model for NASH with advanced fibrosis and atherosclerosis. Cells. 2020;9(9):2014. PubMed PMID 32883049 / DOI: 10.3390/cells9092014

IPF / Bleomycin Models

Bleomycin model as active IPF-like disease model. PLoS ONE. 2013. PubMed PMID 23565148 / DOI: 10.1371/journal.pone.0059348
Klee S, et al. Transcriptomic and proteomic profiling of young and old mice in the bleomycin model reveals high similarity. Am J Physiol Lung Cell Mol Physiol. 2023;324(3):L245-L258. PubMed PMID 36625483 / DOI: 10.1152/ajplung.00253.2021
Young mouse bleomycin model lipid/metabolite study. BMC Pulm Med. 2022. PubMed PMID 35509094 / DOI: 10.1186/s12890-022-01972-6
Jenkins RG, Moore BB, Chambers RC, et al. An Official American Thoracic Society Workshop Report: Use of Animal Models for the Preclinical Assessment of Potential Therapies for Pulmonary Fibrosis. Am J Respir Cell Mol Biol. 2017;56(5):667-679. PubMed PMID 28459387 / DOI: 10.1165/rcmb.2017-0096ST

AI / Quantitative Pathology

Abdurrachim D, et al. Utility of AI digital pathology as an aid for pathologists scoring fibrosis in MASH. J Hepatol. 2025;82(5):898-908. PubMed PMID 39612947 / DOI: 10.1016/j.jhep.2024.11.032
qFibrosis quantitative fibrosis scoring. J Hepatol. 2014. PubMed PMID 24583249 / DOI: 10.1016/j.jhep.2014.02.015
Automated fibrosis quantification in NAFLD. PubMed PMID 32531442

PRO-C3 / Biomarkers

PRO-C3 / ADAPT in NASH screening (CENTAUR population). JHEP Rep. 2021. PubMed PMID 34454994 / DOI: 10.1016/j.jhepr.2021.100330
PRO-C3 and collagen fragments in NASH. Dig Dis Sci. 2018. PubMed PMID 30120271 / DOI: 10.1007/s10620-018-5219-9
PRO-C3 and MRE-assessed fibrosis in NAFLD. Hepatology. 2019. PubMed PMID 30859582 / DOI: 10.1002/hep.30455
