Article
2026-02-12

Bringing Liver Biopsy to Preclinical Studies — The Value of Biopsy-Confirmed Study Design

Incorporating in-life liver biopsy into preclinical MASH studies enables paired (intra-individual) analysis and brings the study design closer to that of clinical trials. We detail the surgical technique, statistical advantages, and AI-powered scoring of biopsy-confirmed study designs.

Reviewed by Fibrosis-Inflammation Lab Scientific Team

The Design Gap Between Clinical and Preclinical Trials

In MASH (metabolic dysfunction-associated steatohepatitis) clinical trials, paired biopsy, meaning a comparison of liver biopsies taken before and after treatment, is the standard histological assessment. Liver tissue from the same patient is compared directly to determine whether fibrosis stage or NAS (NAFLD Activity Score) has improved.

Preclinical studies are different. Most animal studies rely on terminal-only assessment: each animal's pre-treatment state is never measured and is instead inferred from group averages.

This design gap creates the following problems.

The Limitations of Terminal-Only Assessment

Problem 1: Individual Variability Becomes "Noise"

In diet-induced MASH models (e.g., GAN diet), high inter-individual variability is well-documented despite identical protocols.

For example, after 24 weeks on GAN diet:

  • Mouse A reaches NAS=6, Fibrosis F2
  • Mouse B stays at NAS=3, Fibrosis F1
  • Mouse C progresses to NAS=7, Fibrosis F3

With this much variability, group comparisons need substantially larger sample sizes to detect a treatment effect of a given size.
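
The link between variability and sample size can be made concrete with a rough calculation. The following is a minimal sketch, assuming a two-sided comparison of two group means and the standard normal-approximation formula; the function name `animals_per_group`, the target difference of 1.5 NAS points, and the standard deviations are illustrative assumptions, not values from any study.

```python
from math import ceil
from statistics import NormalDist


def animals_per_group(delta, sd, alpha=0.05, power=0.80):
    """Approximate animals per group for a two-sided, two-group comparison.

    Uses the normal approximation n = 2 * ((z_alpha + z_beta) * sd / delta)^2.
    delta: difference in means to detect; sd: common standard deviation.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value for the two-sided test
    z_beta = z.inv_cdf(power)           # quantile matching the desired power
    return ceil(2 * ((z_alpha + z_beta) * sd / delta) ** 2)


# Hypothetical inputs: detect a 1.5-point NAS difference at increasing variability.
for sd in (1.0, 1.5, 2.0):
    print(f"SD = {sd}: about {animals_per_group(delta=1.5, sd=sd)} animals per group")
```

Because the required number scales with the square of the standard deviation, doubling the variability roughly quadruples the animals needed. An exact t-based calculation would give slightly larger numbers for small groups.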

Problem 2: Cannot Distinguish "Improvement" from "Never Progressed"

With terminal-only data, a mouse with NAS=3 at study end could be either "successfully treated (NAS 6→3)" or "inherently mild (NAS 3→3)" — there is no way to tell.

Problem 3: Misalignment with Clinical Endpoints

Clinical trial primary endpoints are defined by intra-individual change — e.g., "≥2-point NAS improvement without fibrosis worsening" or "≥1-stage fibrosis improvement." Terminal-only preclinical data cannot directly replicate these endpoints.

The Solution: Biopsy-Confirmed Study Design

These problems are addressed by incorporating in-life liver biopsy into the study design.

Procedure Overview

  1. MASH Induction (12-16 weeks): Establish MASH using GAN diet
  2. In-life Liver Biopsy (Wedge Biopsy): Remove approximately 30-50 mg (less than 5% of total liver) as a wedge from the left lateral lobe
  3. Pathology Confirmation + Stratification: Evaluate biopsy tissue, select only animals meeting inclusion criteria (e.g., Steatosis ≥2, Fibrosis ≥1), then stratify by disease severity across groups
  4. Treatment Period (4-8 weeks): Compound administration
  5. Terminal Assessment: Directly compare terminal liver tissue against baseline (Paired Analysis)
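
Step 3 above (inclusion plus stratification) can be sketched in code. This is one possible approach under stated assumptions, not a description of any facility's actual procedure: the `BaselineBiopsy` record, the ranking by fibrosis stage and then NAS, and the block-wise random allocation are all hypothetical choices made for illustration. Only the example thresholds (Steatosis ≥2, Fibrosis ≥1) come from the text.

```python
import random
from dataclasses import dataclass


@dataclass
class BaselineBiopsy:
    animal_id: str
    steatosis: int  # steatosis grade from the baseline biopsy
    fibrosis: int   # fibrosis stage from the baseline biopsy
    nas: int        # NAFLD Activity Score from the baseline biopsy


def meets_inclusion(b, min_steatosis=2, min_fibrosis=1):
    """Example inclusion criteria from the text: Steatosis >= 2 and Fibrosis >= 1."""
    return b.steatosis >= min_steatosis and b.fibrosis >= min_fibrosis


def stratified_allocation(biopsies, groups, seed=0):
    """Rank eligible animals by severity, then randomize within consecutive blocks.

    Each block holds one animal per group, so every group receives a similar
    mix of mild and severe baseline disease.
    """
    rng = random.Random(seed)
    eligible = sorted(
        (b for b in biopsies if meets_inclusion(b)),
        key=lambda b: (b.fibrosis, b.nas),
        reverse=True,
    )
    allocation = {g: [] for g in groups}
    for start in range(0, len(eligible), len(groups)):
        block = eligible[start:start + len(groups)]
        order = list(groups)
        rng.shuffle(order)  # random group order within each severity block
        for animal, group in zip(block, order):
            allocation[group].append(animal.animal_id)
    return allocation


# Hypothetical baseline cohort; M02 and M05 fail the inclusion criteria.
cohort = [
    BaselineBiopsy("M01", steatosis=3, fibrosis=2, nas=6),
    BaselineBiopsy("M02", steatosis=1, fibrosis=1, nas=3),
    BaselineBiopsy("M03", steatosis=3, fibrosis=3, nas=7),
    BaselineBiopsy("M04", steatosis=2, fibrosis=1, nas=5),
    BaselineBiopsy("M05", steatosis=2, fibrosis=0, nas=4),
    BaselineBiopsy("M06", steatosis=3, fibrosis=1, nas=5),
]
print(stratified_allocation(cohort, ["vehicle", "treatment"]))
```

Fixing the random seed keeps the allocation reproducible for the study record.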

Procedure Safety Profile

Parameter            Details
Tissue Removed       30-50 mg (< 5% of total liver)
Biopsy Site          Left lateral lobe
Hemostasis           Absorbable gelatin sponge
Procedure Duration   ~5-10 min per mouse
Mortality Rate       Reported at 0%
Recovery Time        Normal feeding resumes next day

Statistical Advantage: Why Paired Analysis is Powerful

Unpaired (Conventional) vs. Paired (Biopsy-Confirmed)

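In conventional unpaired comparisons, inter-individual variability within each group inflates the variance and reduces statistical power. In a paired design, each animal serves as its own baseline, so much of that variability cancels out when within-subject change (Δ) is analyzed.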

Conventional (Unpaired):
  Vehicle group: NAS = 5.2 ± 1.8  (terminal)
  Treatment group: NAS = 4.0 ± 1.6  (terminal)
  → p = 0.12 (not significant)

Biopsy-Confirmed (Paired):
  Vehicle group: ΔNAS = +0.3 ± 0.8  (worsened)
  Treatment group: ΔNAS = -1.8 ± 0.9  (improved)
  → p = 0.001 (highly significant)

With a paired design, the same treatment effect can be detected with fewer animals, which supports the Reduction element of the 3Rs (Replacement, Reduction, Refinement).
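
To see why the change-score comparison is more sensitive, the summary statistics above can be converted into standardized effect sizes. The sketch below assumes equal group sizes; the value `n = 10` per group is a hypothetical choice, since the example does not state group sizes, and the code does not attempt to reproduce the p-values shown above.

```python
from math import sqrt


def cohens_d(mean_a, sd_a, mean_b, sd_b):
    """Standardized mean difference, pooling the two SDs (equal group sizes assumed)."""
    pooled_sd = sqrt((sd_a ** 2 + sd_b ** 2) / 2)
    return (mean_a - mean_b) / pooled_sd


def welch_t(mean_a, sd_a, mean_b, sd_b, n):
    """Welch t statistic for two groups of n animals each."""
    return (mean_a - mean_b) / sqrt(sd_a ** 2 / n + sd_b ** 2 / n)


n = 10  # hypothetical animals per group; not stated in the example above

# Terminal-only comparison: vehicle 5.2 ± 1.8 vs. treatment 4.0 ± 1.6
terminal_d = cohens_d(5.2, 1.8, 4.0, 1.6)
terminal_t = welch_t(5.2, 1.8, 4.0, 1.6, n)

# Change-score comparison: vehicle ΔNAS +0.3 ± 0.8 vs. treatment ΔNAS -1.8 ± 0.9
delta_d = cohens_d(0.3, 0.8, -1.8, 0.9)
delta_t = welch_t(0.3, 0.8, -1.8, 0.9, n)

print(f"terminal-only: d = {terminal_d:.2f}, t = {terminal_t:.2f}")
print(f"change score:  d = {delta_d:.2f}, t = {delta_t:.2f}")
```

The change scores have a much smaller spread than the terminal scores, so the standardized effect is several times larger even though the underlying treatment is the same.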

Enabling Responder Analysis

In clinical trials, "the proportion of patients showing ≥2-point NAS improvement (Responder Rate)" is a critical endpoint. Biopsy-confirmed design allows replication of individual-level responder analysis in preclinical settings.
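
A responder analysis on paired preclinical data might look like the following minimal sketch. The `PairedScores` record and the example animals are hypothetical; the two criteria follow the clinical endpoint definitions quoted earlier (≥2-point NAS improvement without fibrosis worsening, and ≥1-stage fibrosis improvement).

```python
from dataclasses import dataclass


@dataclass
class PairedScores:
    animal_id: str
    nas_baseline: int
    nas_terminal: int
    fibrosis_baseline: int
    fibrosis_terminal: int


def is_nas_responder(p):
    """>= 2-point NAS improvement without worsening of fibrosis stage."""
    nas_improved = p.nas_baseline - p.nas_terminal >= 2
    fibrosis_not_worse = p.fibrosis_terminal <= p.fibrosis_baseline
    return nas_improved and fibrosis_not_worse


def is_fibrosis_responder(p):
    """>= 1-stage improvement in fibrosis."""
    return p.fibrosis_baseline - p.fibrosis_terminal >= 1


def responder_rate(animals, criterion):
    """Fraction of animals in a group that satisfy the given criterion."""
    if not animals:
        raise ValueError("group is empty")
    return sum(criterion(a) for a in animals) / len(animals)


# Hypothetical treatment group with baseline (biopsy) and terminal scores.
treated = [
    PairedScores("T01", nas_baseline=6, nas_terminal=3, fibrosis_baseline=2, fibrosis_terminal=2),
    PairedScores("T02", nas_baseline=5, nas_terminal=4, fibrosis_baseline=2, fibrosis_terminal=1),
    PairedScores("T03", nas_baseline=7, nas_terminal=4, fibrosis_baseline=3, fibrosis_terminal=4),
    PairedScores("T04", nas_baseline=6, nas_terminal=4, fibrosis_baseline=2, fibrosis_terminal=1),
]
print("NAS responder rate:", responder_rate(treated, is_nas_responder))
print("Fibrosis responder rate:", responder_rate(treated, is_fibrosis_responder))
```

Animal T03 illustrates why the baseline matters: its NAS drops by 3 points, but it is not counted as a NAS responder because its fibrosis stage worsened. Neither fact is visible in terminal-only data.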

GHOST: AI-Powered Objective Scoring

Gubra has developed a deep learning-based application called GHOST (Gubra Histopathological Objective Scoring Technology) to complement biopsy-confirmed designs.

GHOST enables:

  • Elimination of inter-observer variability in NAS scoring
  • Quantitative, reproducible assessment of steatosis, inflammation, and ballooning
  • Consistent criteria across pre- and post-treatment comparisons

This further strengthens the reliability of paired analysis.

Caveats and Limitations

  1. Local Surgical Effects: Tissue repair at the biopsy site may cause localized fibrosis, so terminal samples should be taken from a different lobe.
  2. Sampling Bias: The liver is not uniformly affected, so lesion distribution may differ between biopsy and terminal sites (notably, this same limitation applies to clinical paired biopsies).
  3. Technical Demands: The procedure requires high surgical skill and post-operative care — not all facilities can perform it.

Conclusion

Biopsy-confirmed study design transforms preclinical MASH research from "group average comparisons" to "individual change tracking."

By aligning with clinical trial design, it helps bridge the translational gap and may increase the Probability of Success (PoS) in clinical development. If the goal is to identify the true efficacy of a compound, this approach deserves serious consideration.



References

  1. Tølbøl KS, et al. World J Gastroenterol. 2018. (GAN DIO-NASH model with biopsy-confirmed design)
  2. Boland ML, et al. J Lipid Res. 2019. (Biopsy-based assessment in GAN model)
  3. Roth JD, et al. PLoS One. 2019. (Biopsy-confirmed ob/ob-NASH model)
  4. Gubra. GHOST: Gubra Histopathological Objective Scoring Technology. (Deep learning-based scoring)
