Graduating from the Ashcroft Score: How AI Pathology Solves Variability in Fibrosis Assessment
'Different scores by different pathologists', 'Lack of reproducibility'. How does digital pathology using AI (deep learning) overcome the limits of subjective scoring? A thorough comparison of manual evaluation and AI analysis, with HALO/QuPath use cases.
Introduction: The Biggest Bottleneck in Fibrosis Assessment
A long-standing headache in the efficacy evaluation of pulmonary fibrosis (e.g., IPF models) and liver fibrosis (e.g., MASH models) is the "subjectivity" and "variability" of pathology assessment.
"Pathologist A graded it a 3, but Pathologist B graded it a 2." Or "The same pathologist reviewed it a week later and changed the score." These fluctuations add noise to compound efficacy data and inflate p-values, raising the risk of Type II errors, in which a promising drug candidate is overlooked.
In this article, we explain the limitations of classical scoring systems like the Ashcroft Score, and the paradigm shift toward "pixel-level full quantitative analysis" using AI (Artificial Intelligence / Deep Learning) technology.
1. The Limits of Traditional "Manual Scoring"
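The Ashcroft Score (a 0-8 grading scale), commonly used to evaluate pulmonary fibrosis in Idiopathic Pulmonary Fibrosis (IPF) models, and the histological scores used for liver models (the NAS for disease activity, with fibrosis staged separately on the NASH CRN scale), are simple to use but suffer from structural flaws.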
① Inter/Intra-observer Variability
No matter how experienced a pathologist is, it is inevitable that their judgment will fluctuate depending on the day or the field of view. Especially in ambiguous cases like the "borderline between grade 3 and 4", the judgment is left to subjectivity.
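Agreement between two readers on an ordinal scale such as Ashcroft is commonly summarized with Cohen's weighted kappa. The following is a minimal sketch, assuming two pathologists graded the same fields; the function name `weighted_kappa` and the example grades are hypothetical and not taken from any study.

```python
import numpy as np

def weighted_kappa(rater_a, rater_b, n_grades=9, weights="quadratic"):
    """Cohen's weighted kappa for two raters on an ordinal 0..n_grades-1 scale."""
    a = np.asarray(rater_a, dtype=int)
    b = np.asarray(rater_b, dtype=int)
    if a.shape != b.shape:
        raise ValueError("both raters must score the same fields")

    # Observed joint distribution of (grade by A, grade by B)
    observed = np.zeros((n_grades, n_grades))
    for i, j in zip(a, b):
        observed[i, j] += 1
    observed /= observed.sum()

    # Joint distribution expected if the raters were independent
    expected = np.outer(observed.sum(axis=1), observed.sum(axis=0))

    # Disagreement weights grow with the distance between grades
    idx = np.arange(n_grades)
    dist = np.abs(idx[:, None] - idx[None, :]) / (n_grades - 1)
    w = dist ** 2 if weights == "quadratic" else dist

    # Undefined (division by zero) if both raters give one identical grade to every field
    return 1.0 - (w * observed).sum() / (w * expected).sum()

# Hypothetical Ashcroft grades for five fields (illustration only)
pathologist_a = [2, 3, 3, 4, 5]
pathologist_b = [2, 2, 3, 4, 4]
kappa = weighted_kappa(pathologist_a, pathologist_b)
print(f"weighted kappa = {kappa:.2f}")
```

A value of 1 means perfect agreement and 0 means agreement no better than chance. The same function can be applied to one pathologist's first and second reads to describe intra-observer variability.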
② Difficulty in "Global" Assessment
It is extremely difficult for a human to uniformly evaluate an entire tissue slide (Whole Slide). Humans are unconsciously biased to focus on "areas with severe lesions" that have strong visual features, underestimating mild but widespread lesions.
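The effect of field selection can be illustrated with a toy calculation. This is a sketch under stated assumptions: each slide is represented by hypothetical per-tile fibrosis fractions (made-up numbers), and "reading only the most striking fields" is approximated by averaging the `k` most fibrotic tiles.

```python
import numpy as np

def whole_slide_mean(tile_fractions):
    """Mean fibrosis fraction over every tile of the slide."""
    return float(np.mean(tile_fractions))

def severe_fov_mean(tile_fractions, k=10):
    """Mean of the k most fibrotic tiles, mimicking a read drawn to striking fields."""
    return float(np.sort(np.asarray(tile_fractions))[-k:].mean())

# Hypothetical slides, 100 tiles each (illustration only)
focal_slide = np.array([0.60] * 10 + [0.05] * 90)  # a few severe foci
diffuse_slide = np.full(100, 0.12)                 # mild but widespread change

for name, tiles in [("focal", focal_slide), ("diffuse", diffuse_slide)]:
    print(f"{name}: whole slide {whole_slide_mean(tiles):.3f}, "
          f"severe fields {severe_fov_mean(tiles):.3f}")
```

In this toy case the diffuse slide carries the higher whole-slide burden, yet a read based on the most severe fields ranks the focal slide far higher.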
2. Third-Generation Analysis: AI & Digital Pathology
Today, advanced image analysis platforms such as HALO® (Indica Labs) and the open-source QuPath are widely used in both preclinical and clinical settings, automating fibrosis assessment and making it more objective.
What is AI Looking At?
The entire glass slide is scanned into a digital WSI (Whole Slide Image), which the AI (a machine learning/deep learning model) then analyzes through the following steps:
- Tissue Segmentation (Tissue Classifier): The AI automatically recognizes the background (blank space), normal alveoli, bronchi, blood vessels, etc., and masks (excludes) them from the evaluation target.
- Pixel-level Quantification: Instead of a vague "3 points", it calculates absolute and continuous numerical values (% Area), such as "Out of a total tissue area of 50 mm², the collagen area stained with Masson's trichrome (blue) is 12.5 mm², i.e. 25% Area."
- Detection of Microstructures: It quantifies slight thickening of alveolar walls or minor differences in interstitial cell density that are easily missed by the human eye.
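The % Area calculation in the steps above can be sketched as follows. This is a minimal illustration, not the HALO or QuPath implementation: it assumes an upstream classifier has already produced two boolean masks (tissue to evaluate, and stained collagen), and the function name `percent_area`, the toy image, and the pixel size are all hypothetical.

```python
import numpy as np

def percent_area(tissue_mask, stain_mask, microns_per_pixel):
    """Return (tissue_mm2, stain_mm2, percent_area) from two boolean masks."""
    tissue = np.asarray(tissue_mask, dtype=bool)
    # Count stained pixels only where the tissue classifier kept the pixel
    stain = np.asarray(stain_mask, dtype=bool) & tissue

    tissue_px = int(tissue.sum())
    if tissue_px == 0:
        raise ValueError("tissue mask is empty")
    stain_px = int(stain.sum())

    # Convert pixel counts to physical area using the scanner resolution
    pixel_area_mm2 = (microns_per_pixel / 1000.0) ** 2
    return (tissue_px * pixel_area_mm2,
            stain_px * pixel_area_mm2,
            100.0 * stain_px / tissue_px)

# Toy 100 x 100 image: every pixel is tissue, the top 25 rows are "collagen"
tissue_mask = np.ones((100, 100), dtype=bool)
collagen_mask = np.zeros((100, 100), dtype=bool)
collagen_mask[:25, :] = True

tissue_mm2, collagen_mm2, pct = percent_area(
    tissue_mask, collagen_mask, microns_per_pixel=0.5
)
print(f"{pct:.1f}% Area")
```

Because the percentage is a ratio of pixel counts, the result is identical on every run for the same masks; the physical areas depend only on the assumed pixel size.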
3. In-Depth Comparison: Manual Evaluation vs AI / Digital Pathology
The table below shows the decisive differences between traditional methods and AI analysis in drug efficacy evaluation.
| Comparison Item | Manual Evaluation (e.g., Ashcroft Score) | AI / Digital Pathology Analysis |
|---|---|---|
| Nature of Assessment | Subjective, Discontinuous (Ordinal: 0, 1, 2...) | Objective, Continuous (Ratio: e.g., 12.5%) |
| Reproducibility | Low to Medium (Depends on evaluator/condition) | Extremely High (Same result for the same image and settings) |
| Sensitivity | Low (Hard to detect fine efficacy) | Extremely High (Detects even slight %Area reduction) |
| Throughput | Slow (Visual check of one slide at a time) | Very Fast (Batch processing via Cloud/GPU) |
| Whole Slide Assessment | Difficult (Tends to sample a few FOVs) | Possible (Full WSI pixel-by-pixel scan) |
| Required N (Statistical Power) | Increased due to large data variance | Reducible due to small CV (Contributes to 3Rs) |
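
The "Required N" row can be made concrete with the standard normal-approximation formula for comparing two group means, n per group = 2 × ((z₁₋α/₂ + z₁₋β) × CV / d)², where d is the relative difference to detect. The sketch below is illustrative only: the function name `animals_per_group` and the CV and effect values are hypothetical, and the normal approximation slightly underestimates N for small groups compared with a t-based calculation.

```python
import math
from statistics import NormalDist

def animals_per_group(cv, relative_effect, alpha=0.05, power=0.80):
    """Approximate animals per group for a two-sided, two-group comparison.

    cv: coefficient of variation of the readout (SD / mean), e.g. 0.4
    relative_effect: difference to detect as a fraction of the mean, e.g. 0.3
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    n = 2 * ((z_alpha + z_beta) * cv / relative_effect) ** 2
    return math.ceil(n)

# Hypothetical readouts: a noisy score (CV 40%) vs a tighter one (CV 20%),
# both aiming to detect a 30% reduction
noisy_n = animals_per_group(cv=0.4, relative_effect=0.3)
tight_n = animals_per_group(cv=0.2, relative_effect=0.3)
print(noisy_n, tight_n)
```

Under these assumptions, halving the CV cuts the required group size to roughly a quarter, since N scales with the square of the CV.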
Case Study: AI Uncovers "Hidden Efficacy"
Here is an example from actual comparison study data evaluating an anti-fibrotic drug administered to a bleomycin pulmonary fibrosis model:
| Evaluation Method | P-value (Placebo vs Treated) | Conclusion |
|---|---|---|
| Ashcroft Score (Human) | p = 0.08 | Not Significant (Large variance, unable to prove efficacy) |
| AI Image Analysis (% Fibrosis Area) | p = 0.03 | Significant (Improved S/N ratio, solid efficacy detected) |
By using AI, the S/N (Signal/Noise) ratio of the data improves, making it possible to detect significant differences that manual scoring misses, or to detect them with a smaller N (number of animals). This directly contributes to both cost reduction and animal welfare (3Rs).
4. "Augmented Intelligence": Collaborative Model of Pathologists and AI
It is often misunderstood, but AI does not take away the jobs of specialized pathologists; rather, it "augments" their capabilities.
- Role of AI: As a tireless calculator, it rapidly and consistently performs whole-tissue area measurement, cell counting, and biomarker positivity calculations.
- Role of the Pathologist: Performs quality control (QC) to confirm that the AI is classifying tissue correctly, excludes artifacts (e.g., tissue folds), and provides the biological interpretation of why such lesions formed.
In advanced drug discovery projects, a hybrid report combining "objective quantitative data by AI" + "findings and discussion by a board-certified Pathologist" provides a robust data package suitable for regulatory submissions (PMDA/FDA).
5. Conclusion: Bringing "Conviction" to Efficacy Evaluation with Objective Data
"The in vivo study did not go well", "The data is all over the place". The cause might not be the compound (drug), but rather the limits of the "evaluation method".
Let's end the era of agonizing over subjective score fluctuations. Objective, reproducible digital pathology data powered by AI will serve as a reliable compass, guiding your drug discovery project to the next phase (clinical trials).
References
- Ashcroft T, et al. Simple method of estimating severity of pulmonary fibrosis on a numerical scale. J Clin Pathol. 1988.
- Hadi AM, et al. Rapid quantification of myocardial fibrosis: a new macro-based automated analysis. Int J Exp Pathol. 2011.
- Brey EM, et al. Automated selection of DAB-labeled tissue for immunohistochemical quantification. J Histochem Cytochem. 2003.