Keratinocyte cancers (KC) are the most prevalent cancers in Australia and pose a growing burden on the healthcare system. Despite their high incidence, KC cases are underrepresented in cancer registries, which complicates efforts to quantify disease burden. Identifying and classifying KC subtypes typically requires manual review of pathology and medical records, a time-consuming process. As a result, valuable clinical information often remains underutilized, limiting the feasibility of large-scale genome-wide association studies (GWAS) and genetic risk score prediction.
Artificial intelligence (AI) offers promising solutions for automating disease phenotyping from unstructured clinical data. Previous work demonstrated that supervised machine learning (ML) could accurately identify KC cases and subtypes from pathology reports. However, these earlier models relied on manual feature engineering and fixed text representations, limiting their adaptability and contextual understanding.
Large language models (LLMs) can overcome these limitations by capturing deeper linguistic context, which supports more accurate classification and information extraction. This study benchmarks LLM performance against earlier machine learning models using pathology data. We assess improvements in classification accuracy, information extraction, and efficiency. This work aims to advance automated diagnostic classification in large-scale health datasets, supporting scalable and efficient phenotype extraction. The resulting phenotype data will allow for more powerful downstream analyses.