AI for Automatic Synoptic Reporting

CAP (College of American Pathologists) forms are standardized cancer reporting protocols that have revolutionized pathology practice by replacing inconsistent narrative reports with structured, synoptic formats containing essential diagnostic and prognostic information. Developed over 35 years ago to address significant variability in cancer reporting, these evidence-based protocols ensure complete, uniform documentation of malignant tumors across all healthcare institutions, directly improving patient outcomes and clinical decision-making.

Lung resection CAP forms are particularly critical in thoracic oncology, providing standardized reporting templates for primary lung cancers that include essential elements such as tumor size, histologic type and grade, surgical margins, lymph node status, and staging classifications. These lung-specific protocols have demonstrated measurable clinical impact, with studies showing that synoptic reporting achieves 88.4% completeness compared to only 2.6% for traditional descriptive reports, leading to more accurate staging, better treatment planning, and improved survival rates.

By establishing consistent terminology and data capture requirements, CAP lung resection forms enhance communication between pathologists and oncologists, ensure regulatory compliance with Commission on Cancer standards, and provide the structured data foundation necessary for personalized cancer care, targeted therapy selection, and multidisciplinary treatment coordination. The widespread adoption of these standardized protocols, supported by electronic integration into laboratory information systems, has positioned pathologists as key members of the lung cancer care team while enabling seamless data exchange for cancer registries, research, and quality improvement initiatives.

At Stanford, CAP forms were implemented within Epic using SmartForms. SmartForms is an Epic product that allows the capture of semi-structured information within the EHR that can later be used to generate free-form reports. In this case, the structured information from the CAP forms is captured on SmartForms, which later generate the synoptic reporting section within the pathology report. As a consequence, for relevant cases, the pathology report will contain this section.

For this task, we aim to use AI to populate the CAP forms automatically by using all the other elements from the pathology report. We formulated this as a question-answering problem, where the context for each question is the entire pathology report, the question is the particular CAP form element, and the answer is the value/selection to be populated.

For this initial experiment, we collected a dataset with 390 patients and 390 lung resection forms. The forms were reported between November 2022 and April 2025. The table below summarizes the demographics for this dataset.

                                             FEMALE         MALE           Overall
                                             (N=237)        (N=153)        (N=390)
Age
  18-44                                      6 (2.5%)       10 (6.5%)      16 (4.1%)
  45-59                                      47 (19.8%)     18 (11.8%)     65 (16.7%)
  60-69                                      72 (30.4%)     42 (27.5%)     114 (29.2%)
  70-79                                      87 (36.7%)     60 (39.2%)     147 (37.7%)
  80+                                        25 (10.5%)     23 (15.0%)     48 (12.3%)
Race
  American Indian or Alaska Native           2 (0.8%)       1 (0.7%)       3 (0.8%)
  Asian                                      74 (31.2%)     49 (32.0%)     123 (31.5%)
  Black or African American                  4 (1.7%)       2 (1.3%)       6 (1.5%)
  Native Hawaiian or Other Pacific Islander  2 (0.8%)       2 (1.3%)       4 (1.0%)
  Unknown                                    19 (8.0%)      21 (13.7%)     40 (10.3%)
  White                                      136 (57.4%)    78 (51.0%)     214 (54.9%)
Ethnicity
  Hispanic or Latino                         20 (8.4%)      10 (6.5%)      30 (7.7%)
  Not Hispanic or Latino                     210 (88.6%)    141 (92.2%)    351 (90.0%)
  Unknown                                    7 (3.0%)       2 (1.3%)       9 (2.3%)

Methods and Results

For this experiment, we used several state-of-the-art LLMs to assess how well they extract the information required for lung resection CAP forms. As mentioned before, we formulated the task as a question-answering problem. Each LLM was tested in a zero-shot setting, with no worked examples in the prompt. For each question, the LLM received the entire pathology report (excluding the synoptic report) together with the instructions from the most recent lung resection CAP form, and was asked to answer that single question. We asked the LLMs to return their answers as JSON to simplify parsing.
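The setup described above can be sketched roughly as follows. This is an illustrative sketch only, not the prompt or parser used in the experiment: the prompt wording, the `{"answer": ...}` JSON shape, and the function names `build_prompt` and `parse_answer` are all assumptions introduced here.

```python
import json


def build_prompt(report_text: str, question: str, cap_instructions: str) -> str:
    """Assemble a zero-shot prompt for one CAP form element (hypothetical wording)."""
    return (
        "You are completing a CAP lung resection synoptic report.\n\n"
        f"CAP instructions:\n{cap_instructions}\n\n"
        f"Pathology report (synoptic section removed):\n{report_text}\n\n"
        f"Question: {question}\n"
        'Respond with JSON only, in the form {"answer": "<value>"}.'
    )


def parse_answer(raw_output: str):
    """Return the "answer" field from a model response, or None if it cannot be parsed."""
    # Models sometimes wrap the JSON in extra text, so keep only the outermost braces.
    start = raw_output.find("{")
    end = raw_output.rfind("}")
    if start == -1 or end <= start:
        return None
    try:
        payload = json.loads(raw_output[start:end + 1])
    except json.JSONDecodeError:
        return None
    if not isinstance(payload, dict):
        return None
    answer = payload.get("answer")
    return None if answer is None else str(answer).strip()
```

Returning `None` on any parsing failure keeps malformed responses distinguishable from genuine answers, so they can be counted separately during evaluation.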

To evaluate the output automatically, we used BERTScore, which measures the semantic similarity between the generated answer and the reference answer. The results of this evaluation are shown in the figure below.
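For reference, the standard BERTScore definition (without the optional importance weighting) is given below. Here $x$ is the tokenized reference answer, $\hat{x}$ is the generated answer, and $\mathbf{x}_i$, $\hat{\mathbf{x}}_j$ are their length-normalized contextual token embeddings. Which of the three quantities was reported in this experiment is not stated here; $F_{\mathrm{BERT}}$ is the commonly used one.

```latex
\begin{align}
R_{\mathrm{BERT}} &= \frac{1}{|x|} \sum_{x_i \in x} \max_{\hat{x}_j \in \hat{x}} \mathbf{x}_i^{\top} \hat{\mathbf{x}}_j \\
P_{\mathrm{BERT}} &= \frac{1}{|\hat{x}|} \sum_{\hat{x}_j \in \hat{x}} \max_{x_i \in x} \mathbf{x}_i^{\top} \hat{\mathbf{x}}_j \\
F_{\mathrm{BERT}} &= \frac{2 \, P_{\mathrm{BERT}} \, R_{\mathrm{BERT}}}{P_{\mathrm{BERT}} + R_{\mathrm{BERT}}}
\end{align}
```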

Here, we can see that Claude 3.7 is slightly better than Llama 4. However, the score distribution for most LLMs is wide, indicating inconsistent performance across questions. To investigate this further, we constructed a heatmap, shown in the figure below. It makes the variability across questions clear and shows that GPT-4 consistently underperforms on this task. The questions themselves and a description of each can be found below the graph.
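One way to build the model-by-question grid behind such a heatmap is to average the per-answer scores within each (model, question) cell. The sketch below assumes the scores are available as `(model, question, score)` tuples; that record layout and the function name `score_matrix` are hypothetical, and the numbers in the usage example are made up for illustration.

```python
from collections import defaultdict
from statistics import mean


def score_matrix(records):
    """Average scores per (model, question) cell.

    records: iterable of (model, question, score) tuples (assumed layout).
    Returns (models, questions, matrix), where matrix[i][j] is the mean score
    of models[i] on questions[j], or None if that cell has no scores.
    """
    cells = defaultdict(list)
    for model, question, score in records:
        cells[(model, question)].append(score)

    models = sorted({m for m, _ in cells})
    questions = sorted({q for _, q in cells})
    matrix = [
        [mean(cells[(m, q)]) if (m, q) in cells else None for q in questions]
        for m in models
    ]
    return models, questions, matrix


# Illustrative usage with made-up scores and placeholder model names.
example = [
    ("model_a", "PROCEDURE", 0.5),
    ("model_a", "PROCEDURE", 1.0),
    ("model_b", "PROCEDURE", 0.25),
    ("model_a", "TUMOR SITE", 1.0),
]
models, questions, matrix = score_matrix(example)
```

The resulting nested list can be passed to any plotting library's heatmap routine, with `models` and `questions` as the axis labels.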

CAP Form Questions

  1. ADDITIONAL FINDINGS - Documents other pathologic findings in the specimen beyond the primary tumor, such as atypical adenomatous hyperplasia, granulomatous inflammation, or emphysema.
  2. CLOSEST MARGIN(S) TO INVASIVE CARCINOMA - Identifies which specific surgical margin (bronchial, vascular, parenchymal, or chest wall) is nearest to the invasive tumor.
  3. DISTANCE FROM INVASIVE CARCINOMA TO CLOSEST MARGIN - Measures in centimeters how far the invasive tumor extends from the nearest surgical margin.
  4. HISTOLOGIC GRADE - Assesses the degree of tumor differentiation using grading schemes specific to tumor type (G1-well differentiated to G4-undifferentiated).
  5. LYMPH NODE(S) FROM PRIOR PROCEDURES - Documents whether lymph nodes from previous surgical procedures are included in the current specimen.
  6. LYMPHOVASCULAR INVASION - Reports the presence of tumor cells within lymphatic vessels, arteries, or veins.
  7. MARGIN STATUS FOR INVASIVE CARCINOMA - Determines whether invasive tumor is present at any surgical margin or if all margins are negative.
  8. MARGIN STATUS FOR NON-INVASIVE TUMOR - Assesses whether carcinoma in situ or lepidic components are present at surgical margins.
  9. NODAL SITE(S) EXAMINED - Documents which specific lymph node stations according to the IASLC map were sampled and examined.
  10. NODAL SITE(S) WITH TUMOR - Identifies which specific lymph node stations contain metastatic tumor.
  11. NUMBER OF LYMPH NODES EXAMINED - Provides the total count of lymph nodes examined in the specimen.
  12. NUMBER OF LYMPH NODES WITH TUMOR - Reports the count of lymph nodes containing metastatic tumor.
  13. PROCEDURE - Specifies the type of surgical resection performed (wedge resection, lobectomy, pneumonectomy, etc.).
  14. REGIONAL LYMPH NODE STATUS - Provides overall assessment of whether regional lymph nodes contain tumor or are negative.
  15. SPECIMEN LATERALITY - Indicates whether the lung specimen is from the right or left side.
  16. SPREAD THROUGH AIR SPACES (STAS) - Documents the presence of tumor cells extending beyond the main tumor into surrounding air spaces as micropapillary clusters, solid nests, or single cells.
  17. TNM DESCRIPTORS - Assigns pathologic TNM staging categories (pT, pN, pM) based on tumor size, nodal involvement, and distant metastasis.
  18. TOTAL TUMOR SIZE (SIZE OF ENTIRE TUMOR) - Measures the greatest dimension of the entire tumor including both invasive and non-invasive components.
  19. TUMOR FOCALITY - Determines whether there is a single tumor focus, multiple separate nodules, or multifocal disease.
  20. TUMOR SITE - Identifies the specific anatomic location of the tumor within the lung (upper lobe, lower lobe, bronchus, etc.).
  21. VISCERAL PLEURA INVASION - Assesses whether tumor penetrates beyond the elastic layer of the visceral pleura or extends to the pleural surface.

Further investigation is required to evaluate and characterize the failure modes of each of the LLMs. Additionally, the baseline performance indicates that fine-tuning an open-source model like Llama 4 may yield better results.