OMOP Whole Slide Imaging

Summary:

The OMOP _whole_slide_imaging table contains records of digitized pathology slides, including structured metadata, access paths to whole slide image files, and links to structured EHR data. This table serves as a proof of concept demonstrating how whole slide imaging files can be integrated and analyzed within the OMOP framework alongside clinical data.

Each record represents a single whole slide image with associated metadata including:

  • Patient identifier (person_id)
  • Accession number linking to clinical specimens
  • URIs to both JSON metadata and TIFF image files
  • Specimen source and type classifications
  • Procedure timing information
  • Links to clinical notes
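
As a rough illustration of the record shape described above, the fields could be modeled as follows. Only person_id is named in this document; every other field name, type, and example value here is a hypothetical placeholder, not the actual column definition.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class WholeSlideImageRecord:
    """One whole slide image record. Field names other than person_id are assumed."""
    person_id: int                          # patient identifier (named in this document)
    accession_number: str                   # hypothetical name: links to the clinical specimen
    json_uri: Optional[str]                 # hypothetical name: URI of the JSON metadata file
    tiff_uri: Optional[str]                 # hypothetical name: URI of the TIFF image file
    specimen_source: Optional[str]          # hypothetical name: specimen source classification
    specimen_type: Optional[str]            # hypothetical name: specimen type classification
    procedure_datetime: Optional[datetime]  # hypothetical name: procedure timing
    note_id: Optional[int]                  # hypothetical name: link to a clinical note

    def has_both_files(self) -> bool:
        """True when both the JSON metadata URI and the TIFF image URI are present."""
        return bool(self.json_uri) and bool(self.tiff_uri)


# Made-up example values, for illustration only.
example = WholeSlideImageRecord(
    person_id=1,
    accession_number="ACC-0001",
    json_uri="file:///example/slide_0001.json",
    tiff_uri="file:///example/slide_0001.tiff",
    specimen_source="example source",
    specimen_type="example type",
    procedure_datetime=datetime(2025, 1, 15, 9, 30),
    note_id=None,
)
print(example.has_both_files())
```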

Currently, the table contains manually scanned slides that have been deidentified and converted into TIFF and JSON file formats. To ensure data quality and patient eligibility, only slides whose order_proc_id can be linked to Epic Clarity are included. The February 2026 release includes approximately 50 whole slide images in the OMOP table. An additional 35,000 images are available in a standalone ad-hoc table and are not yet integrated into OMOP. Future releases are expected to include images from additional sources and further integration of the ad-hoc dataset.
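
The inclusion rule can be pictured as a simple membership check. This is a minimal sketch, not the actual pipeline: the function name, the dictionary keys, and the idea of passing in a list of known Epic Clarity order_proc_id values are all assumptions made for illustration.

```python
def filter_linkable_slides(slides, clarity_order_proc_ids):
    """Keep only slides whose order_proc_id appears among the known Clarity IDs."""
    known = set(clarity_order_proc_ids)
    return [s for s in slides if s.get("order_proc_id") in known]


# Made-up slides: one linkable, one with no ID, one with an unknown ID.
slides = [
    {"slide": "a", "order_proc_id": 101},
    {"slide": "b", "order_proc_id": None},
    {"slide": "c", "order_proc_id": 999},
]
linkable = filter_linkable_slides(slides, clarity_order_proc_ids=[101, 202])
print([s["slide"] for s in linkable])  # prints ['a']
```

Slides with a missing or unmatched identifier are dropped rather than repaired, which mirrors the stated goal of including only records that can be tied back to the clinical data.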

Please note that the whole slide imaging dataset is at an early stage of development and will continue to evolve as we work towards integrating more comprehensive imaging datasets in future releases.

📊 Dataset Overview

  • Total Images: 50
  • Unique Patients: 5
  • Unique Accessions: 5

📁 File Availability

  • Images with JSON: 50 (100%)
  • Images with TIFF: 50 (100%)
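
The counts and percentages above could be derived from the records along these lines. This is a sketch under assumptions: the dictionary keys other than person_id are hypothetical, and the records are synthetic placeholders built only to mirror the reported totals.

```python
def summarize(records):
    """Compute overview counts and file-availability percentages for a list of records."""
    total = len(records)

    def pct(n):
        # Percentage of all images; 0.0 when there are no records.
        return 100.0 * n / total if total else 0.0

    with_json = sum(1 for r in records if r.get("json_uri"))
    with_tiff = sum(1 for r in records if r.get("tiff_uri"))
    return {
        "total_images": total,
        "unique_patients": len({r["person_id"] for r in records}),
        "unique_accessions": len({r["accession_number"] for r in records}),
        "with_json": with_json,
        "pct_json": pct(with_json),
        "with_tiff": with_tiff,
        "pct_tiff": pct(with_tiff),
    }


# Synthetic data: 50 images spread over 5 patients and 5 accessions, all files present.
records = [
    {
        "person_id": i % 5,
        "accession_number": f"ACC-{i % 5}",
        "json_uri": f"slide_{i}.json",
        "tiff_uri": f"slide_{i}.tiff",
    }
    for i in range(50)
]
summary = summarize(records)
print(summary)
```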