_variant_occurrence

Clinical Data Table - Genomic Variants

Table Columns

field	type	required	phi	phi scrubbing operation
variant_occurrence_id	INT64	Yes	Yes	Sub Not Stable between Data Refreshes
person_id	INT64	Yes	Yes	Sub with Stable between Data Refreshes
visit_occurrence_id	INT64	No	Yes	Sub with Stable between Data Refreshes
procedure_occurrence_id	INT64	No	Yes	Sub Not Stable between Data Refreshes
provider_id	INT64	No	Yes	Sub with Stable between Data Refreshes
order_datetime	DATETIME	No	Yes	Jitter
test_name	STRING	No		Whitelist
variant_name	STRING	No
variant_type	STRING	No		Whitelist
assessment	STRING	No		Whitelist
genome_assembly	STRING	No		Whitelist
chromosome	STRING	No
transcript_ref_seq	STRING	No
dna_change	STRING	No
dna_var_type	STRING	No		Whitelist
amino_acid_change	STRING	No
variant_molecular_consequence	STRING	No		Whitelist
genomic_dna_change	STRING	No
allelic_frequency	NUMERIC(18, 5)	No
copy_number_lower	NUMERIC(9, 2)	No
copy_number_upper	NUMERIC(9, 2)	No
gene_name	STRING	No
phenotype_spec_var_class	STRING	No		Whitelist
interpretation	STRING	No	Yes	TiDE
accession_number	STRING	No	Yes	Hash
stamp_pipeline_version	STRING	No		Whitelist
specimen_type	STRING	No		Whitelist
specimen_source	STRING	No		Whitelist

Columns Description

variant_occurrence_id (PK)

The primary key of the table: a unique identifier for every variant occurrence. Every variant occurrence with a different identifier is assumed to be a distinct event and should be treated independently.

person_id (FK)

A foreign key identifier to the person_id in the person table for whom the variant occurrence is recorded.

visit_occurrence_id (FK)

A foreign key identifier to the visit_occurrence_id in the visit_occurrence table for the visit associated with the ordered test.

procedure_occurrence_id (FK)

A foreign key identifier to the procedure_occurrence_id in the OMOP procedure_occurrence table for the procedure order associated with the variant test.

provider_id (FK)

A foreign key identifier to the provider_id in the provider table for the provider who authorized the order for the variant test.

order_datetime

The datetime when the test was ordered.

test_name

The name of the test associated with the variant record.

variant_name

The name of the genetic variant identified in the test.

variant_type

The variant type, such as ‘Simple’, ‘Pharmacogenomic genotype’, ‘Negative’, etc.

assessment

The assessment of the variant, such as ‘Detected’, ‘Not Detected’, ‘Negative’, etc.

genome_assembly

The genome assembly used for the variant, such as ‘GRCh37’, ‘GRCh38’, ‘hg38’, etc.

chromosome

The chromosome on which the variant is located, such as ‘1’, ‘2’, ‘X’, ‘Y’, etc.

transcript_ref_seq

The external identifier defining the Transcript Reference Sequence.

dna_change

The change at the DNA level relative to the Transcript Reference Sequence.

dna_var_type

The descriptive name for the DNA sequence variation type, such as ‘Substitution’, ‘Copy number gain’, ‘Deletion’, etc.

amino_acid_change

The change at the amino acid (protein) level caused by the DNA change.

variant_molecular_consequence

The descriptive name for the molecular consequence of the variant, such as ‘Missense Variant’, ‘Nonsense’, ‘Frameshift Variant’, etc.

genomic_dna_change

The change at the DNA level relative to the Genomic Reference Sequence.

allelic_frequency

The percentage of all reads at this genomic location that carry the given allele. For homozygotes it is close to 100%; for heterozygotes it is close to 50%. It can be lower when there are mosaics or multiple chromosomes, or mixtures of tumor cells and normal cells. The value is stored as a decimal between 0 and 1, calculated by dividing the percentage by 100 (for example, 50% is stored as 0.5).
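
The percentage-to-decimal conversion described above can be sketched as follows. This is a minimal illustration, not the pipeline's actual code: the function name percent_to_allelic_frequency is hypothetical, and rounding half up to five decimal places is an assumption chosen to match the NUMERIC(18, 5) column type.

```python
from decimal import Decimal, ROUND_HALF_UP

# Hypothetical helper; the scale of 5 mirrors the NUMERIC(18, 5) column type.
SCALE = Decimal("0.00001")


def percent_to_allelic_frequency(percent):
    """Convert a read percentage (0-100) to the stored 0-1 decimal."""
    value = Decimal(str(percent))
    if not Decimal("0") <= value <= Decimal("100"):
        raise ValueError(f"percentage out of range: {percent}")
    # Divide by 100, then round to five decimal places (rounding mode assumed).
    return (value / Decimal("100")).quantize(SCALE, rounding=ROUND_HALF_UP)
```

Under these assumptions, a heterozygous call reported at 50% would be stored as 0.50000, and a value of 48.7% as 0.48700.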

copy_number_lower

The lower bound of the copy number range for the variant.

copy_number_upper

The upper bound of the copy number range for the variant.
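
Because copy_number_lower and copy_number_upper together describe a range, a consumer may want to sanity-check the pair before use. The sketch below assumes that the lower bound should not exceed the upper bound, which follows from the descriptions but is not stated as an enforced constraint; the function name is hypothetical.

```python
from decimal import Decimal
from typing import Optional


def copy_number_range_is_valid(lower: Optional[Decimal],
                               upper: Optional[Decimal]) -> bool:
    """Return True when the copy number bounds form a consistent range."""
    # Both columns are optional, so a missing bound cannot violate the range.
    if lower is None or upper is None:
        return True
    # Assumed rule: the lower bound must not exceed the upper bound.
    return lower <= upper
```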

gene_name

The name of the gene associated with the variant record, such as ‘POLE’, ‘TP53’, ‘CYP2D6’, etc.

phenotype_spec_var_class

The descriptive name for the phenotype variant class, such as ‘Pathogenic’, ‘Likely Pathogenic’, ‘Uncertain Significance’, etc.

interpretation

The full text interpretation associated with the variant record, aggregated from individual interpretation lines.

accession_number

The specimen accession numbers associated with the ordered test, stored as a comma-separated string if multiple accession numbers are present.
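
A consumer can split the comma-separated string back into individual values. The sketch below is illustrative only: the function name is hypothetical, and it assumes the comma delimiter is still present in the stored value. Since this column is marked with the Hash scrubbing operation, that may not hold if the whole string is hashed as a single unit.

```python
def split_accession_numbers(value):
    """Split a comma-separated accession_number string into a list."""
    # The column is optional, so treat NULL or empty input as no values.
    if not value:
        return []
    # Trim whitespace around each entry and drop empty fragments.
    return [part.strip() for part in value.split(",") if part.strip()]
```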

stamp_pipeline_version

The version of the STAMP pipeline used for the test associated with the variant record.

specimen_type

The specimen type associated with the order for the variant record, such as ‘Blood’, ‘Tissue/Bone - Biopsy’, ‘Existing Patient Material’, etc.

specimen_source

The specimen source associated with the order for the variant record, such as ‘Blood, from Venipuncture’, ‘Saliva’, ‘Liver’, etc.