R/snp_relevant_functions.R
vcf_to_snp_list.Rd
Takes a phased VCF table and converts it into a list of filtered SNP data frames, separated by genotype category (homozygous alternative and heterozygous variants).
vcf_to_snp_list(phased_vcf_table, sample_name)
A list containing four data frames:
all: All SNPs (single-nucleotide variants only)
1|1: Homozygous alternative variants
1|0: Heterozygous variants with alternative allele on first haplotype
0|1: Heterozygous variants with alternative allele on second haplotype
The function filters for single-nucleotide variants only (where REF and ALT are single characters) and separates the genotype field (GT) from the dosage field (DS) in the sample column.
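The filtering and field splitting described above can be sketched in base R. This is a minimal illustration, not the package's source: the function name sketch_vcf_to_snp_list is hypothetical, and it assumes the sample column holds strings of the form "GT:DS", that the split fields are stored in new GT and DS columns, and that the "all" element keeps every single-nucleotide row regardless of genotype.

```r
# Hypothetical re-implementation for illustration only; the real
# vcf_to_snp_list() may differ in column names and edge-case handling.
sketch_vcf_to_snp_list <- function(phased_vcf_table, sample_name) {
  # Keep single-nucleotide variants: REF and ALT are one character each.
  is_snv <- nchar(as.character(phased_vcf_table$REF)) == 1 &
    nchar(as.character(phased_vcf_table$ALT)) == 1
  snps <- phased_vcf_table[is_snv, , drop = FALSE]

  # Split the sample column (assumed "GT:DS") into genotype and dosage.
  fields <- strsplit(as.character(snps[[sample_name]]), ":", fixed = TRUE)
  snps$GT <- vapply(fields, function(x) x[1], character(1))
  snps$DS <- as.numeric(vapply(fields, function(x) x[2], character(1)))

  # One data frame per genotype category, plus the unfiltered SNP set.
  list(
    all   = snps,
    "1|1" = snps[snps$GT == "1|1", , drop = FALSE],
    "1|0" = snps[snps$GT == "1|0", , drop = FALSE],
    "0|1" = snps[snps$GT == "0|1", , drop = FALSE]
  )
}

# Toy input: two SNVs and one deletion (REF "AT"), which is filtered out.
toy_vcf <- data.frame(
  CHROM = c("chr1", "chr1", "chr1"),
  POS = c(1000, 2000, 3000),
  REF = c("A", "C", "AT"),
  ALT = c("G", "T", "A"),
  sample1 = c("1|0:1", "1|1:2", "0|1:1"),
  stringsAsFactors = FALSE
)
res <- sketch_vcf_to_snp_list(toy_vcf, "sample1")
names(res)
res[["1|1"]]
```

Because the genotype names contain a pipe character, the list elements are easiest to reach with double brackets or backticks, for example res[["1|0"]] or res$`1|0`.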
vcf_df <- data.frame(
  CHROM = c("chr1", "chr1"),
  POS = c(1000, 2000),
  REF = c("A", "C"),
  ALT = c("G", "T"),
  sample1 = c("1|0:0.5", "0|1:0.5")
)
snp_lists <- vcf_to_snp_list(vcf_df, "sample1")