This function verifies that the alternate (ALT) alleles in the SNP list match the bases at the corresponding positions in the synthetic genome, confirming that the SNPs were inserted correctly into both the maternal and paternal haplotypes. It checks each segment (e.g., chromosome) in turn and issues a warning when mismatches are found.

check_alt_snp_match(seg_names, snp_list, sim_genome)

Arguments

seg_names

Character vector of segment/chromosome names to check

snp_list

A nested list: the first level is keyed by segment name, and the second level holds one data frame per genotype ('1|1', '1|0', '0|1'), each with POS and ALT columns

sim_genome

A nested list containing maternal and paternal sequences for each segment, structured as sim_genome$maternal[seg_name] and sim_genome$paternal[seg_name]

Value

No return value, called for side effects:

  • Prints confirmation message for each matching segment and haplotype

  • Issues warning if mismatches are found

Details

For each segment and haplotype, the function:

  1. Combines the homozygous ('1|1') SNPs with the heterozygous SNPs that apply to the haplotype ('1|0' for maternal, '0|1' for paternal)

  2. Removes SNPs with duplicated positions

  3. Extracts the bases at the SNP positions from the simulated genome

  4. Compares the extracted bases with the expected alternate (ALT) alleles

  5. Reports matches/mismatches via messages and warnings

Maternal haplotype checks include '1|1' and '1|0' variants, while paternal haplotype checks include '1|1' and '0|1' variants.
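
The steps above can be sketched for a single segment and haplotype in base R. This is only an illustration, not the package's implementation: check_haplotype_sketch() is a hypothetical helper, plain character strings stand in for the DNAStringSet objects, and it assumes that every SNP sharing a duplicated position is dropped (the documentation does not say which copies are kept).

```r
# Hypothetical helper illustrating the documented steps; not the package code.
check_haplotype_sketch <- function(seq, snps_by_genotype, haplotype) {
  # Step 1: maternal uses '1|1' + '1|0'; paternal uses '1|1' + '0|1'
  het <- if (haplotype == "maternal") "1|0" else "0|1"
  snps <- rbind(snps_by_genotype[["1|1"]], snps_by_genotype[[het]])

  # Step 2: drop SNPs at duplicated positions
  # (assumption: all copies of a duplicated position are removed)
  dup_pos <- snps$POS[duplicated(snps$POS)]
  snps <- snps[!snps$POS %in% dup_pos, , drop = FALSE]

  # Step 3: extract the base found at each SNP position
  observed <- substring(seq, snps$POS, snps$POS)

  # Step 4: compare the observed bases with the expected ALT alleles
  mismatched <- snps$POS[observed != snps$ALT]

  # Step 5: report via message (all match) or warning (any mismatch)
  if (length(mismatched) > 0) {
    warning("ALT mismatch at position(s): ",
            paste(mismatched, collapse = ", "))
  } else {
    message("All ALT alleles match for the ", haplotype, " haplotype")
  }
  invisible(mismatched)
}

# Toy input: one segment, three SNPs
snps <- list(
  "1|1" = data.frame(POS = 1, ALT = "G", stringsAsFactors = FALSE),
  "1|0" = data.frame(POS = 5, ALT = "T", stringsAsFactors = FALSE),
  "0|1" = data.frame(POS = 8, ALT = "C", stringsAsFactors = FALSE)
)

# Each toy sequence carries the ALT alleles expected for its haplotype
maternal_mismatches <- check_haplotype_sketch("GCTGTCTGACTG", snps, "maternal")
paternal_mismatches <- check_haplotype_sketch("GCTGACTCACTG", snps, "paternal")
```

Returning the mismatched positions invisibly is a choice made for this sketch so the result can be inspected; the documented function itself has no return value.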

Examples

if (FALSE) { # \dontrun{
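# DNAStringSet() comes from the Biostrings package
library(Biostrings)

# Example simulated genome; each haplotype carries the ALT alleles listed in
# snp_list below (maternal: positions 1 and 5; paternal: positions 1 and 8)
sim_genome <- list(
  maternal = list(chr1 = DNAStringSet("GCTGTCTGACTG")),
  paternal = list(chr1 = DNAStringSet("GCTGACTCACTG"))
)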

# Example SNP list
snp_list <- list(
  chr1 = list(
    "1|1" = data.frame(POS = 1, ALT = "G"),
    "1|0" = data.frame(POS = 5, ALT = "T"),
    "0|1" = data.frame(POS = 8, ALT = "C")
  )
)

check_alt_snp_match("chr1", snp_list, sim_genome)
} # }