Performs a sanity check on chromosome segments: for every chromosome and haplotype combination (maternal and paternal), it verifies that the total segment length matches the expected chromosome length and reports any discrepancies.

segments_sanity_check(segment_list, chr_lengths)

Arguments

segment_list

A list containing two data frames (maternal and paternal), each with columns:

  • chrom - Chromosome name

  • ref_start - Segment start position

  • ref_end - Segment end position

  • CN_change - Copy number change (-1 indicates segments to exclude)

chr_lengths

A list containing two named numeric vectors (maternal and paternal), where names are chromosome identifiers and values are chromosome lengths.
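Because both arguments are nested structures, it can help to confirm their shape before running the check. The helper below is a minimal sketch, not part of the package: the name check_inputs is hypothetical, and it assumes the two list elements are named "maternal" and "paternal", as in the example on this page.

```r
# Hypothetical helper (not part of the package): verifies that the inputs
# have the shape described in the Arguments section.
# Assumption: list elements are named "maternal" and "paternal".
check_inputs <- function(segment_list, chr_lengths) {
  required_cols <- c("chrom", "ref_start", "ref_end", "CN_change")
  haplotypes <- c("maternal", "paternal")

  # Both lists must carry an entry for each haplotype
  stopifnot(
    all(haplotypes %in% names(segment_list)),
    all(haplotypes %in% names(chr_lengths))
  )

  for (hap in haplotypes) {
    # Each segment data frame needs the four documented columns
    missing_cols <- setdiff(required_cols, names(segment_list[[hap]]))
    if (length(missing_cols) > 0) {
      stop(sprintf("%s segments lack columns: %s",
                   hap, paste(missing_cols, collapse = ", ")))
    }
    # Chromosome lengths must be a named numeric vector
    lens <- chr_lengths[[hap]]
    if (!is.numeric(lens) || is.null(names(lens))) {
      stop(sprintf("%s chromosome lengths must be a named numeric vector", hap))
    }
  }
  invisible(TRUE)
}
```

Calling check_inputs(segments, chr_lengths) before segments_sanity_check() would then stop early with a descriptive error if a column or a name is missing.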

Value

A list where:

  • Names are concatenated strings of haplotype and chromosome (e.g., "maternal chr1")

  • Values are error messages for failed checks

  • An empty list indicates that all checks passed

Details

The function performs the following steps for each chromosome in both haplotypes:

  1. Excludes segments with CN_change == -1

  2. Sums the lengths of remaining segments (ref_end - ref_start + 1)

  3. Compares total segment length with expected chromosome length

  4. Records any mismatches in the returned list
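The four steps above can be sketched in a few lines of base R. This is an illustrative re-implementation under stated assumptions, not the package's source: the function name sketch_sanity_check is hypothetical, the wording of the error message is invented, the result names are assumed to be joined with a single space, and chromosomes are taken from the names of chr_lengths.

```r
# Hypothetical sketch of the documented steps; the real function may differ.
sketch_sanity_check <- function(segment_list, chr_lengths) {
  failures <- list()
  for (hap in names(segment_list)) {
    segs <- segment_list[[hap]]
    # Assumption: the chromosomes to check are the names of chr_lengths[[hap]]
    for (chrom in names(chr_lengths[[hap]])) {
      # Step 1: exclude segments with CN_change == -1
      keep <- segs$chrom == chrom & segs$CN_change != -1
      # Step 2: sum inclusive segment lengths (ref_end - ref_start + 1)
      total <- sum(segs$ref_end[keep] - segs$ref_start[keep] + 1)
      # Step 3: compare with the expected chromosome length
      expected <- chr_lengths[[hap]][[chrom]]
      if (total != expected) {
        # Step 4: record the mismatch under "<haplotype> <chromosome>"
        failures[[paste(hap, chrom)]] <- sprintf(
          "total segment length %s does not match expected length %s",
          total, expected
        )
      }
    }
  }
  failures
}

# Paternal chr1 covers only 150 of the expected 200 positions
segs <- list(
  maternal = data.frame(chrom = c("chr1", "chr1"), ref_start = c(1, 101),
                        ref_end = c(100, 200), CN_change = c(0, 1)),
  paternal = data.frame(chrom = c("chr1", "chr1"), ref_start = c(1, 101),
                        ref_end = c(100, 150), CN_change = c(0, 0))
)
lens <- list(maternal = c(chr1 = 200), paternal = c(chr1 = 200))
result <- sketch_sanity_check(segs, lens)
names(result)  # "paternal chr1"
```

The + 1 in step 2 reflects that segment coordinates are inclusive at both ends, so a segment from 1 to 100 contributes 100 positions.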

See also

Related functions for segment manipulation and validation

Examples

if (FALSE) { # \dontrun{
# Create sample segment list
segments <- list(
  maternal = data.frame(
    chrom = c("chr1", "chr1"),
    ref_start = c(1, 101),
    ref_end = c(100, 200),
    CN_change = c(0, 1)
  ),
  paternal = data.frame(
    chrom = c("chr1", "chr1"),
    ref_start = c(1, 101),
    ref_end = c(100, 200),
    CN_change = c(0, 0)
  )
)

# Create sample chromosome lengths
chr_lengths <- list(
  maternal = c(chr1 = 200),
  paternal = c(chr1 = 200)
)

# Run sanity checks
results <- segments_sanity_check(segments, chr_lengths)
if (length(results) == 0) {
  print("All checks passed")
} else {
  print("Some checks failed:")
  print(results)
}
} # }