This function processes chromosome arm and sub-segment data to generate boundary information for the specified chromosomes. For each chromosome, it combines the segments from both inputs, identifies the unique boundary positions, and creates the windows between them. It also constructs a genome-wide vector of window identifiers.

generate_chr_boundary_data(chr_arm_table, sub_seg_list, chr_names)

Arguments

chr_arm_table

A data frame containing chromosome arm information:

  • chrom: Chromosome name

  • start: Start position

  • end: End position

  • region_name: Identifier for the chromosome arm

sub_seg_list

A list of data frames, one per haplotype, each containing sub-segment information:

  • chrom: Chromosome name

  • ref_start: Reference start position

  • ref_end: Reference end position

  • region_name: Segment identifier
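
As an illustration of the expected shape, the sketch below builds a small sub_seg_list and checks that every data frame carries the four columns listed above. The haplotype names, coordinates, and segment identifiers are hypothetical values chosen for this example, and the column check is not part of the function itself.

```r
# Hypothetical sub_seg_list with one data frame per haplotype.
# All names and coordinates are illustrative only.
sub_seg_list <- list(
  maternal = data.frame(
    chrom = c("chr1", "chr1"),
    ref_start = c(1, 250000),
    ref_end = c(250000, 750000),
    region_name = c("seg_m1", "seg_m2")
  ),
  paternal = data.frame(
    chrom = "chr1",
    ref_start = 500000,
    ref_end = 1500000,
    region_name = "seg_p1"
  )
)

# Check that each data frame has the documented columns
required <- c("chrom", "ref_start", "ref_end", "region_name")
has_cols <- vapply(
  sub_seg_list,
  function(df) all(required %in% names(df)),
  logical(1)
)
```

Here has_cols is a named logical vector with one entry per haplotype.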

chr_names

A character vector of chromosome names to process.

Value

A list containing three elements:

  • segment_info: A list of data frames, one for each chromosome, containing combined segments from chr_arm_table and sub_seg_list.

  • windows_info: A list of data frames, one for each chromosome, containing windows derived from the unique boundaries of the combined segments.

  • genome_window_vector: A character vector of window identifiers for all processed chromosomes, each formatted as "chr_start_end".
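
Downstream code may need to split the "chr_start_end" identifiers back into their components. The following is a minimal sketch of one way to do that, not something the function provides; the identifiers are hypothetical. It anchors on the last two numeric fields so that chromosome names containing underscores would still parse.

```r
# Hypothetical window identifiers in the "chr_start_end" format
window_ids <- c("chr1_1_250000", "chr1_250000_1000000")

# Capture the chromosome name, start, and end; the greedy first group
# leaves only the last two numeric fields for start and end
parts <- regmatches(
  window_ids,
  regexec("^(.*)_([0-9]+)_([0-9]+)$", window_ids)
)

# Element 1 of each match is the full string; 2 to 4 are the groups
parsed <- data.frame(
  chrom = vapply(parts, `[`, character(1), 2),
  start = as.numeric(vapply(parts, `[`, character(1), 3)),
  end = as.numeric(vapply(parts, `[`, character(1), 4))
)
```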

Details

The function performs these steps for each chromosome:

  1. Extracts and combines segments from chromosome arms and sub-segments

  2. Generates windows based on unique boundary positions

  3. Creates formatted window identifiers

  4. Organizes results by chromosome

Windows are generated between each unique boundary point found in the combined segment data, ensuring all relevant genomic intervals are captured.
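
The steps above can be sketched for a single chromosome as follows. This is a minimal illustration under stated assumptions, not the function's actual implementation: the input values are hypothetical, and it assumes that consecutive windows share their boundary position and that sub-segment columns ref_start and ref_end map onto start and end.

```r
# Hypothetical inputs for one chromosome (illustrative values only)
chr_arms <- data.frame(
  chrom = c("chr1", "chr1"),
  start = c(1, 1000000),
  end = c(1000000, 2000000),
  region_name = c("chr1p", "chr1q")
)
sub_segs <- data.frame(
  chrom = "chr1",
  ref_start = 250000,
  ref_end = 750000,
  region_name = "seg1"
)

# Step 1: harmonize column names, then stack both sources
sub_segs_std <- data.frame(
  chrom = sub_segs$chrom,
  start = sub_segs$ref_start,
  end = sub_segs$ref_end,
  region_name = sub_segs$region_name
)
segments <- rbind(chr_arms, sub_segs_std)

# Step 2: each pair of consecutive unique boundaries defines a window
boundaries <- sort(unique(c(segments$start, segments$end)))
windows <- data.frame(
  chrom = "chr1",
  start = head(boundaries, -1),
  end = tail(boundaries, -1)
)

# Step 3: format identifiers as "chr_start_end"; format() avoids
# scientific notation such as "1e+06" for large numeric positions
fmt <- function(x) format(x, scientific = FALSE, trim = TRUE)
window_ids <- paste(
  windows$chrom, fmt(windows$start), fmt(windows$end),
  sep = "_"
)
```

With these inputs, the five unique boundary positions yield four windows.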

Examples

if (FALSE) { # \dontrun{
chr_arms <- data.frame(
  chrom = c("chr1", "chr1"),
  start = c(1, 1000000),
  end = c(1000000, 2000000),
  region_name = c("chr1p", "chr1q")
)
# Each data frame needs the columns chrom, ref_start, ref_end, region_name
sub_segs <- list(
  maternal = data.frame(...),
  paternal = data.frame(...)
)
chr_names <- c("chr1")
boundaries <- generate_chr_boundary_data(chr_arms, sub_segs, chr_names)
} # }