This function processes chromosome arm and sub-segment data to generate boundary information for specified chromosomes. It combines segments from both input tables, identifies unique boundaries, and creates windows for each chromosome. Additionally, it constructs a genome-wide vector of window identifiers.
generate_chr_boundary_data(chr_arm_table, sub_seg_list, chr_names)
chr_arm_table: A data frame containing chromosome arm information:
  chrom: Chromosome name
  start: Start position
  end: End position
  region_name: Identifier for the chromosome arm
sub_seg_list: A list of data frames (one per haplotype) containing sub-segment information:
  chrom: Chromosome name
  ref_start: Reference start position
  ref_end: Reference end position
  region_name: Segment identifier
chr_names: A character vector of chromosome names to process
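As a rough illustration of the argument shapes described above, the following sketch builds inputs with the documented column names. The haplotype names (`hap1`, `hap2`), coordinates, and region names are made up for illustration and are not taken from the package.

```r
# Hypothetical inputs: all coordinates and names below are illustrative only.
chr_arm_table <- data.frame(
  chrom = c("chr1", "chr1"),
  start = c(1, 1000000),
  end = c(1000000, 2000000),
  region_name = c("chr1p", "chr1q"),
  stringsAsFactors = FALSE
)

# One data frame per haplotype, each with the documented sub-segment columns.
sub_seg_list <- list(
  hap1 = data.frame(
    chrom = "chr1",
    ref_start = 250000,
    ref_end = 750000,
    region_name = "hap1_seg1",
    stringsAsFactors = FALSE
  ),
  hap2 = data.frame(
    chrom = "chr1",
    ref_start = 500000,
    ref_end = 1500000,
    region_name = "hap2_seg1",
    stringsAsFactors = FALSE
  )
)

# Chromosomes to process.
chr_names <- c("chr1")
```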
A list containing three elements:
segment_info: A list of data frames, one for each chromosome, containing combined segments from chr_arm_table and sub_seg_list.
windows_info: A list of data frames, one for each chromosome, containing windows derived from the unique boundaries of the combined segments.
genome_window_vector: Vector of formatted window strings ("chr_start_end")
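The "chr_start_end" format can be illustrated with a minimal sketch that flattens per-chromosome windows into one genome-wide vector. The column names `chrom`, `start`, and `end` and the helper `make_ids` are assumptions made for this example; the function's actual internals may differ.

```r
# Hypothetical per-chromosome windows, mimicking the documented windows_info element.
windows_info <- list(
  chr1 = data.frame(chrom = "chr1", start = c(1, 1000000), end = c(1000000, 2000000)),
  chr2 = data.frame(chrom = "chr2", start = 1, end = 1500000)
)

# Hypothetical helper: build "chr_start_end" strings for one chromosome.
# scientific = FALSE keeps large positions from being printed as e.g. 1e+06.
make_ids <- function(w) {
  paste(
    w$chrom,
    format(w$start, scientific = FALSE, trim = TRUE),
    format(w$end, scientific = FALSE, trim = TRUE),
    sep = "_"
  )
}

# Flatten the per-chromosome identifiers into one vector, in list order.
genome_window_vector <- unlist(lapply(windows_info, make_ids), use.names = FALSE)
```

Formatting the positions explicitly matters because `paste()` on a double such as 1000000 would otherwise yield "1e+06".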
The function performs these steps for each chromosome:
Extracts and combines segments from chromosome arms and sub-segments
Generates windows based on unique boundary positions
Creates formatted window identifiers
Organizes results by chromosome
Windows are generated between each pair of adjacent unique boundary points found in the combined segment data, so every interval delimited by an arm or sub-segment boundary is captured.
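The boundary-to-window step might look roughly like the sketch below for a single chromosome. The input values are invented, and treating each boundary as both the end of one window and the start of the next is an assumption; the package may handle shared endpoints differently.

```r
# Hypothetical combined segments for one chromosome (arms plus sub-segments).
combined <- data.frame(
  chrom = "chr1",
  start = c(1, 1000000, 250000, 500000),
  end = c(1000000, 2000000, 750000, 1500000)
)

# Collect every distinct start or end position and sort them.
boundaries <- sort(unique(c(combined$start, combined$end)))

# Pair each boundary with the next one to form consecutive windows.
windows <- data.frame(
  chrom = "chr1",
  start = head(boundaries, -1),
  end = tail(boundaries, -1)
)
```

Here seven unique boundaries yield six windows, the first running from 1 to 250000.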
if (FALSE) { # \dontrun{
  chr_arms <- data.frame(
    chrom = c("chr1", "chr1"),
    start = c(1, 1000000),
    end = c(1000000, 2000000),
    region_name = c("chr1p", "chr1q")
  )
  sub_segs <- list(
    maternal = data.frame(...),
    paternal = data.frame(...)
  )
  chr_names <- c("chr1")
  boundaries <- generate_chr_boundary_data(chr_arms, sub_segs, chr_names)
} # }