Expands a copy number (CN) data frame (seg_dat) so that every genomic window is represented by equal-sized bins of a specified size (bin_unit). Each column of the input is replicated once per bin that its window spans, given the window size and the desired bin size, producing a new data frame with more columns.

extend_blueprint_cn_for_equal_bin(seg_dat, window_sizes, bin_unit)

Arguments

seg_dat

A data frame of copy number segments where:

  • Rows represent clones/samples

  • Columns represent genomic windows

  • Values represent copy numbers

window_sizes

A numeric vector giving the size of each window in base pairs. Its length must match the number of columns in seg_dat.

bin_unit

The desired size of each bin in base pairs

Value

A data frame with:

  • Same number of rows as input

  • Expanded columns based on window sizes and bin unit

  • Column names as "original_window_name_binIndex"

  • Original row names preserved

Details

The function:

  1. Calculates how many bins each window needs based on its size

  2. Replicates copy number values to fill the required bins

  3. Keeps the replicated bins in the original window order, so the output is a contiguous series of equal-sized segments
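
The steps above can be sketched in a few lines of base R. This is an illustrative re-implementation, not the package source: the name sketch_extend_bins is hypothetical, and the use of ceiling() for the bin count and of a "_" separator before the bin index are assumptions drawn from the Note and Value sections.

```r
# Hypothetical sketch of the expansion logic (not the package's own code).
sketch_extend_bins <- function(seg_dat, window_sizes, bin_unit) {
  stopifnot(length(window_sizes) == ncol(seg_dat), bin_unit > 0)

  # Step 1: number of bins per window, rounded up (assumed: ceiling()).
  n_bins <- ceiling(window_sizes / bin_unit)

  # Step 2: repeat each column index once per bin, then select those columns.
  col_idx <- rep(seq_len(ncol(seg_dat)), times = n_bins)
  expanded <- seg_dat[, col_idx, drop = FALSE]

  # Step 3: label each bin as "<window name>_<bin index>" (assumed format),
  # keeping bins in the original window order.
  bin_idx <- sequence(n_bins)
  colnames(expanded) <- paste(colnames(seg_dat)[col_idx], bin_idx, sep = "_")

  expanded
}
```

Row names carry over because the sketch only selects columns; the rows themselves are never reordered or rebuilt.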

Note

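Window sizes that are not exact multiples of bin_unit are rounded up to the next whole number of bins, so every window is fully covered. For example, a 2500 bp window with a bin_unit of 1000 is assigned 3 bins.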
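
The rounding rule can be checked with a short calculation. The snippet assumes that rounding up is equivalent to ceiling(window_sizes / bin_unit); the window sizes are made-up values chosen to show an exact multiple, a partial bin, and a window smaller than one bin.

```r
# Illustrative values only; assumes the bin count is ceiling(size / bin_unit).
window_sizes <- c(2000, 2500, 999)
bin_unit <- 1000

n_bins <- ceiling(window_sizes / bin_unit)
n_bins      # 2 3 1

# Base pairs by which the bins overshoot each window.
overshoot <- n_bins * bin_unit - window_sizes
overshoot   # 0 500 1
```

Under this rule the last bin of a window may extend past the window's true end, so the expanded columns describe nominal rather than exact genomic spans.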

Examples

if (FALSE) { # \dontrun{
seg_data <- data.frame(
  chr1_1000_3000 = c(2, 1),
  chr1_3000_4000 = c(1, 1),
  row.names = c("clone1", "clone2")
)
window_sizes <- c(2000, 1000)
bin_unit <- 1000

expanded <- extend_blueprint_cn_for_equal_bin(seg_data, window_sizes, bin_unit)
# Returns a data frame with 2 rows and 3 columns of 1000 bp bins:
# two bins for the first window and one for the second
} # }