doi:http://dx.doi.org/10.5705/ss.2010.257
Abstract: Change-point models have been widely applied for segmentation of spatial or time-series data. Some recent applications in genomics motivate multi-sequence change-point models for shared changes across multiple aligned sequences. These applications frequently involve data where the number of change-points can be large. In a previous paper we derived a Bayes Information Criterion (BIC) for determining the number of changes in the mean of a sequence of independent normal observations when the number of change-points is assumed to remain bounded as the number of observations increases. Here we extend that result to the case where the number of change-points can increase with the sample size, and to simultaneous change-points in multiple sequences. Stochastic terms that enter into the new criteria involve integrals and maxima of two-sided random walks with negative drift. The new criteria are applied to the analysis of DNA copy number data.
Key words and phrases: Change-point detection, DNA copy number, segmentation, model selection.