Department of Defense Biotechnology High Performance Computing Software Applications Institute, Fort Detrick, United States
Recent advances in the next-generation sequencing of B-cell receptors (BCRs) enable the characterization of humoral responses at a repertoire-wide scale and provide the capability for identifying unique features of immune repertoires in response to disease, vaccination, or infection. Immunosequencing now readily generates 10³–10⁵ sequences per sample; however, statistical analysis of these repertoires is challenging because of the high genetic diversity of BCRs and the elaborate clonal relationships among them. To date, most immunosequencing analyses have focused on reporting qualitative trends in immunoglobulin (Ig) properties, such as usage or somatic hypermutation (SHM) percentage of the Ig heavy chain variable (IGHV) gene segment family, and on reducing complex Ig property distributions to simple summary statistics. However, because Ig properties are typically not normally distributed, any approach that fails to assess the distribution as a whole may be inadequate in (1) properly assessing the statistical significance of repertoire differences, (2) identifying how two repertoires differ, and (3) determining appropriate confidence intervals for assessing the size of the differences and their potential biological relevance.
Frontiers in Immunology, 8, 910. Open access.