Paper title

HLA predictions from long sequence read alignments, streamed directly into HLAminer

Author

Warren, René L.

Abstract

The rapidly changing landscape of sequencing technologies brings new opportunities to genomics research. Longer sequence reads and higher sequence throughput, coupled with ever-improving base accuracy and decreasing per-base cost, are now making long reads suitable for analyzing polymorphic regions of the human genome, such as those of the human leucocyte antigen (HLA) gene complex. Here I present a simple protocol for predicting HLA signatures from whole genome shotgun (WGS) long sequencing reads by streaming sequence alignments directly into HLAminer. The method is as simple as running minimap2. It scales with the number of sequences to align and can be used with any read aligner capable of SAM format output, without the need to store bulky alignment files on disk. I show that the predictions are robust even with older and less [base] accurate WGS nanopore datasets and relatively low (10X) sequence coverage, and I present a step-by-step protocol to predict HLA class I and II genes from the long sequencing reads of modern third-generation technologies.
