Processing NGS data with HTSeq

HTSeq (https://htseq.readthedocs.io) is an alternative library for processing NGS data. Most of its functionality is also available in other libraries covered in this book, but it is worth knowing as another way to work with NGS data. HTSeq supports, among others, the FASTA, FASTQ, SAM (via pysam), VCF, GFF, and Browser Extensible Data (BED) file formats. It also includes a set of abstractions for processing (mapped) genomic data, covering concepts such as genomic positions, intervals, and alignments. A complete examination of the library is beyond our scope, so we will concentrate on a small subset of its features. We will also take this opportunity to introduce the BED file format.

The BED format is used to specify features for annotation tracks. It has many uses, but BED files are commonly loaded into genome browsers to visualize features. Each line includes at least the position (chromosome, start, and end) and may include optional fields such as name or strand. Full details about the format can be found at https://genome.ucsc.edu/FAQ/FAQformat.html#format1.
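
To make the line layout concrete, here is a minimal sketch of a BED line parser written in plain Python. It is not HTSeq's API (HTSeq ships its own reader, `HTSeq.BED_Reader`); the names `BedRecord` and `parse_bed` and the sample lines are hypothetical and exist only for illustration. It reads the three mandatory fields plus the optional name, score, and strand fields, and ignores any further columns.

```python
from typing import Iterable, Iterator, NamedTuple, Optional


class BedRecord(NamedTuple):
    """One BED feature (hypothetical helper type, not part of HTSeq)."""
    chrom: str
    start: int                    # 0-based, inclusive
    end: int                      # exclusive
    name: Optional[str] = None
    score: Optional[str] = None   # kept as text; not validated here
    strand: Optional[str] = None


def parse_bed(lines: Iterable[str]) -> Iterator[BedRecord]:
    """Yield a BedRecord for each feature line, skipping header lines."""
    for line in lines:
        line = line.strip()
        # Skip blank lines, comments, and browser/track header lines.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        fields = line.split()
        if len(fields) < 3:
            raise ValueError(f"BED line needs at least 3 fields: {line!r}")
        chrom, start, end = fields[0], int(fields[1]), int(fields[2])
        # Columns 4 to 6 (name, score, strand) are optional.
        yield BedRecord(chrom, start, end, *fields[3:6])


# Hypothetical sample data, not taken from a real annotation.
example_lines = [
    'track name=example description="hypothetical track"',
    "chr2\t100\t250\tfeature_a\t0\t+",
    "chr2\t300\t420",
]

records = list(parse_bed(example_lines))
for record in records:
    # Because the end coordinate is exclusive, length is end - start.
    print(record.chrom, record.start, record.end,
          record.name, record.strand, record.end - record.start)
```

Note that BED start coordinates are 0-based and the end coordinate is exclusive, which is why the feature length is simply `end - start`. Keep this in mind when comparing BED positions with 1-based formats such as GFF or VCF.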
