- Scala Machine Learning Projects
- Md. Rezaul Karim
- 2021-06-30 19:05:44
ADAM for large-scale genomics data processing
Analyzing DNA and RNA sequencing data requires large-scale data processing to interpret the data in its context. Excellent tools and solutions have been developed in academic labs, but they often fall short on scalability and interoperability. ADAM addresses this gap: it is a genomics analysis platform with specialized file formats, built on Apache Avro, Apache Spark, and Apache Parquet.
Large-scale data processing solutions such as ADAM-Spark can be applied directly to the output of a sequencing pipeline, that is, after quality control, mapping, read preprocessing, and variant quantification on single-sample data. Examples of such output are DNA variants for DNA sequencing and read counts for RNA sequencing.
In our study, ADAM serves as the scalable genomics analytics platform. Its support for the VCF file format lets us load genotype data as an RDD and then transform that genotype-based RDD into a Spark DataFrame.
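To make the genotype-to-table idea concrete, here is a minimal sketch in plain Scala of how VCF records can be flattened into one row per variant and sample. It deliberately uses neither ADAM nor Spark, so it is not how ADAM loads VCF files. The names `GenotypeRow` and `VcfSketch` are hypothetical, and the parser assumes tab-separated VCF text with a `#CHROM` header line and a `GT` key in the FORMAT column.

```scala
// Simplified illustration only: no ADAM or Spark dependency.
// GenotypeRow and VcfSketch are hypothetical names, not part of ADAM.
final case class GenotypeRow(
  contig: String,
  position: Long,
  ref: String,
  alt: String,
  sampleId: String,
  alleles: String // raw GT value, for example "0/1"
)

object VcfSketch {
  /** Flatten VCF text into one row per (variant, sample) pair. */
  def parse(lines: Seq[String]): Seq[GenotypeRow] = {
    // The "#CHROM" header line names the samples from column 10 onward.
    val samples: Seq[String] = lines
      .find(_.startsWith("#CHROM"))
      .map(_.split("\t").drop(9).toSeq)
      .getOrElse(Seq.empty)

    lines
      .filterNot(l => l.startsWith("#") || l.trim.isEmpty)
      .flatMap { line =>
        val fields = line.split("\t")
        // Column 9 (FORMAT) lists the per-sample keys; locate GT among them.
        val gtIndex =
          if (fields.length > 8) fields(8).split(":").indexOf("GT") else -1
        if (gtIndex < 0) Seq.empty[GenotypeRow]
        else
          samples.zip(fields.drop(9)).map { case (sample, call) =>
            val parts = call.split(":")
            // Fall back to "." (missing) when the call has no GT entry.
            val gt = if (parts.length > gtIndex) parts(gtIndex) else "."
            GenotypeRow(fields(0), fields(1).toLong, fields(3), fields(4), sample, gt)
          }
      }
  }

  def main(args: Array[String]): Unit = {
    // Tiny in-memory VCF with one variant and two samples.
    val vcf = Seq(
      "##fileformat=VCFv4.2",
      "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tS1\tS2",
      "1\t100\t.\tA\tG\t50\tPASS\t.\tGT:DP\t0/1:12\t1/1:9"
    )
    parse(vcf).foreach(println)
  }
}
```

In a Spark application, a collection or RDD of case-class rows like these can typically be turned into a DataFrame with `toDF()` after importing `spark.implicits._`. With ADAM, the hand-written parsing step would be replaced by the library's own VCF loader.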