Journal article
CGAT-core: a python framework for building scalable, reproducible computational biology workflows
- Abstract:
- In the genomics era, computational biologists regularly need to process, analyse and integrate large and complex biomedical datasets. Analysis inevitably involves multiple dependent steps, resulting in complex pipelines or workflows, often with several branches. Large data volumes mean that processing needs to be quick and efficient, and scientific rigour requires that analysis be consistent and fully reproducible. We have developed CGAT-core, a Python package for the rapid construction of complex computational workflows. CGAT-core seamlessly handles parallelisation across high-performance computing clusters, integration of Conda environments, full parameterisation, database integration and logging. To illustrate our workflow framework, we present a pipeline for the analysis of RNAseq data using pseudo-alignment.
- Publication status:
- Published
- Peer review status:
- Peer reviewed
Access Document
- Files:
- Version of record (pdf, 607.1KB)
- Publisher copy:
- 10.12688/f1000research.18674.2
Authors
- Publisher:
- F1000Research
- Journal:
- F1000Research
- Volume:
- 8
- Article number:
- 377
- Publication date:
- 2019-04-04
- Acceptance date:
- 2019-04-04
- DOI:
- 10.12688/f1000research.18674.2
- ISSN:
- 2046-1402
- Language:
- English
- Keywords:
- Pubs id:
- pubs:997713
- UUID:
- uuid:467be341-ca5e-4b95-b313-97fa70ad102e
- Local pid:
- pubs:997713
- Source identifiers:
- 997713
- Deposit date:
- 2019-05-12
Terms of use
- Copyright holder:
- Cribbs et al.
- Copyright date:
- 2019
- Notes:
- © 2019 Cribbs AP et al. This is an open access article distributed under the terms of the Creative Commons Attribution Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
- Licence:
- CC Attribution (CC BY)