Why is there a need for a cloud infrastructure for genomics?
In Germany, no single university or research center currently has the infrastructure needed to analyze the large datasets now becoming available in the life sciences, or to store and access these data securely. Further, a lack of standardization in computational analysis workflows renders data processed at different institutions essentially non-comparable. We thus propose a cloud computing model based on pooled resource utilization, through responsible sharing of IT infrastructure, and on pre-defined services that make it easier for non-experts to take part. This Genome-Cloud will serve as a cloud for high-throughput data in the life sciences in Germany. The development of the Genome-Cloud would follow recommendations by the Leopoldina, which caution that Germany can only remain competitive by strategically setting up a national “omics” and IT infrastructure that links universities with non-university institutions and bundles expertise in interdisciplinary research.
What advantages does the Genome-Cloud have for researchers?
The Genome-Cloud will have several key advantages:
- The Genome-Cloud will make genomic analysis with state-of-the-art tools widely accessible (to experts as well as non-experts), providing bioinformatics processing capabilities to numerous users in Germany.
- Pre-configured pipelines and modern computing infrastructure will become available to the German community through the Genome-Cloud, facilitating state-of-the-art genomic analyses (Software-as-a-Service, SaaS, and Infrastructure-as-a-Service, IaaS, models).
- The use of standardized analysis pipelines enabled by the Genome-Cloud will additionally facilitate integrative analyses and meta-analyses, by improving the comparability of datasets generated at different institutions.
- Through standardized data access control and centralized data storage, the Genome-Cloud will further yield improved data protection and, once widely applied, will reduce the need to duplicate commonly used datasets.
- Resource sharing, and the avoidance of duplicated infrastructure, will eventually lead to a reduction in overall infrastructure and operational costs.
Are the data safe and how will the data be protected?
The Genome-Cloud’s data security plan, and especially the protection of data in the cloud, will comply with stringent German data protection regulations and standards. Standardized data access control and centralized data storage will confer improved data protection. Furthermore, once in use, the Genome-Cloud will decrease the need for dataset duplication (reducing this specific data security risk).
Will the Genome-Cloud focus only on a particular type of research, such as cancer research?
Although the Genome-Cloud will initially be set up with cancer datasets, for which the present need is particularly high given the amount of data already available, we will ensure its utility across all areas of the life sciences.
Does the Genome-Cloud have broader positive effects?
The development of the Genome-Cloud would follow recommendations by the Leopoldina, which caution that Germany can only remain competitive by strategically setting up a national “omics” and IT infrastructure that links universities with non-university institutions and bundles expertise in interdisciplinary research. Beyond this, the implementation of our Genome-Cloud model may also have positive effects on economic development. A strong commitment to cloud computing in Germany will help ensure international competitiveness, improve job security, and create incentives for the biotech industry.
Who will be users of the Genome-Cloud?
The Genome-Cloud will make genomic analysis with state-of-the-art tools widely accessible, providing bioinformatics processing capabilities to numerous users in Germany. Among the users will be bioinformaticians, clinicians, and omics and data scientists.
- Bioinformaticians, computer scientists and software engineers will develop new computational workflows (or improve existing ones) to facilitate data analysis and interpretation using the Genome-Cloud. Analysts will be able to employ custom analysis workflows to process genetic data stored within the cloud, using the Genome-Cloud’s Infrastructure-as-a-Service (IaaS) model.
- Life scientists, clinicians, diagnostic laboratories, and members of the biotech and pharmaceutical industries will make use of the Genome-Cloud’s Software-as-a-Service (SaaS) model to employ pre-configured analysis pipelines. Data analysis may be accompanied by the upload of new data, and will benefit from the use of data already available in the Genome-Cloud.
- Basic researchers and clinicians will be able to consult the Genome-Cloud’s data portal to investigate existing preprocessed data.