CAMBRIDGE, UK, 3rd May 2018 – PetaGene, the maker of award-winning genomics data compression solutions, today launched PetaSuite Cloud Edition, a tool that combines two innovations: (i) the ability for a user’s software tools and pipelines to integrate seamlessly with a wide variety of cloud platforms without modification, and (ii) significantly improved, high-performance, scalable PetaSuite genomic compression technology.
For example, users can now run their custom BWA-MEM, GATK, Python, Java, shell scripts, and other POSIX-based software and pipelines without modification, streaming directly to and from AWS, Google Cloud, Azure, and private cloud storage as though these were local filestores. PetaSuite CE supports each platform’s object encryption during transfer and at rest. User applications can connect to multiple cloud platforms, buckets and regions as desired, transparently and on demand, in user mode, without needing to modify their pipelines, set up mounts, or have administrator privileges.
Whether running on bare metal, in VMs, or within Docker containers, for public, private or hybrid cloud, PetaSuite CE enables organizations to unlock the power of distributed object storage seamlessly from their POSIX-compliant tools and pipelines.
PetaSuite CE is built from the ground up for the extremely high-performance streaming and random-access workloads demanded by genomics applications. The integrated, transparent PetaGene compression has been significantly improved to deliver even faster compression and greater size reductions of up to 6x for both BAM and FASTQ.GZ files, enabling large savings in cloud storage costs and data transfer times. Moreover, PetaGene compression can also preserve the MD5 checksum of the original BAM or FASTQ.GZ file, not just that of the internal raw SAM/FASTQ data.
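To illustrate what preserving the MD5 checksum means in practice, the following is a minimal sketch of how a user might confirm that a restored BAM or FASTQ.GZ file is byte-for-byte identical to the original. It uses only Python's standard library and is not part of PetaSuite; the function name `md5_of_file` and the file names in the usage note are hypothetical.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks.

    Hypothetical helper for illustration only; it is not a PetaSuite API.
    """
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large genomics files never need to
        # be held in memory all at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

A user could, for example, compare `md5_of_file("original.bam")` with `md5_of_file("restored.bam")` (both file names are placeholders); equal digests indicate that the container file itself, and not only the decoded SAM records, is unchanged.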
PetaGene software addresses the challenges caused by growing volumes of genomics data. Developed by an award-winning team from the University of Cambridge, PetaGene grew out of a project exploring new storage and compression approaches in collaboration with the European Bioinformatics Institute. It achieves up to a 6x reduction in both storage costs and data transfer times compared to BAM and gzipped FASTQ files, which corresponds to a 96% reduction compared to raw FASTQ files. It transparently integrates with existing storage infrastructure and bioinformatics pipelines.
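The relationship between the two headline figures can be checked with a short calculation. The sketch below assumes that the 6x and 96% figures describe the same data and that both are best-case ("up to") values; the implied gzip ratio it derives is an inference from those two numbers, not a figure stated by PetaGene. The function names are illustrative.

```python
def percent_reduction(ratio):
    """Convert a compression ratio (e.g. 6 for 6x) to a percentage size reduction."""
    return (1.0 - 1.0 / ratio) * 100.0


def ratio_from_percent(percent):
    """Convert a percentage size reduction to the equivalent compression ratio."""
    return 1.0 / (1.0 - percent / 100.0)


# Reduction relative to BAM / gzipped FASTQ at the stated 6x ratio.
vs_gzipped = percent_reduction(6.0)

# Overall ratio relative to raw FASTQ implied by the stated 96% reduction.
overall_ratio = ratio_from_percent(96.0)

# Inferred (not stated in the announcement): the gzip ratio that would
# make the two figures consistent with each other.
implied_gzip_ratio = overall_ratio / 6.0

print(f"6x smaller than gzipped input: {vs_gzipped:.1f}% reduction")
print(f"96% reduction vs raw FASTQ: about {overall_ratio:.0f}x overall")
print(f"implied gzip ratio on raw FASTQ: about {implied_gzip_ratio:.2f}x")
```

Under these assumptions, a 6x reduction relative to gzipped input is roughly an 83% reduction, and a 96% reduction relative to raw FASTQ is roughly 25x, which is consistent with gzip alone shrinking raw FASTQ by a little over 4x.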