Cost & Time Savings

Storage cost and data transfer times
reduced by 60-90%


Lossless Streaming Compression

Full validation and MD5 matching


FIPS 140-2 compliant AES-256
regional encryption


Transparent Usage

Access compressed files in their original format. Your tools and pipelines won’t even know that anything has changed.


Speeds up Analysis

Reduces I/O, which dominates performance

Audit data use

Searchable cryptographic ledger of
how the data is accessed and used.


No lock-in

Free updates and decompression tools. Distribute your compressed data to others.


Easy IT Deployment

Software and tools run in user mode. No security issues. No sysadmin headaches.

Want to know how much you can save?

Award Winning Innovation

PetaGene has won “Best of Show” at Bio-IT World, the premier conference for IT in the Life Sciences, three times. In 2016, PetaSuite won against 46 competitors in the category for optimising speed and storage. In 2018, PetaSuite Cloud Edition (CE) won in the infrastructure and hardware category. In 2019, PetaGene’s latest security innovation, PetaSuite Protect, won the ‘Nailed It’ award against 30 competing products.

“The judges chose a new product that could give you millions of dollars worth of storage savings right now, a product that several of our judges wanted to go buy immediately after lunch.”

Allison Proffitt

Boston, USA
Editorial Director of Bio-IT World


“By using PetaSuite compression software for our data we have achieved our primary aim of dramatically increasing our storage capacity. This means that we do not need to spend precious resources on replacing or adding to it. The PetaGene team were responsive to our needs, including managing the demands of using IGV to efficiently access the compressed data via Apache server without decompressing the data first.”

Per Sikora

Gothenburg, Sweden
Head of Facility, Clinical Genomics Gothenburg

“We were looking for a reliable NGS compression solution that we could quickly deploy at scale on our large cluster and would allow us to reduce our tier 1 storage needs. We were under time pressure to decide on a solution for funding reasons and PetaGene was willing to go the extra mile to help us. We decided to go with PetaGene as they offer transparent on-the-fly decompression and we estimated that there would be an overall cost saving compared to other solutions.”

Dr Christophe Trefois

Belvaux, Luxembourg
Technical Specialist at Luxembourg Centre for Systems Biomedicine, University of Luxembourg

“Handling the enormous amount of data we receive from genome sequencing is a huge challenge in our group as we analyse data from more than 10,000 human genomes... PetaGene’s solutions allow us to easily store, use and visualise the sequencing data at a fraction of the cost.”

Dr Chris Penkett

Cambridge, United Kingdom
Head of Pipelines for the 10K NIHR Rare Disease Genomes Project, NHS Blood and Transplant & University of Cambridge

Customers and Collaborators

Table showing the size of files created using FASTQ.gz, BAM, CRAM and PetaGene compression

How Does it Work?

  • PetaGene supplies multi-threaded Linux software (PetaSuite) for you to use to losslessly compress your BAM and FASTQ.gz files for savings of between 60% and 90%, whether on-premises or in the cloud.
  • You never need to decompress the files: our software comes with a user-mode shim (PetaLink) that performs efficient, random-access, on-the-fly decompression in memory, so the files appear under their original filenames in their original format. Because less data has to be read from storage, this also improves performance.
  • The Cloud Edition of PetaSuite even allows you to transparently migrate your pipelines to the cloud and/or access remote data as if it is local without downloading it first.
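
To make the 60% to 90% range above concrete, here is a rough back-of-the-envelope sketch in Python. It is not part of PetaSuite: the function names and the 500 TB example archive are hypothetical, and the only figures taken from this page are the 60% and 90% savings bounds. Actual savings depend on your data.

```python
def compressed_size_tb(original_tb: float, saving: float) -> float:
    """Return the size left after a fractional saving (0.60 means 60% smaller)."""
    if not 0.0 <= saving < 1.0:
        raise ValueError("saving must be a fraction in [0, 1)")
    return original_tb * (1.0 - saving)


def footprint_range_tb(original_tb: float, low: float = 0.60, high: float = 0.90):
    """Return (best_case, worst_case) footprint for the quoted savings range.

    The defaults are the 60% and 90% bounds quoted on this page; everything
    else is an illustrative assumption.
    """
    return compressed_size_tb(original_tb, high), compressed_size_tb(original_tb, low)


if __name__ == "__main__":
    # Hypothetical archive of 500 TB of BAM and FASTQ.gz files.
    best, worst = footprint_range_tb(500.0)
    print(f"500 TB would shrink to between {best:.0f} TB and {worst:.0f} TB")
```

The same arithmetic applies to transfer times over a fixed-bandwidth link, since the time to move a file scales with its size.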

Join us at ISC High Performance 2022 (June 2022)

Join us at ISC High Performance 2022 in Hamburg. PetaGene will be exhibiting at ISC-HP 2022 in Hamburg from 29th May to 2nd June. The event returns to Hamburg after 10 years - just in time for our new product announcements! Come by Booth C305 to meet the PetaGene team and learn about our …

Join us at Bio-IT World Conference (May 2022)

Join us at the 20th annual Bio-IT World Conference & Expo. PetaGene will be attending the 20th annual Bio-IT World Conference and Expo from the 3rd to the 5th of May 2022 in Boston. It’s a great opportunity to meet and tell us about the challenges you face when storing and working with NGS genomic …

PetaGene’s customers have compressed over 3 million genome files

PetaSuite compression savings continue to grow. We are pleased to announce another landmark: PetaGene’s customers have now compressed over three million genome files. As genomic data sets continue their rapid growth, PetaGene customers across the full range of genomic research and applications leverage PetaSuite's high compression ratios to limit their storage …

Case Study: Top 3 U.S. Children’s Hospital Deploys PetaSuite

Deploying PetaGene’s lossless compression for genomic data at a premier Children’s Hospital in the United States. Understanding the genomic information of a patient is key in diagnosing a plethora of genetic and rare diseases. As a result, genome sequencing approaches such as Whole Exome Sequencing (WES) and Whole Genome Sequencing …

Architecting IT puts PetaGene in a Data-Centric Spotlight

Architecting IT's Chris Evans, co-host of the Storage Unpacked podcast, takes to his blog to focus a spotlight on PetaGene's technology. In this article from his ongoing series on data-centric architectures, Chris takes a look at PetaGene’s PetaSuite Cloud and Protect platform as another way to securely access content and accelerate remote access.

Managing NGS Data, a Dell and PetaGene healthcare podcast

Recently our co-founder Vaughan Wittorff and Phil Sweeney from Dell Technologies sat down to discuss how the use of Next-Generation Sequencing is expanding as the costs are coming down, creating an explosion of NGS processing and resulting data. Find out how PetaGene can address the demands of that scale of data, in a two-part Dell …

HISAT2 benchmarked with PetaGene’s compression and transparent readback tools

HISAT2 (Hierarchical Indexing for Spliced Alignment of Transcripts 2) is a graph-based read mapping tool for both DNA and RNA sequences. HISAT2 enables a fast search through its graph index, mapping reads to the entire human genome along with a large number of variants. Since it is a widely used tool, at PetaGene we have …

Technology Networks report on PetaGene’s insights on the challenges faced by Genome UK

PetaGene's Dan Greenfield and Vaughan Wittorff talk to Technology Networks regarding the recent Genome UK announcement and PetaGene's insights. Dan and Vaughan sat down to talk with science writers Ruairi J MacKenzie and Molly Campbell of Technology Networks to discuss Genome UK and our insights into the endeavour.

PetaGene’s customers have now compressed one million genome files

For PetaGene, the one million genome era is underway. We are pleased to announce a landmark: PetaGene’s customers have now compressed over one million genome files. The dramatic drop in the cost of sequencing genomes and the numerous applications of this data to tackle critical diseases such as cancer and rare diseases …

PetaGene – Genome UK considerations

Genome UK is an exciting and ambitious new strategy that builds upon the UK's world-leading excellence in genomics to build "the most advanced genomic healthcare system in the world". We applaud this initiative, and believe this will enable the NHS to leverage precision medicine to improve outcomes and reduce costs, significantly drive research into new …

NVIDIA and PetaGene Combine Genomic Technologies to Address Critical Analysis Bottlenecks

PetaGene and NVIDIA announce seamless integration of PetaGene’s PetaSuite tools as a standard part of NVIDIA Clara Parabricks Pipelines. PetaGene’s transparent compression reduces file sizes by 60-90%, and enables Parabricks Pipelines GPU-accelerated genome analysis to run 29% faster. Cambridge, UK, Oct. 6, 2020: PetaGene and NVIDIA today announce their integrated bioinformatics solution to accelerate genomic …

Join us at Biodata World Congress 2020

PetaGene will be attending the Biodata World Congress 2020 virtually from the 9th to the 12th of November... It’s a great opportunity to meet and tell us about the challenges you face when storing and working with NGS genomic data. Come along to discuss how our dramatic compression ratios, combined with the right storage architecture, …

Join us at the 19th annual Bio-IT World Conference

PetaGene will be attending the 19th annual Bio-IT World Conference and Expo virtually from the 6th to the 8th of October... It’s a great opportunity to meet and tell us about the challenges you face when storing and working with NGS genomic data. This year we act as a Gold Sponsor to the event, …

Six signs that you could benefit from compressing your genomic data

Genomic data files, whether BAM or FASTQ.gz format, are large and make huge demands on IT infrastructure. But, how can you tell if the challenges you face can be solved by compressing your genomic data with PetaGene technology? Here’s a list of six scenarios where lossless compression with transparent read-back will help. 1. Your storage …

Alliance Global Appointed Exclusive PetaGene Distributor For Middle East, Africa and Central Asia

We are pleased to announce that PetaGene has signed an agreement appointing Dubai-based Alliance Global (AGBL) as the exclusive distributor of our genomic data management software in the Middle East, Africa, Central Asia, Pakistan, Bangladesh and Sri Lanka. The number of national population-scale genomics initiatives in the region is growing and AGBL is a leading …

Join us at Biodata World Congress 2019 in Basel

PetaGene will be attending the Biodata World Congress in Basel, Switzerland from 4th to 5th December. It’s a great opportunity to meet and tell us about the challenges you face when storing and working with NGS genomic data. Come along to discuss how our dramatic compression ratios, combined with the right storage architecture, can help …

AstraZeneca deploys PetaSuite genomic data compression software in core genomics initiative

PetaGene’s PetaSuite compression software and cloud-computing solutions speed up data transfers and reduce storage costs for research projects involving genomics data. We are pleased to announce that AstraZeneca has selected PetaSuite software to compress the genomics data sets for AstraZeneca’s Centre for Genomics Research (CGR). Using genomics data and state-of-the-art methods for genomic analysis, the …

Princess Máxima Center for Pediatric Oncology chooses PetaSuite for genomic oncology data compression

We are pleased to announce that Princess Máxima Center for Pediatric Oncology, the largest pediatric cancer center in Europe, has chosen to use PetaGene’s transparent, lossless genomic data compression software, called PetaSuite, to reduce its data storage costs while accelerating access to the data. Next-generation sequencing plays an integral role in the Center’s diagnostics and research …

Genique Lifesciences appointed exclusive PetaGene distributor for India

We are pleased to announce that PetaGene has signed an agreement appointing Genique Lifesciences as the exclusive distributor of our genomic data management software in India. The agreement will allow India-based Genique Lifesciences to act as the exclusive sales channel for PetaGene’s genomic data compression software for the growing Indian market. Genique’s founding team has …

ASHG 2019 in Houston

We will be among the exhibitors (on booth #609) at the ASHG Annual Meeting 2019 in Houston, TX from 15th to 19th October. Each year the event attracts around 6,500 scientific attendees, plus 250 exhibiting companies. It’s the world’s largest gathering of human genetics professionals. The meeting provides a forum for the presentation and discussion …

New genomic data storage and analysis guide

One year ago, Front Line Genomics published Genomic Data 101, its guide to the technology and hardware landscape for genomic data storage and analysis. It proved a valuable primer for anyone looking to find out about compression and general management of genomic data. The data infrastructure to support genomic research, including compression, has evolved since the …

HIMSS Europe Conference 2019

PetaGene will be attending the HIMSS European Conference in Helsinki, Finland from 11th to 13th June. It’s a great opportunity to meet and tell us about the challenges you face when storing and working with NGS genomic data. Come along to learn more about how our dramatic compression ratios, combined with the right storage architecture, …

Case study: Optimizing genomic data storage for clinical research facilities

In clinical research, next generation sequencing (NGS) allows production of genomic data at an ever increasing rate. Storing genomic data effectively is critical, and while sequencing costs are falling, the cost of storage for the resulting files is increasing. As the amount of data sequenced grows, genomic data storage costs and transfer times can be …

Bio-IT World 2019 Best of Show winners

The latest addition to our product range, PetaSuite Protect, won “Best of Show” earlier this month at Bio-IT World Conference & Expo 2019, the premier conference for IT in Life Sciences. This is our third “Best of Show” win, previously winning in 2016 and 2018. This year, 31 new products were considered by an expert …

Solve your genomic data headaches at Bio-IT World

Bio-IT World Conference & Expo at the Seaport World Trade Center in Boston promises to be the biggest and best yet. Over 3,400 life sciences, clinical, healthcare, and IT professionals from over 40 countries are expected to attend from April 16th to 18th. Join us on booth #317 to talk about solving the challenges you …

PetaGene is hiring

Would you like to join a funded, award-winning and growing Cambridge start-up working in the increasingly vital field of genomic data? We are looking for developers and a business support administrator. For the developer roles you’ll need to be proficient in C/C++ and it would help if you’re comfortable working with algorithms. For the …

HIMSS Annual Conference and Exhibition

PetaGene will be attending the HIMSS 2019 show in Orlando from 12th to 14th February. It’s a great opportunity to meet and tell us about the challenges you face when storing and working with NGS genomic data. Come to our presentation titled ‘Scaling Genomics Workloads for Precision Medicine’ on 12th February at 3:30pm. It's …

Why Do Community Driven Genomic Data Standards Matter?

Genomics is a community-driven data science with existing data standards. The ability to exchange data and share results relies on a small number of common file formats, and on the software tools that read, process and generate data according to these conventions. Many of the common data formats are represented in flat text files; some …

Start solving your data challenges at ASHG 2018 in San Diego

Every year the ASHG annual meeting attracts the thought leaders in the field of human genetics. Will you be among them at ASHG 2018 in San Diego? If so, visit us on booth #819 during the three days of the exhibition, October 17th to 19th, to let us know about the data challenges you face. …

PetaGene at the Front Line Genomics Special Interest Group event

Michael Hultner, our SVP Strategy and General Manager, US Operations, recently attended a Special Interest Group (SIG) event organised by Front Line Genomics. SIGs bring together senior-level research, clinical and business professionals from across the genomics community to discuss relevant issues and work towards finding solutions to common problems. In the session on data security, privacy …

Second win at Bio-IT World.

We launched PetaSuite Cloud Edition (CE) at Bio-IT World 2018, the premier conference for IT in the Life Sciences. Its benefits for organisations working with genomic data, in terms of reduced storage cost, shorter data transfer times and quicker analysis, were immediately recognised by the judges of the Best of Show awards; they awarded it …

PetaGene, the maker of an award-winning genomics data compression solution, today launched PetaSuite Cloud Edition.

CAMBRIDGE, UK, 3rd May 2018 – PetaGene, the maker of an award-winning genomics data compression solution, today launched PetaSuite Cloud Edition, a tool that combines two innovations:

Everything you need to know about compression for genomic data

The cost of storing and transferring genomic data is rapidly growing as more and more of it is produced. According to researchers from the University of Illinois at Urbana-Champaign and Cold Spring Harbor Laboratory, genomic data from Next Generation Sequencing (NGS) will outstrip all other forms of big data for the sheer volume of size, …

Our Story

PetaGene started as a small hint of an idea: that a team of Cambridge University PhDs could together devise a novel approach to the problem of genomic data storage.