Start solving your data challenges at ASHG 2018 in San Diego

Posted on September 6, 2018
ASHG 2018 graphic - see us on booth 819

Every year the ASHG annual meeting attracts the thought leaders in the field of human genetics. Will you be among them at ASHG 2018 in San Diego? If so, visit us on booth #819 during the three days of the exhibition, October 17th to 19th, to let us know about the data challenges you face.

Our team has extensive experience in health information technologies, systems development and large-scale genomics. We’ll be delighted to discuss how to address the frustrations caused by ever-increasing volumes of genomic data.

Those challenges go beyond the simple cost of on-premises or cloud storage. There are also the transfer and analysis times for large data sets to consider. Speeding those up can give a great boost to your research. Our PetaSuite compression software integrates seamlessly with analysis pipelines, reduces I/O demand and facilitates collaboration.

We’ll also be hosting a session in CoLab Theater 3 on Thursday, October 18th, from 4:00 to 4:15pm on how appropriate compression technology can benefit commercial and research organisations working with genomic data.

If you’re attending ASHG 2018 in San Diego, drop by booth #819 or book a meeting in advance of the show to make sure we fit into your schedule. We look forward to seeing you there.

If you’d like to know more about the ASHG meeting or register to attend, visit the website.

PetaGene at the Front Line Genomics Special Interest Group event

Posted on August 28, 2018

Michael Hultner, our SVP Strategy and General Manager, US Operations, recently attended a Special Interest Group (SIG) event organised by Front Line Genomics. SIGs bring together senior-level research, clinical and business professionals from across the genomics community to discuss relevant issues and work towards finding solutions to common problems.

In the session on data security, privacy and consent, the subjects of sharing datasets and security provided opportunities to explain how compression technology can help. In this blog post Michael shares his insights on how access, safeguarding and cloud storage security relate to compression of genomic data.

Sharing datasets

  • Accessing data is difficult for researchers and can take a long time.

    Lack of easy access to a dataset, or information about it, is a significant reason why research projects are time-consuming. The size of the files is a major factor in this. Genomic files can present challenges which regular data storage systems are not set up to solve. It is possible to store data using compression formats which take into account the specific nature of genomic data. This makes life easier for researchers by decreasing transfer and access times. It is also possible to speed up analysis thanks to lower I/O demands.


  • Data is best protected using standard safeguards.

    While compressing data by itself doesn’t make the files any more or less secure, the benefits of compression can help to enable better security or make adopting best practice simpler. Requiring researchers to travel to where the data is stored in order to access it is a common approach for data stored on-premises. This means that the organisation holding the data cannot enjoy the benefits of cloud storage. It also places demands on the individual researcher and their institution, whether academic or commercial, that might not be practical. Compressing genomic data using appropriate tools gives the flexibility to enable data sharing and collaboration without exposing it to avoidable security risks.


  • There are still many misconceptions about the security of the cloud.

    Security worries are the reason why some research institutions store their data on their own hard drives. These are then transported to individual laboratories. In the age of GDPR and protected health information, the thought of hard drives containing genomic datasets being transported by individual researchers is probably enough to give data stewards sleepless nights. Despite developments in hard drive technology, it’s an impractical approach for today's genomic datasets. A better technique would be to use established data storage solutions in the cloud or on-premises. That approach allows appropriate access and sharing protocols to be set up as well as suitable backup and restore options should the worst happen. In this case, compression reduces the cost of these established storage solutions. And if the right kind of compression is used, there is no need to change existing pipelines or bioinformatics systems.

If you’d like to know more about how PetaGene can help with your genomic data management, use the contact form on the site or contact Michael at

Second win at Bio-IT World

Posted on June 12, 2018
Bio-IT World Best of Show 2018 winner logo
The PetaGene team celebrating their win
From left: Vaughan Wittorff, Co-Founder & Chief Commercial Officer; Dan Greenfield, Co-Founder and CEO; Mar Sanchez Gadea, Business Support Administrator and Michael Hultner SVP Strategy & GM US Operations

We launched PetaSuite Cloud Edition (CE) at Bio-IT World 2018, the premier conference for IT in the Life Sciences. Its benefits for organisations working with genomic data, in terms of reduced storage cost, shorter data transfer times and quicker analysis, were immediately recognised by the judges of the Best of Show awards, who awarded it top prize in the storage infrastructure and hardware category. This is the second time we have won this influential award. The original version of PetaSuite picked up Best of Show in the category for optimising speed and storage at the same event in 2016.

In her award citation for PetaSuite CE, Allison Proffitt, Bio-IT World Editor, said: “the judges were very impressed with an offering that lets users access data compression, objects in storage or an s3 bucket, all from the command line.”

PetaSuite CE allows a user’s software tools and next generation sequencing (NGS) pipelines to integrate seamlessly with a wide variety of cloud storage platforms without modification. Amazon Web Services (AWS), Microsoft Azure, Google Cloud Platform (GCP), private cloud and hybrid cloud are all supported transparently.

PetaSuite CE also delivers significantly improved, high-performance, scalable genomic compression technology. Lossless compression ratios of up to 10x for GATK BAM files are now possible. Other NGS file types such as FASTQ.gz can now also achieve much better compression ratios: up to 4.3x for FASTQ.gz files. This represents a potential space saving of 77%, which brings dramatic cost and transfer time reductions without compromising the quality of the original genomic data.
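The 77% figure follows directly from the 4.3x ratio: a file compressed by a factor of r occupies 1/r of its original size, so the saving is 1 - 1/r. The short Python sketch below works through that arithmetic for the two "up to" ratios quoted above. The function name is illustrative only and is not part of PetaSuite, and real-world ratios will depend on the data being compressed.

```python
def space_saving(ratio: float) -> float:
    """Fraction of the original size saved at a given compression ratio."""
    if ratio <= 0:
        raise ValueError("compression ratio must be positive")
    return 1.0 - 1.0 / ratio


# Upper-bound ratios quoted in the post; actual results vary by dataset.
quoted_ratios = [("FASTQ.gz", 4.3), ("GATK BAM", 10.0)]

for file_type, ratio in quoted_ratios:
    print(f"{file_type}: {ratio}x -> {space_saving(ratio):.0%} smaller")
```

At 4.3x the saving rounds to 77%, and at 10x it is 90%.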

Dan Greenfield, our co-founder and CEO, said: “We are honoured to win this prestigious award for the second time. The fact that the judges acknowledged the importance of our seamless cloud integration, with its implications for scaling and collaboration, was particularly pleasing. We will continue to strive to create solutions which speed up cooperation and analysis for our research and diagnostic customers.”

Find out more about PetaSuite Cloud Edition on the products page or get in touch via our contact page.

Everything you need to know about compression for genomic data

Posted on May 3, 2018

The cost of storing and transferring genomic data is growing rapidly as more and more of it is produced. According to researchers from the University of Illinois at Urbana-Champaign and Cold Spring Harbor Laboratory, genomic data from Next Generation Sequencing (NGS) will outstrip all other forms of big data in sheer volume, yielding even more data than the current champions of big data.

Save time and money by compressing genomic data files
  • What options exist to make costs and transfer times more manageable for genomic data files?
  • Does storing data in the cloud present its own specific problems?
  • Should bioinformaticians and IT professionals be looking for benefits beyond merely reducing storage costs and the time taken to transfer genomic data files?
  • Are all compression techniques created equal? Or do compression techniques developed specifically for genomic data yield enhanced benefits?
  • What are the innovations and developments in genomic compression that have made life easier for those working with NGS data?

Read this paper from Front Line Genomics to discover everything you need to know about genomic data compression:

Our Story

Posted on April 5, 2018
PetaGene started as a small hint of an idea: that a team of Cambridge University PhDs could together devise a novel approach to the problem of genomic data storage.