The declining cost of sequencing a human genome (in some instances, as little as $1,000 per genome) is creating a growing demand for tools that can sort, organize, analyze, and store the ever-increasing amount of biological data. If properly harnessed, this data can be used for academic research and by the pharmaceutical and diagnostic industries. Bioinformatic algorithms address this need: they permit scientists to encode specific information, align the data, and analyze it for gene expression, all of which lead to actionable insights.
Yet pure bioinformatic investment opportunities are scarce. Tools are often open-source (and free), and genetic databases can be found online (also for free).
That said, investors can participate indirectly in the dynamic bioinformatic space. The computing power, storage, and analytics required to research and analyze genomic data are orders of magnitude greater than what we have seen thus far during the age of computing. As interpolated from the graph below, it would take almost 60 million Apple [AAPL] 2014 iMacs to store all of the genomes sequenced in a year by just one HiSeq X Ten (Illumina Inc.’s [ILMN] most powerful sequencer). Flash-based storage companies, such as Fusion-io [FIO] (soon to be acquired by SanDisk [SNDK]), Nimble Storage, Inc. [NMBL], and Pure Storage, are part of the solution to this challenge.
Other beneficiaries include Illumina, Pacific Biosciences of California [PACB], and Compugen Ltd. [CGEN]. Illumina and PacBio are sequencing companies, while Compugen is an early-stage pharmaceutical company focused on drug discovery through in silico (computer-simulated) bioinformatic algorithms.
Illumina, whose machines account for 90% of all the base pairs of DNA sequenced in the world today, also incorporates analytics into its hardware platform. It hosts a cloud-based storage and analytics service called BaseSpace, which allows customers to upload and share their genetic data. BaseSpace also provides apps that can analyze user data with various bioinformatic algorithms.
Pacific Biosciences (PacBio) produces large, research-grade sequencers, which play an important role in sequencing longer strands of DNA. They are more suitable for discovery than for later stages of R&D. PacBio’s machines are relatively expensive, which limits the company’s addressable market. Yet their long read lengths make it possible to sequence multiple genes in a single read and support epigenetics research.
Compugen has created a robust bioinformatics platform, combining biology and in silico computation to predict and identify immune checkpoint inhibitors (ICIs). Research around ICIs is leading to a period of incredible discovery of drug targets, which could reverse the progression of cancer and autoimmune diseases. Compugen’s platform allows it to find these targets faster and at lower cost than many pharmaceutical companies can. For example, from 2009 to 2013 Compugen spent $41 million on research and development and discovered nine targets, or roughly $4.6 million per target. In contrast, large pharmaceutical companies frequently spend $200 million on only the first of three phases of drug development.