Genetics is changing the future of humanity. Whether in prenatal testing for genetic birth defects, tumor genetic testing, or genetic research on viruses and bacteria, gene sequencing can identify the "culprit".

As a result, the gene sequencing industry is entering an era of rapid development. According to relevant statistics, the compound annual growth rate of gene sequencing from 2007 to 2013 was 33.53%. The global market was only 8 million US dollars in 2007 and is projected to reach about $11.7 billion in 2018, which indicates that the gene sequencing market has matured.
Today, gene sequencing technology has been listed as a key national development industry. According to research, the volume of gene sequencing analysis will grow by more than 30% annually, and the amount of data will keep increasing. Transmitting, storing and managing such massive amounts of genetic data is a very difficult problem, which is why HPC is widely used in the gene sequencing industry.
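
To make these growth figures concrete, here is a minimal sketch of the compound growth arithmetic. It uses only the two rates quoted above (33.53% per year over 2007 to 2013, and 30% per year as the lower bound for analysis volume). The function names are illustrative and do not come from any cited source.

```python
import math


def growth_factor(annual_rate: float, years: int) -> float:
    """Total multiplier after compounding `annual_rate` for `years` years."""
    return (1.0 + annual_rate) ** years


def doubling_time(annual_rate: float) -> float:
    """Years needed for a quantity to double at a constant annual rate."""
    return math.log(2.0) / math.log(1.0 + annual_rate)


# Rate quoted for 2007 to 2013, treated here as six compounding years.
market_multiplier = growth_factor(0.3353, 6)

# Lower bound quoted for annual growth in sequencing analysis volume.
data_doubling_years = doubling_time(0.30)

print(f"Market multiplier over 6 years: {market_multiplier:.2f}x")
print(f"Doubling time at 30% per year: {data_doubling_years:.2f} years")
```

At a steady 30% per year, a volume doubles in under three years, which is why transmission and storage capacity have to be planned for growth rather than for the current load.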

How the young Nuohe Zhiyuan leads the gene sequencing industry

The gene sequencing industry attracts a great deal of attention and is crowded with new and established players. At the same time, it is a rigorous industry: only by starting from a scientific and sound footing, and then continuing through trial and error, can a company make progress and finally win. Nuohe Zhiyuan is clearly moving along this road.

In the domestic gene sequencing field, Nuohe Zhiyuan is an iconic enterprise. As a leader in gene sequencing in China, its business covers three major fields: technology services, tumor gene detection and genetic testing. It provides universities, research institutes, hospitals, pharmaceutical R&D companies and agricultural enterprises with services such as gene sequencing, mass spectrometry and bioinformatics support.

Founded in March 2011, Nuohe Zhiyuan initially focused on technology services. In 2012, Nuohe Zhiyuan began to expand its tumor genetic testing services. Up to now, Nuohe Zhiyuan has covered three major areas of technology services, tumor gene detection and genetic testing.

In fact, gene sequencing is a knowledge-intensive industry. A company in it can be measured by two criteria: its contribution to genetics research, and its access to advanced gene sequencers.

First, in terms of academic contribution: as of June 2018, Nuohe Zhiyuan and its project partners had published more than 330 SCI articles, with a cumulative impact factor greater than 2,120. It has also obtained 115 software copyrights and 49 independently developed patents.

Second, Nuohe Zhiyuan currently operates 25 NovaSeq, 20 PacBio Sequel, 30 HiSeq X, 11 HiSeq 2000/2500/4000, 4 MiSeq, 4 NextSeq 500, 6 Life Ion Proton (DA8600) and 2 S5XL instruments worldwide, along with 5 Q Exactive™ HF-X sets and other state-of-the-art equipment. Together these form the largest gene sequencing platform in Asia, with an ultra-high throughput equivalent to whole-genome sequencing for 280,000 people. At the same time, the company introduced the Q Exactive™ HF-X high-end mass spectrometer platform to build a leading bio-mass spectrometry center, providing customers with comprehensive and in-depth multi-omics solutions.

Beyond these two key points, Nuohe Zhiyuan has also built out a mature ecosystem, with partners all over the world, including more than 1,920 research institutes and universities, more than 720 hospitals, and more than 1,430 pharmaceutical and agricultural enterprises. The company's ambition is to become the world's leading provider of genomics products and services.

After algorithms and data, how can the three major computing bottlenecks be broken?

The core asset of gene sequencing is the huge amount of data generated by gene sequencers. As sequencing throughput increases, the industry produces more and more data, which places higher demands on the capacity of storage and computing platforms.

High-throughput gene sequencing uses sequencing technology to analyze the sequence characteristics of biological DNA, and involves high-performance computations such as sequence map construction, sequence alignment and mutation detection. In human health research in particular, it is necessary to understand the structure, function and interactions of genes and their relationship to various human diseases, and to seek methods of treatment and prevention, including drug therapy, for example drug design based on the structures of biological macromolecules and small molecules.

Bioinformatics therefore relies on a large amount of software, for example: sequence assembly with SOAPdenovo, ALLPATHS-LG, Falcon and Trinity; sequence alignment with BWA, BLAST and bowtie2; sequence analysis with CLUSTAL and HMMER; and phylogenetic tree analysis with PHYLIP, TreeBest and MrBayes.
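
To give a feel for what sequence alignment computes, the following toy sketch scores the best local alignment between two short sequences using the textbook Smith-Waterman dynamic program. It is not the method of any tool named above (BWA, for instance, relies on a Burrows-Wheeler index rather than a full dynamic-programming table), and the scoring values are arbitrary assumptions chosen for illustration.

```python
def local_alignment_score(a: str, b: str, match: int = 2,
                          mismatch: int = -1, gap: int = -2) -> int:
    """Best Smith-Waterman local alignment score between a and b.

    Only two rows of the table are kept, so memory is O(len(b)),
    while running time is still O(len(a) * len(b)).
    """
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            # Extend the diagonal with a match or a mismatch.
            step = match if a[i - 1] == b[j - 1] else mismatch
            diag = prev[j - 1] + step
            # A local alignment may restart anywhere, hence the floor at 0.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


print(local_alignment_score("TTACGTTT", "GGACGGG"))
```

Because the running time is proportional to the product of the two sequence lengths, comparing millions of reads against a reference genome in this naive way would be prohibitively expensive, which is one reason production tools use indexes and why alignment remains a heavy HPC workload.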

At the same time, biological algorithms are gradually maturing and biological data throughput is rising rapidly, which inevitably drives comprehensive optimization of analysis software and pipelines. Computing power has therefore become the biggest bottleneck facing the precision medicine industry. For Nuohe Zhiyuan, the demand for HPC brings several challenges.

First, the amount of data is huge. Because gene sequencers generate very large volumes of data, the HPC system must be configured with massive storage to hold the sequencing data.

Second, the demand for memory is large. In the sequence alignment or assembly stage, massive data must be loaded into memory and processed in one pass. If memory is insufficient or performs poorly, the alignment or the next computing step may not be possible. For bioinformatics application environments, it is therefore recommended to configure fat nodes or large-memory nodes to handle data loading and analysis and to improve work efficiency.

Third, the amount of computation is large. Different bioinformatics programs are based on different algorithms and place different demands on the CPU, but the overall computational load is very large. Some software supports parallel execution, while other software runs on a single node. In short, like other high-performance computing applications, bioinformatics computation is CPU-intensive.
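
The memory and compute pressure described above can be illustrated with k-mer counting, a common building block in sequence assembly. The sketch below is a toy under stated assumptions: the reads are made-up strings, the function name is hypothetical, and real assemblers typically use far more compact data structures than a Python dictionary.

```python
from collections import Counter


def count_kmers(reads, k):
    """Count every length-k substring (k-mer) across all reads."""
    counts = Counter()
    for read in reads:
        # A read of length n contributes n - k + 1 overlapping k-mers.
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


reads = ["ACGTACGT", "CGTACGTA", "GTACGTAC"]  # toy data, not real reads
kmer_counts = count_kmers(reads, k=4)

print(len(kmer_counts), "distinct 4-mers")
print(sum(kmer_counts.values()), "k-mers counted in total")
```

The table holds one entry per distinct k-mer and must stay in memory while every base of every read is visited. On real data sets both the table and the number of visits grow with sequencing depth and genome size, which is the kind of load that large-memory nodes are meant to absorb.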

Clearly, long-term stable HPC support will help Nuohe Zhiyuan's future development. After many investigations, Nuohe Zhiyuan chose Lenovo HPC as its service provider. How does Lenovo meet Nuohe Zhiyuan's needs?

Building the foundation with HPC: Lenovo, the power behind the scenes

As a leader in domestic HPC, Lenovo first carefully analyzed Nuohe Zhiyuan's problems. Lenovo concluded that the core workloads of bioinformatics computing are memory-intensive and storage-intensive and, drawing on its years of experience, provided a specialized solution for Nuohe Zhiyuan.

The solution mainly addresses four issues for Nuohe Zhiyuan: computing performance, memory, storage and stability.

First, high-performance computing depends on both floating-point performance and the overall performance of the CPU itself. Taking the characteristics of the bioinformatics industry into account, Lenovo recommended Intel processors, which in Lenovo's view not only deliver high processing performance but also offer advantages in energy efficiency, memory support and CPU architecture.

Second, in bioinformatics applications, loading data ahead of processing requires more and more memory capacity. For large-memory servers, Lenovo uses four-way or eight-way fat nodes, which can be configured with up to 2TB of memory in a single node to meet actual needs.

Third, a mass storage system is a prerequisite for bioinformatics computing. Lenovo can provide professional-grade direct-attached storage, and can also build parallel file systems or distributed storage systems from dedicated storage nodes, connected over Ethernet or even 40Gb/56Gb InfiniBand networks, with overall capacity at the petabyte level. Together with provisions for data security and data backup, this fundamentally addresses the data storage problem of bioinformatics.

Finally, a highly stable system makes bioinformatics applications more convenient and efficient, and allows data to be processed efficiently without interrupting the business. Through unified cluster monitoring and management and job scheduling, combined with Lenovo high-performance servers, Lenovo safeguards the stability of the whole system from several angles, greatly reducing the failure rate and providing users with continuous, uninterrupted support for improving productivity.

It is understood that Lenovo's high-performance computing system delivers nearly 200 trillion calculations per second and more than 10PB of storage space. This cluster carries Nuohe Zhiyuan's East China business well, effectively relieving the company's shortage of computing resources in East China.

Nuohe Zhiyuan now has a world-leading high-performance computing platform. Its data center computing capacity has been increased to 1,727 TFLOPS, with 410TB of total memory and 60.2PB of total storage, effectively supporting the data analysis and storage needs of its two major fields, life science research and medical health. On the road to exploring the future of genetics through high-performance computing, Lenovo HPC has always been Nuohe Zhiyuan's most trusted technical service provider.
