Lab



GAO GE LAB

Computational Genomics

Center for Bioinformatics, School of Life Sciences, Peking University

Research Interests

As biology increasingly turns into a data-rich science, the massive data sets generated by high-throughput technologies pose both opportunities and serious challenges. My team is primarily interested in developing novel computational techniques to analyze, integrate, and visualize high-throughput biological data effectively and efficiently, and in applying them to decipher the function and evolution of gene regulatory systems. Drawing on my background in computational sciences, my team specializes in large-scale data mining, using a combination of statistical learning, high-performance computing, and data visualization.

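With the widespread adoption of high-throughput biotechnologies, exemplified by deep sequencing, across the life sciences, biological big data of all kinds are emerging in large quantities and growing exponentially. These data hold a wealth of treasure: new biological principles and new discoveries. However, these massive, exponentially growing, and highly noisy data also bring enormous technical challenges for data analysis.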

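Building on the development of bioinformatics analysis techniques, methods, and platforms, the group combines big data and computational approaches such as statistical learning to integrate high-throughput genetics and functional genomics data, exploring the function and evolution of novel expression regulators and their contribution to new traits and new functions in organisms. The group's main research directions at present are 1) the regulation of stem cell fate determination by non-coding RNAs, and 2) the effect of adaptive gene gain and loss in the genome on the evolution of regulatory networks.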

Recent Peer-Reviewed Publications

Handle Biological “BIG DATA” Effectively and Efficiently (Integration and Mining of Biological Big Data)

1. Cheng S. J., Shi F. Y., Liu H., Ding Y., Jiang S., Liang N., Gao G. 2017. Accurately annotate compound effects of genetic variants using a context-sensitive framework. Nucleic Acids Res (in press).

2. Hou M., Tian F., Jiang S., Kong L., Yang D., Gao G.* 2016a. LocExpress: a web server for efficiently estimating expression of novel transcripts. BMC Genomics 17(13): 175-179. (Featured as Best Paper at InCoB’16)

3. Hou M., Tang X., Tian F., Shi F., Liu F., Gao G.* 2016b. AnnoLnc: a web server for systematically annotating novel human lncRNAs. BMC Genomics 17(1): 931.

4. Hu B., Jin J., Guo A. Y., Zhang H., Luo J., Gao G.* 2015. GSDS 2.0: an upgraded gene feature visualization server. Bioinformatics 31(8): 1296-1297. (Featured as ESI Highly Cited (Top 1%) Paper)

5. Xiao A., Cheng Z., Kong L., Zhu Z., Lin S., Gao G.*, Zhang B.* 2014. CasOT: a genome-wide Cas9/gRNA off-target searching tool. Bioinformatics 30(8): 1180-1182. (Featured as ESI Highly Cited (Top 1%) Paper)

6. Kong L., Wang J., Zhao S., Gu X., Luo J.*, Gao G.* 2012. ABrowse - a customizable next-generation genome browser framework. BMC Bioinformatics 13: 2. (Featured as “Highly Accessed”)

7. Wang J., Kong L., Zhao S., Zhang H., Tang L., Li Z., Gu X., Luo J.*, Gao G.* 2011. Rice-Map: a new-generation rice genome browser. BMC Genomics 12: 165. (Featured as “Highly Accessed”)

Decipher the Function and Evolution of Gene Regulatory Networks

1. Jin J., Tian F., Yang D. C., Meng Y. Q., Kong L., Luo J., Gao G.* 2017. PlantTFDB 4.0: toward a central hub for transcription factors and regulatory interactions in plants. Nucleic Acids Res 45(D1): D1040-D1045.

2. Jin J., He K., Tang X., Li Z., Lv L., Zhao Y., Luo J., Gao G.* 2015. An Arabidopsis Transcriptional Regulatory Map Reveals Distinct Functional and Evolutionary Features of Novel Transcription Factors. Mol Biol Evol 32(7): 1767-1773.

3. Zhao Y., Tang L., Li Z., Jin J., Luo J., Gao G.* 2015. Identification and analysis of unitary loss of long-established protein-coding genes in Poaceae shows evidences for biased gene loss and putatively functional transcription of relics. BMC Evol Biol 15: 66. (Featured as “Very Good (being of special significance in its field)” by Faculty of 1000)

4. Gao G., Vibranovski M. D., Zhang L., Li Z., Liu M., Zhang Y. E., Li X., Zhang W., Fan Q., VanKuren N. W., Long M., Wei L. 2014. A long-term demasculinization of X-linked intergenic noncoding RNAs in Drosophila melanogaster. Genome Res 24(4): 629-638.

5. Jin J., Zhang H., Kong L., Gao G.*, Luo J.* 2014. PlantTFDB 3.0: a portal for the functional and evolutionary study of plant transcription factors. Nucleic Acids Res 42(1): D1182-1187. (Featured as ESI Highly Cited (Top 1%) Paper)

6. Tang X., Hou M., Ding Y., Li Z., Ren L., Gao G.* 2013. Systematically profiling and annotating long intergenic non-coding RNAs in human embryonic stem cell. BMC Genomics 14(Suppl 5): S3. (Featured as “Highly Accessed”)

7. Chen Z. X., Zhang Y. E., Vibranovski M., Luo J., Gao G.*, Long M.* 2011. Deficiency of X-linked inverted duplicates with male-biased expression and the underlying evolutionary mechanisms in the Drosophila genome. Mol Biol Evol 28(10): 2823-2832.