Yongchao Liu, Dr.

Post-Doc (Wissenschaftlicher Mitarbeiter)

Room: 03-131
Johannes Gutenberg-Universität Mainz
Institut für Informatik
Staudingerweg 9

55128 Mainz, Germany

Office Phone: +49-6139-39-21013



Research Interests:

Parallel and Distributed Algorithm Design for Bioinformatics;

Heterogeneous Computing using General-Purpose GPUs;

High Performance Computing on Big Data


Research:

The following lists the software and algorithms associated with my publications. Three of my CUDA-based algorithms (CUDASW++, mCUDA-MEME and CUSHAW) have been rated as popular GPU-accelerated applications by NVIDIA, which currently places me at the top of its application list for Bioinformatics.

  • Next Generation Sequencing (NGS)
    1. CUSHAW: a software package for NGS read alignment on heterogeneous computing architectures:
      • CUSHAW (Release 1.x) is designed for CUDA-enabled GPUs and only provides support for ungapped alignments;
      • CUSHAW2 (Release 2.x) supports gapped read alignment and uses multi-threading on multi-core CPUs; it is among the best state-of-the-art NGS read aligners.
    2. DecGPU: the first parallel and distributed pre-assembly short read error correction algorithm using CUDA and MPI.
    3. Musket: a parallel and scalable multistage k-mer spectrum based error corrector for Illumina sequence data.
    4. PASHA: a parallelized short read assembler for large genomes, such as the human genome, using de Bruijn graphs.
  • Motif Discovery
    1. CUDA-MEME: a fast parallel motif finding algorithm based on the MEME algorithm (version 3.5.4) for a single GPU device using CUDA.
    2. mCUDA-MEME: a further extension of CUDA-MEME based on the MEME algorithm (version 4.4.0) for multiple GPUs using a hybrid combination of CUDA, MPI and OpenMP.
    3. CompleteMOTIFs: an integrated web tool (collaboratively developed with Harvard Medical School) to facilitate systematic discovery of over-represented transcription factor binding motifs from high-throughput chromatin immunoprecipitation experiments, using CUDA-MEME to accelerate motif discovery.
  • Sequence Alignment
    1. CUDASW++: the fastest parallel Smith-Waterman protein database search algorithm for GPGPUs using CUDA.
    2. MSAProbs: a new and practical multi-threaded multiple protein sequence alignment algorithm which produces the highest alignment accuracy compared to the existing leading aligners.


Awards:

  • Best Paper Award (ASAP 2009)

Activities:

Invited Journal Reviewing

  • Nature Methods
  • Bioinformatics
  • Genome Biology
  • PLOS ONE
  • BMC Bioinformatics
  • IEEE/ACM Transactions on Computational Biology and Bioinformatics
  • IEEE Transactions on Very Large Scale Integration Systems
  • The Journal of Signal Processing Systems
  • BMC Research Notes
  • Computers in Biology and Medicine
  • EURASIP Journal on Wireless Communications and Networking
  • Microelectronics Journal

Invited Conference/Workshop Reviewing

  • CCGrid 2012, 2013
  • ICPADS 2012
  • ICPP 2012
  • InPar 2012
  • PPAM 2011

Service on Editorial Board and Program Committees:

  • 19th IEEE International Conference on Parallel and Distributed Systems (ICPADS 2013)
  • 18th IEEE International Conference on Parallel and Distributed Systems (ICPADS 2012)
  • 4th Workshop on Emerging Parallel Architectures (in conjunction with ICCS 2012)
  • 2013 Workshop on Parallel Computational Biology (in conjunction with PPAM 2013)

Publications:

Liu, Y, Schroeder, J, and Schmidt, B (2013).
Musket: a multistage k-mer spectrum based error corrector for Illumina sequence data
Bioinformatics, 29(3):308-315.

Liu, Y, Wirawan, A, and Schmidt, B (2013).
CUDASW++ 3.0: accelerating Smith-Waterman protein database search by coupling CPU and GPU SIMD instructions
BMC Bioinformatics, 14:117.

Liu, Y and Schmidt, B (2012).
Evaluation of GPU-based seed generation for computational genomics using Burrows-Wheeler transform
In: 11th IEEE International Workshop on High Performance Computational Biology (HiCOMB 2012), pp. 684-690, IEEE.

Liu, Y and Schmidt, B (2012).
Long read alignment based on maximal exact match seeds
Bioinformatics, 28(18):i318-i324.

Liu, Y, Schmidt, B, and Maskell, DL (2012).
A fast CUDA compatible short read aligner to large genomes
In: GPU Technology Conference 2012 (GTC 2012), Poster.

Liu, Y, Schmidt, B, and Maskell, DL (2012).
CUSHAW: a CUDA compatible short read aligner to large genomes based on the Burrows-Wheeler transform
Bioinformatics, 28(14):1830-1837.

Kuttippurathu, L, Hsing, M, Liu, Y, Schmidt, B, Maskell, DL, Lee, K, He, A, Pu, WT, and Kong, SW (2011).
CompleteMOTIFs: DNA motif discovery platform for transcription factor binding experiments
Bioinformatics, 27(5):715-717.

Ligowski, L, Rudnicki, W, Liu, Y, and Schmidt, B (2011).
Accurate scanning of sequence databases with the Smith-Waterman algorithm
GPU Computing Gems:155-172.

Liu, W, Schmidt, B, Liu, Y, and Müller-Wittig, W (2011).
Mapping of the BLASTP algorithm onto GPU clusters
17th IEEE International Conference on Parallel and Distributed Systems (ICPADS 2011):236-243.

Liu, Y, Schmidt, B, and Maskell, DL (2011).
Parallelized short read assembly of large genomes using de Bruijn graphs
BMC Bioinformatics, 12:354.


Selected Talks and Tutorials:

  • Evaluation of GPU-based seed generation for computational genomics using Burrows-Wheeler transform, at the 26th IEEE International Parallel & Distributed Processing Symposium (IPDPS 2012), May 2012.
  • Long read alignment based on maximal exact match seeds, at the 11th European Conference on Computational Biology (ECCB 2012), September 2012.
  • Parallel and accurate gapped alignment for next-generation sequencing reads, at Tianjin Polytechnic University, Tianjin, China, October 2012.

Short Scientific CV:

Education:
  • 2008 - 2012 Ph.D. degree

    School of Computer Engineering, Nanyang Technological University (Singapore).

  • 2005 - 2008 Master's degree

     Computer Science and Technology, Nankai University (China).

  • 2001 - 2005 Bachelor's degree

     Computer Science and Technology, Nankai University (China).

Work Experience:
  • 11/2011 - present Post-Doc (Wissenschaftlicher Mitarbeiter)

    Institut für Informatik, Johannes Gutenberg-Universität Mainz (Germany).


Teaching:

  • Laboratory Supervisor for Data Structures & Object-Oriented Programming (AY09/10-Semester 2; AY10/11-Semester 1) at NTU, Singapore.
  • Tutorials for Data Mining (WS 2011/2012) at JGU, Germany.
  • Tutorials for Parallel Algorithms and Architectures (SS 2012) at JGU, Germany.
  • Tutorials for High Performance Computing (WS 2012/2013) at JGU, Germany.