
Metagenomics and Metaproteomics Analysis on High-Performance Computing

  • CloudOmega
A scalable overlap-graph de novo metagenome assembler built on Spark, a cloud-based framework for large-scale data processing.
Main developer (Spark, GraphX, Amazon Cloud, C++, Python)

An overlap-graph de novo metagenome assembler
Co-developer (C++, Python)
B. Haider, T.-H. Ahn, B. Bushnell, J. Chai, A. Copeland, and C. Pan, "Omega: an Overlap-graph de novo Assembler for Metagenomics", Bioinformatics, vol. 30, issue 19, pp. 2717-2722, 2014.
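To illustrate the general idea behind an overlap graph, here is a minimal Python sketch. It is not Omega's implementation: the function names, the `min_len` threshold, and the naive all-pairs comparison are assumptions chosen for readability. Each read is a node, and a directed edge from read i to read j records the length of the longest suffix of read i that matches a prefix of read j.

```python
def suffix_prefix_overlap(a, b, min_len):
    """Return the length of the longest suffix of `a` that equals a prefix
    of `b`, provided it is at least `min_len` long; otherwise return 0."""
    max_len = min(len(a), len(b))
    # Try the longest candidate overlap first and shrink until a match is found.
    for k in range(max_len, min_len - 1, -1):
        if a[-k:] == b[:k]:
            return k
    return 0


def build_overlap_graph(reads, min_len):
    """Build a directed overlap graph as an adjacency list.

    graph[i] holds (j, k) pairs: read i overlaps read j by k bases.
    This is a naive O(n^2) all-pairs comparison, for illustration only;
    min_len is assumed to be at least 1.
    """
    graph = {i: [] for i in range(len(reads))}
    for i, a in enumerate(reads):
        for j, b in enumerate(reads):
            if i == j:
                continue
            k = suffix_prefix_overlap(a, b, min_len)
            if k:
                graph[i].append((j, k))
    return graph


if __name__ == "__main__":
    # Hypothetical toy reads; real metagenomic reads are far longer.
    toy_reads = ["ATGGC", "GGCTA", "CTAAT"]
    print(build_overlap_graph(toy_reads, min_len=3))
```

A production assembler would replace the quadratic comparison with an index-based overlap search and would simplify the graph before extracting contigs.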

Strain-level genome identification algorithm for biosurveillance using high-performance computing.
Main developer (C++, Python, MPI, OpenMP, Supercomputer)
T.-H. Ahn, J. Chai, and C. Pan, "Sigma: Strain-level Inference of Genomes from Metagenomic Analysis for Biosurveillance", Bioinformatics, vol. 31, issue 2, pp. 170-177, 2014.

Sipros is a database searching program for shotgun proteomics.
Co-developer (C++, Python, MPI, OpenMP, Supercomputer)
Y. Wang, T.-H. Ahn, Z. Li, and C. Pan, "Sipros/ProRata: a versatile informatics system for quantitative community proteomics", Bioinformatics, vol. 29, no. 16, pp. 2064-2065, 2013.

Parallel Dynamic Load Balancing for Ensembles of Stochastic Simulation & Implicit Stochastic Simulation Algorithm for Chemical Kinetics

Improving the efficiency of the stochastic simulation algorithm (SSA) for chemical kinetics using numerical implicit methods.
Presenting a new probabilistic framework to analyze the performance of dynamic load balancing algorithms for ensembles of simulations.
Main developer (Fortran 90, Matlab)
T.-H. Ahn, A. Sandu, and X. Han, "Implicit Simulation Methods for Stochastic Chemical Kinetics", in press, Journal of Applied Analysis and Computation, 2015.
T.-H. Ahn, A. Sandu, L.T. Watson, C.A. Shaffer, Y. Cao, and W.T. Baumann, "A Framework to Analyze the Performance of Load Balancing Schemes for Ensembles of Stochastic Simulations", International Journal of Parallel Programming, 2014.
T.-H. Ahn and A. Sandu, "Implicit Second Order Weak Taylor Tau-Leaping Methods for the Stochastic Simulations of Chemical Kinetics", Procedia Computer Science, vol. 4, pp. 2297-2306, International Conference on Computational Science, ICCS 2011, 2011.
T.-H. Ahn and A. Sandu, "Fully Implicit Tau-Leaping Methods for the Stochastic Simulation of Chemical Kinetics", in Proceedings of the 19th High Performance Computing Symposium (HPC 2011) part of the 2011 Spring Simulation Multiconference, ser. SpringSim'11, Boston, MA, USA: Society for Computer Simulation International, 2011.
T.-H. Ahn and A. Sandu, "Parallel Stochastic Simulations of Budding Yeast Cell Cycle: Load Balancing Strategies and Theoretical Analysis", in Proceedings of the First ACM International Conference on Bioinformatics and Computational Biology, ser. BCB '10, New York, NY, USA: ACM, pp. 237-246, 2010.
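For readers unfamiliar with the stochastic simulation algorithm (SSA) referenced in this project, the following is a minimal Python sketch of the standard explicit direct method (Gillespie's algorithm). It is not the implicit or tau-leaping variants developed in the papers above, and it is not the project's Fortran 90 code. The function name, its arguments, and the one-species decay example are hypothetical, chosen only for illustration.

```python
import math
import random


def gillespie_ssa(x0, stoich, propensity, t_end, rng):
    """Simulate one trajectory with the direct-method SSA.

    x0         -- initial molecule counts, one entry per species
    stoich     -- stoich[j] is the state-change vector of reaction j
    propensity -- function mapping a state to a list of reaction propensities
    t_end      -- simulation end time
    rng        -- a random.Random instance (passed in for reproducibility)
    Returns a list of (time, state) tuples.
    """
    t, x = 0.0, list(x0)
    trajectory = [(t, tuple(x))]
    while t < t_end:
        a = propensity(x)
        a0 = sum(a)
        if a0 <= 0.0:
            break  # no reaction can fire any more
        # Waiting time to the next reaction is exponential with rate a0.
        t += -math.log(1.0 - rng.random()) / a0
        if t > t_end:
            break
        # Pick reaction j with probability a[j] / a0.
        r = rng.random() * a0
        j, acc = 0, a[0]
        while acc <= r and j < len(a) - 1:
            j += 1
            acc += a[j]
        x = [xi + d for xi, d in zip(x, stoich[j])]
        trajectory.append((t, tuple(x)))
    return trajectory


if __name__ == "__main__":
    # Hypothetical example: first-order decay A -> (nothing) with rate c = 1.0.
    c = 1.0
    traj = gillespie_ssa(
        x0=[50],
        stoich=[(-1,)],
        propensity=lambda x: [c * x[0]],
        t_end=100.0,
        rng=random.Random(0),
    )
    print(len(traj), traj[-1])
```

Because each trajectory is independent, an ensemble of such runs can be distributed across processors, which is where the load-balancing question studied in this project arises: run times differ from trajectory to trajectory.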

My role in the project was to develop algorithms and simulate the cell cycle model with stochastic methods.
Co-developer (Java, Matlab)
T.-H. Ahn, L. T. Watson, Y. Cao, C. A. Shaffer, and W. T. Baumann, "Cell cycle modeling for budding yeast with stochastic simulation algorithms", Computer Modeling in Engineering and Sciences, vol. 51, no. 1, pp. 27-52, 2009.
T.-H. Ahn, P. Wang, L.T. Watson, Y. Cao, C.A. Shaffer, and W.T. Baumann, "Stochastic cell cycle modeling for budding yeast", in Proceedings of the 2009 Spring Simulation Multiconference, ser. SpringSim '09, San Diego, CA, USA: Society for Computer Simulation International, pp. 113:1-113:6, 2009.
T.-H. Ahn, Y. Cao, and L.T. Watson, "Stochastic Simulation Algorithms for Chemical Reactions", in Proceedings of the 2008 International Conference on Bioinformatics & Computational Biology, BIOCOMP'08, Las Vegas, Nevada, USA, pp. 431-436, July 2008.

Finding Missing Genes of Aedes aegypti, the Yellow Fever Mosquito, with NGS RNA-Seq and Bioinformatics

Finding missing genes (new and mis-annotated genes) of dangerous mosquito species from deep-sequencing NGS transcriptome data using bioinformatics software tools.
- RNA-Seq, Perl, R, GBrowse, MySQL

MATLAB on the HPC Grid: Maximizing and Optimizing the Capability in Modeling and Simulation (2011 Summer Intern at Pfizer Research)

Computer-based modeling and simulation in drug development is becoming of the utmost importance for reducing the cost and time of making medicines at many large pharmaceutical companies. MathWorks' MATLAB is one of the most widely used tools for computational modeling in high-throughput quantitative biology. MATLAB provides built-in parallel computing products that take advantage of HPC resources with little code modification. Here, I investigated the computational efficiency of MATLAB in the HPC environment, parallelized several biological models, and ran the simulations in parallel to use the HPC resources efficiently.

- MATLAB Parallel Computing Toolbox, Distributed Computing Server

Macroscale Simulator Evaluation (2010 Summer Intern at Sandia National Laboratories)

I investigated a massively parallel genomic search application, mpiBLAST, on a macroscale simulator (SST/macro) that supports distributed-memory applications, especially those using the Message Passing Interface (MPI).

Main developer (C++, Python, MPI, OpenMP, Supercomputer)

D. Dechev and T.-H. Ahn, "Using SST/macro for Effective Analysis of MPI-based Applications: Evaluating Large-Scale Genomic Sequence Search", IEEE Access, vol. 1, pp. 428-435, 2013.

T.-H. Ahn, D. Dechev, H. Lin, H. Adalsteinsson and C. Janssen, "Evaluating Performance Optimizations of Large-Scale Genomic Sequence Search Applications Using SST/macro", in Proceedings of International Conference on Simulation and Modeling Methodologies, Technologies and Applications (SIMULTECH 2011), Noordwijkerhout, Netherlands, 2011.