Biographical Sketch
I have over thirty years of research experience in experimental computer science and engineering in both academia and industry. I am a professor of computer science at Old Dominion University (ODU), Norfolk, VA. My primary interest is in performance and portability issues on emerging high-performance architectures for scientific computing and big data analytics. Previously, I was a research staff member at the IBM T.J. Watson Research Center, where I focused on developing optimized implementations of scientific kernels. My work was integrated into two IBM products: the Engineering and Scientific Subroutine Library (ESSL) and Parallel ESSL. I am currently collaborating with NASA Langley, Intel, AMD, and Fermilab on porting and optimizing large scientific codes for emerging high-performance architectures. In recognition of this work, Intel recently established a oneAPI Center of Excellence at ODU. I have obtained research funding from NSF, DTIC, DoD, Fermilab, Jefferson Laboratory, NASA, Los Alamos, AFRL, NRL, JTASC, Sun Microsystems, Intel, and IBM. I have supervised twelve Ph.D. theses and over thirty-five MS projects/theses, and have published over two hundred papers in refereed conference proceedings and journals.
Ph.D., 1987, Thesis: Efficient Systolic Architectures for Matrix, Signal Processing, and Graph Related Problems, Indian Institute of Technology, Delhi, India
Teaching (Recent Courses)
CS 795/895 High-Performance Computing on Emerging Architectures
CS 495/895 Introduction to Data Science with Python
CS 270 Introduction to Computer Architecture II
CS 170 Introduction to Computer Architecture I
Selected Publications
"PAGANI: a parallel adaptive GPU algorithm for numerical integration," Sakiotis, Ioannis, Kamesh Arumugam, Marc F. Paterno, Desh Ranjan, Balvsa Terzi'c and Mohammad Zubair, Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (2021), Pages 1-13
"Memory Optimizations for Sparse Linear Algebra on GPU Hardware,"Walden, M. Zubair, C. P. Stone and E. J. Nielsen, 2021 IEEE/ACM Workshop on Memory Centric High Performance Computing (MCHPC), 2021, pp. 25-32, doi: 10.1109/MCHPC54807.2021.00010
"Accelerating unstructured-grid CFD algorithms on NVIDIA and AMD GPUs,"P. Stone, A. Walden, M. Zubair and E. J. Nielsen, 2021 IEEE/ACM 11th Workshop on Irregular Applications: Architectures and Algorithms (IA3), 2021, pp. 19-26, doi: 10.1109/IA354616.2021.00010
"Performance and Portability of a Linear Solver Across Emerging Architectures," Aaron C. Walden, Mohammad Zubair, and Eric J. Nielsen, to be published in Proceeding of Seventh Workshop on Accelerator Programming Using Directives (WACCPD) at SC 20, LNCS, Springer, 2021
"Performance Portability Issues for a Large-Scale Computational Fluid Dynamics Application on Emerging High-Performance Architectures," Aaron C. Walden, Mohammad Zubair, and Eric J. Nielsen, Performance, Portability, and Productivity in HPC Workshop, September 2020
"BLAS extensions for algebraic pricing methods.Paolo Regondi,"Mohammad Zubair, and Claudio Albanese,
In Proceedings of the 2nd Workshop on Parallel Programming for Analytics Applications (PPAA 2015), San Francisco, CA, 2015.
"A Portable and Fast Stochastic Volatility Model Calibration Using Multi and Many-Core Processors," M. Dixon, J. Lotze and M. Zubair,
High Performance Computational Finance (WHPCF), 2014 Seventh Workshop on, New Orleans, LA, 2014.
"Maximal clique enumeration for large graphs on hadoop framework. Naga Shailaja Dasari," Desh Ranjan, and Zubair Mohammad. 2014. In Proceedings of the first workshop on Parallel programming for analytics applications (PPAA '14), Orlando, Florida
N. S. Dasari, D. Ranjan and M. Zubair, "ParK: An efficient algorithm for k-core decomposition on multicore processors," Big Data (Big Data), 2014 IEEE International Conference on, Washington, DC, 2014.
K. Arumugam, A. Godunov, D. Ranjan, B. Terzić, and M. Zubair, “An Efficient Deterministic Parallel Algorithm for Adaptive Multidimensional Numerical Integration on GPUs”, 2013 International Conference on Parallel Processing, Lyon, France, October 2013.
K. Arumugam, A. Godunov, D. Ranjan, B. Terzić, and M. Zubair, “A Memory-Efficient Algorithm for Adaptive Multidimensional Integration with Multiple GPUs”, IEEE International Conference on High Performance Computing (HiPC 2013), Hyderabad, India, December 2013.
"High Performance Implementation of an
Econometrics and Financial Application on GPUs", Michael Creel, and Mohammad Zubair, 5th Workshop on High Performance
Computational Finance (WHPCF’12) at SC12, 2012.
"Vertex Isoperimetric Parameter of a
Computation Graph", Desh Ranjan, and Mohammad Zubair, IJFCS, 2012.
"Upper and Lower Bounds
for Pebbling r Pyramids", Desh Ranjan, John E. Savage and Mohammad Zubair, Journal of Discrete Algorithms, published online December 7, 2011.
"Strong I/O lower bounds for binomial and FFT
computation graphs," Desh Ranjan, John Savage, and Mohammad Zubair. In Proceedings of the 17th annual international conference on
Computing and combinatorics (COCOON'11), Bin Fu and Ding-Zhu Du (Eds.). Springer-Verlag, Berlin, Heidelberg, 134-145.
"Solving Planted Motif Problem on GPU," GPUScA 2010, Austria, September 2010.
"An efficient multicore
implementation of planted motif problem," Naga Shailaja Desari, Desh Ranjan, Mohammad Zubair, High Performance Computing and
Simulation (HPCS), 2010 International Conference on , vol., no., pp.9-15, June 28 2010-July 2 2010
"Cache-optimal algorithms for option pricing," John E. Savage and Mohammad Zubair, ACM Trans. Math. Softw. 37, 1, Article 7, January 2010
"Evaluating multicore algorithms on the unified memory model," John E. Savage and Mohammad Zubair, Sci. Program. 17, 4, December 2009, 295-308
"A Scalable Parallel Block Algorithm for Band Cholesky Factorization," R.C. Agarwal, F.G. Gustavson, M. Joshi, and M. Zubair, Proceedings of the Seventh SIAM Conference on Parallel Processing for Scientific Computing, CA, USA, pp.430-435, Feb. 1995.
"An Efficient Parallel Algorithm for the 3-D NAS Parallel FFT Benchmark," R.C. Agarwal, F.G. Gustavson, and M. Zubair, Proceedings of IEEE Scalable High Performance Computing, pp.129-133, May 1994.
"A Very High Performance Algorithm for NAS EP Benchmark, R.C. Agarwal, F.G. Gustavson, and M. Zubair, High Performance Computing and Networking International Conference Proceeding, Munich, pp.164-169, April 1994.
“A High Performance Parallel Algorithm for 1-D FFT,” R.C. Agarwal, F.G. Gustavson, and M. Zubair, Proceedings of IEEE Supercomputing '94, Washington, DC, pp.14-18, November 1994.