Leveraging the Power of Big Data Tools for Large Scale Molecular Dynamics Analysis

Figure: MDReduce system architecture.

Abstract

Parallel Molecular Dynamics simulations generate atom trajectories of growing size and complexity. Analyzing these trajectories is computationally expensive and time-consuming. One of the main reasons is the lack of tools that let computational biologists implement an analysis easily while reducing processing time by exploiting parallel computer architectures. In this paper, we compare two parallel analytics frameworks based on the Map/Reduce paradigm: HiMach, a dedicated MPI-based framework for trajectory analysis, and Flink, a Big Data analytics framework. Both frameworks hide the complexity of writing parallel code from the programmer and provide significant performance gains over sequential execution. These two frameworks are the core components of the MDReduce system, which aims to simplify the creation of parallel Molecular Dynamics analysis code.

Publication
In Jornadas Sarteco 2016 (XXVII Jornadas de Paralelismo (JP2016)).