High Performance Computing

Faculty members in the high-end computing group are participating in projects that involve algorithm development for performance optimization in scientific computing, software synthesis for computational field simulations on high-end computing platforms, distributed interactive simulation frameworks, resource allocation on high-end computing platforms, and model-driven software architectures to support operating system design for real-time and secure systems.

High Performance Computing Faculty

Research Projects

High Performance Computing Lab

The High-Performance Computing Laboratory (HPCL), under the direction of Dr. Yoginder Dandass, conducts research in the areas outlined below.

Computer Security

In this area, researchers in the HPCL, in collaboration with researchers in the Center for Computer Security Research (CCSR), are investigating techniques for implementing intrusion and anomaly detection algorithms in hardware. The main goal is to reduce the overhead of security codes in high-performance parallel clusters. In one project funded by the National Science Foundation (NSF), researchers are developing dynamic, multi-resolution sensors that adapt to the current threat assessment. In another internally funded project, researchers are investigating techniques for detecting viruses and worms at the level of switches and routers.

Real-time Scheduling

In this area, researchers are working to schedule complex computations for embedded computing so as to maximize resource utilization under constraints on power, weight, volume, and heat generation. The targeted systems include mobile augmented reality systems that are incorporated into the clothing of users. These systems provide real-time enhancement of situational awareness in several applications (e.g., urban combat and firefighting). Currently, these systems typically dedicate processors to individual subsystems, which can lead to inefficiencies because the dedicated processors are underutilized. The main goal of this research is to schedule the various tasks onto a number of shared processors so that the available computing resources are used efficiently despite variable computing demands from the various subsystems.
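
As a rough illustration of why pooling processors can help, the sketch below assigns subsystem loads to a small shared pool using a greedy longest-processing-time-first rule. It is not the laboratory's scheduler: the subsystem loads (expressed as a percentage of one processor) and the function name `lpt_assign` are hypothetical, and the sketch ignores deadlines, power, and the other real-time constraints mentioned above.

```python
import heapq


def lpt_assign(task_loads, num_processors):
    """Greedy longest-processing-time-first assignment.

    Places each load, largest first, on the currently least-loaded
    processor and returns the total load per processor.
    """
    totals = [0] * num_processors
    # Heap of (current load, processor index); least-loaded processor on top.
    heap = [(0, p) for p in range(num_processors)]
    heapq.heapify(heap)
    for load in sorted(task_loads, reverse=True):
        total, p = heapq.heappop(heap)
        totals[p] = total + load
        heapq.heappush(heap, (totals[p], p))
    return totals


if __name__ == "__main__":
    # Hypothetical subsystem demands, in percent of one processor.
    loads = [60, 70, 30, 20, 20]
    dedicated_utilization = sum(loads) / len(loads)  # one processor each
    shared = lpt_assign(loads, 3)                    # three shared processors
    shared_utilization = sum(shared) / len(shared)
    print("dedicated:", dedicated_utilization, "shared:", shared, shared_utilization)
```

With these assumed loads, five dedicated processors average 40% utilization, while three shared processors carry the same work without any one of them exceeding its capacity.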

The Loci Framework

This research project focuses on software synthesis tools for building high performance software. Developing high performance software for modern massively parallel clusters is challenging. This research project seeks to simplify the process through automatic assembly of complex applications from simple high-level specifications. This approach enables features such as automatic extraction of parallelism for massively parallel architectures, as well as automated checks for the logical consistency of numerical models. This work is supported by NASA and NSF and is currently in use by NASA engineers in the design of next generation space transportation systems.

The CHEM Code

This research project seeks to develop a state-of-the-art multidisciplinary simulation system using the Loci simulation synthesis framework. Currently, this code is able to perform simulations of combustion problems involving non-equilibrium chemistry (where chemical reactions happen over finite intervals of time). We have also performed multidisciplinary simulations (enabled by the Loci framework) that couple the fluid mechanics of rocket combustion problems with thermal stress solid mechanics analysis in one unified parallel framework. Currently, this system is being used by NASA engineers to develop next generation space transportation systems.

Dynamic Scheduling & Load Balancing Algorithm Development for Performance Optimization in Scientific Computing

Goal

Problems in scientific computing are, in general, large, irregular, and computationally intensive. Load imbalance is one of the main causes of performance degradation for scientific applications running in the heterogeneous environments often used in cluster and grid computing. Our research objective is to improve the performance of parallel algorithms for applications in science and engineering, so that the simulations of physical phenomena they support yield tractable and accurate predictions. Our specific goal is to advance the state of the art in dynamic scheduling and load balancing algorithms for improving the scalability and performance of parallel applications in scientific computing. Our research efforts are therefore directed towards the analysis and development of these algorithms on both theoretical and experimental grounds.

Our research activities focus on the development of algorithms, techniques, and tools that address load imbalance caused by the unpredictable behavior of simulations, such as irregularities arising from problem characteristics, algorithms, and software environments. The development of these algorithms is essential, especially for applications characterized by highly irregular behavior or by continuous, dynamic change, where none of the existing techniques accommodates their unpredictable behavior. In addition to developing the theoretical foundations of dynamic scheduling and load balancing algorithms, our research group develops tools for integrating these algorithms into applications, as well as into run-time environments. Our research activities are supported by grants from the National Science Foundation (NSF-CAREER, ITR, and others).

Another important goal of our work is the performance analysis, evaluation, and prediction (from both analytical and experimental perspectives) of parallel applications that use these novel techniques while running in heterogeneous environments.

Methodology

Since loops are an important source of parallelism in scientific computing, the development of our dynamic scheduling and load balancing algorithms is based on theoretical advances in loop scheduling. The techniques developed are based on probabilistic analyses. Over time, we have developed a number of adaptive algorithms that can address a wide range of irregularities. From a historical perspective, the algorithms developed in recent years are more general, and therefore more robust, than the ones developed earlier, since some of the theoretical constraints used in modeling have been progressively relaxed. The theoretical results of this work are significant, especially to the research community interested in scalability analysis of parallel applications.
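
To make the idea of dynamic loop scheduling concrete, the sketch below compares a static block partition with guided self-scheduling, a well-known textbook rule in which each idle worker takes roughly 1/P of the remaining iterations. This is a generic illustration, not one of the group's algorithms: the function names, the per-iteration costs, and the simple simulation model (no scheduling overhead, identical workers) are all assumptions made for the example.

```python
import heapq


def static_chunks(total_iterations, num_workers):
    """Split the loop into one contiguous, near-equal block per worker."""
    base, extra = divmod(total_iterations, num_workers)
    return [base + (1 if i < extra else 0) for i in range(num_workers)]


def guided_chunks(total_iterations, num_workers):
    """Guided self-scheduling: each chunk is ceil(remaining / num_workers)."""
    chunks = []
    remaining = total_iterations
    while remaining > 0:
        size = (remaining + num_workers - 1) // num_workers  # integer ceiling
        chunks.append(size)
        remaining -= size
    return chunks


def makespan(costs, chunks, num_workers):
    """Simulate workers that grab the next chunk as soon as they are idle.

    costs[i] is the (assumed) execution time of iteration i; chunks are
    handed out in loop order. Returns the finishing time of the last worker.
    """
    free_at = [0] * num_workers  # time at which each worker becomes idle
    heapq.heapify(free_at)
    start = 0
    for size in chunks:
        t = heapq.heappop(free_at)  # earliest idle worker
        t += sum(costs[start:start + size])
        start += size
        heapq.heappush(free_at, t)
    return max(free_at)


if __name__ == "__main__":
    # Hypothetical irregular loop: the last quarter of iterations costs 4x more.
    costs = [1] * 75 + [4] * 25
    workers = 4
    print("static:", makespan(costs, static_chunks(len(costs), workers), workers))
    print("guided:", makespan(costs, guided_chunks(len(costs), workers), workers))
```

For this assumed workload the static partition finishes at time 100, because one worker receives all the expensive iterations, while the guided rule finishes at time 45, close to the ideal of 175/4. Guided self-scheduling is less effective when the expensive iterations come first, which is one motivation for adaptive techniques of the kind described above.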

Another component of our group's research work is the development of application programming interfaces, libraries, and tools, to facilitate the integration of these newly developed techniques into parallel applications running in heterogeneous environments. Furthermore, the dynamic scheduling techniques are integrated into runtime systems to ensure load balancing at both task and data parallel levels.

A further research initiative of our group is the use of machine learning to optimize how our dynamic scheduling algorithms are applied within parallel applications. This is expected to speed the adoption of our research in the emerging field of autonomic computing.