Techniques for Efficient Execution of Large-Scale Scientific Workflows in Distributed Environments
Doctor of Philosophy (PhD)
First Advisor's Name
S. Masoud Sadjadi
First Advisor's Committee Title
Second Advisor's Name
Second Advisor's Committee Title
Third Advisor's Name
Third Advisor's Committee Title
Fourth Advisor's Name
Fourth Advisor's Committee Title
Scientific workflow, Workflow orchestration, Decentralized orchestration, Run-time adaptation, Multi-site workflow orchestration, Optimization patterns
Date of Defense
Scientific exploration demands heavy usage of computational resources for large-scale and deep analysis in many different fields. The complexity or the sheer scale of the computational studies can sometimes be encapsulated in the form of a workflow that is made up of numerous dependent components. Due to its decomposable and parallelizable nature, different components of a scientific workflow may be mapped over a distributed resource infrastructure to reduce time to results. However, the resource infrastructure may be heterogeneous, dynamic, and under diverse administrative control. Workflow management tools are utilized to help manage and deal with various aspects in the lifecycle of such complex applications. One particular and fundamental aspect that has to be dealt with as smooth and efficient as possible is the run-time coordination of workflow activities (i.e. workflow orchestration). Our efforts in this study are focused on improving the workflow orchestration process in such dynamic and distributed resource environments. We tackle three main aspects of this process and provide contributions in each of them. Our first contribution involves increasing the scalability and site autonomy in situations where the mapped components of a workflow span across several heterogeneous administrative domains. We devise and implement a generic decentralization framework for orchestration of workflows under such conditions. Our second contribution is involved with addressing the issues that arise due to the dynamic nature of such environments. We provide generic adaptation mechanisms that are highly transparent and also substantially less intrusive with respect to the rest of the workflow in execution. Our third contribution is to improve the efficiency of orchestration of large-scale parameter-sweep workflows. By exploiting their specific characteristics, we provide generic optimization patterns that are applicable to most instances of such workflows. We also discuss implementation issues and details that arise as we provide our contributions in each situation.
Kalayci, Selim, "Techniques for Efficient Execution of Large-Scale Scientific Workflows in Distributed Environments" (2014). FIU Electronic Theses and Dissertations. 1664.
Computational Engineering Commons, Computer and Systems Architecture Commons, Other Computer Sciences Commons, Software Engineering Commons, Systems Architecture Commons
In Copyright. URI: http://rightsstatements.org/vocab/InC/1.0/
This Item is protected by copyright and/or related rights. You are free to use this Item in any way that is permitted by the copyright and related rights legislation that applies to your use. For other uses you need to obtain permission from the rights-holder(s).