Document Type

Dissertation

Degree

Doctor of Philosophy (PhD)

Major/Program

Computer Science

First Advisor's Name

Ming Zhao

First Advisor's Committee Title

Committee chair

Second Advisor's Name

Raju Rangaswami

Second Advisor's Committee Title

Committee member

Third Advisor's Name

Jason Liu

Third Advisor's Committee Title

Committee member

Fourth Advisor's Name

Gang Quan

Fourth Advisor's Committee Title

Committee member

Fifth Advisor's Name

Deng Pan

Fifth Advisor's Committee Title

Committee member

Sixth Advisor's Name

Seetharami Seelam

Sixth Advisor's Committee Title

Committee member

Keywords

Big data, Storage management, I/O scheduler, Performance

Date of Defense

3-18-2016

Abstract

Computing systems are becoming increasingly data-intensive because of the explosion of data and the need to process it, and storage management is critical to application performance in such systems. However, existing resource management frameworks in these systems lack support for storage management, which causes unpredictable performance degradation when applications contend for I/O. Storage management of data-intensive systems is challenging because I/O resources cannot be easily partitioned and distributed storage systems require scalable management. This dissertation presents solutions to these challenges for typical data-intensive systems, including high-performance computing (HPC) systems and big-data systems.

For HPC systems, the dissertation presents vPFS, a performance virtualization layer for parallel file system (PFS) based storage systems. It employs user-level PFS proxies to interpose on and schedule parallel I/Os on a per-application basis. Based on this framework, it enables SFQ(D)+, a new proportional-share scheduling algorithm that lets diverse applications share storage with good performance isolation and resource utilization. To manage an HPC system’s total I/O service, it also provides two complementary synchronization schemes that coordinate scheduling across large numbers of storage nodes in a scalable manner.

For big-data systems, the dissertation presents IBIS, an interposition-based big-data I/O scheduler. By interposing on the different I/O phases of big-data applications, it schedules I/Os transparently to the applications. It enables a new proportional-share scheduling algorithm, SFQ(D2), which addresses the dynamics of the underlying storage by adaptively adjusting the I/O concurrency. Moreover, it employs a scalable broker to coordinate the distributed I/O schedulers and provide proportional sharing of a big-data system’s total I/O service.

Experimental evaluations show that these solutions have low overhead and provide strong I/O performance isolation. For example, vPFS’ throughput overhead is less than 3%, and it delivers proportional sharing within 96% of the target for diverse workloads; IBIS provides up to 99% better performance isolation for WordCount and 30% better proportional slowdown for TeraSort and TeraGen than native YARN.
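
The abstract names SFQ(D)+ and SFQ(D2) as proportional-share schedulers. As a rough illustration of the general idea only, the sketch below shows a textbook-style depth-controlled start-time fair queueing scheduler, which the names suggest as a starting point. It is not the dissertation's algorithm: the class `SFQD`, its method names, and the use of a unit-free request `cost` are all assumptions made for this example, and the dissertation's extensions (such as adaptive concurrency or cross-node coordination) are not modeled.

```python
import heapq
from collections import defaultdict


class SFQD:
    """Illustrative depth-controlled start-time fair queueing (hypothetical names)."""

    def __init__(self, weights, depth):
        self.weights = dict(weights)            # flow name -> share weight
        self.depth = depth                      # max requests outstanding at the storage
        self.vtime = 0.0                        # start tag of the last dispatched request
        self.last_finish = defaultdict(float)   # per-flow finish tag of the previous request
        self.queue = []                         # min-heap ordered by (start tag, arrival order)
        self.outstanding = 0
        self.seq = 0                            # arrival counter, breaks ties in FIFO order

    def submit(self, flow, cost):
        """Tag a request on arrival and queue it."""
        start = max(self.vtime, self.last_finish[flow])
        # A larger weight advances the flow's finish tag more slowly,
        # so that flow is scheduled proportionally more often.
        self.last_finish[flow] = start + cost / self.weights[flow]
        heapq.heappush(self.queue, (start, self.seq, flow, cost))
        self.seq += 1

    def dispatch(self):
        """Issue queued requests in start-tag order until `depth` are outstanding."""
        issued = []
        while self.queue and self.outstanding < self.depth:
            start, _, flow, cost = heapq.heappop(self.queue)
            self.vtime = start
            self.outstanding += 1
            issued.append((flow, cost))
        return issued

    def complete(self):
        """Record that one outstanding request has finished at the storage."""
        self.outstanding -= 1
```

In this sketch, `depth` trades isolation against utilization: a depth of 1 gives the scheduler full control over ordering, while a larger depth keeps the storage busier but lets already-issued requests from one flow delay another.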

Identifier

FIDC000251

Recommended Citation

Xu, Yiqi, "Storage Management of Data-intensive Computing Systems" (2016). FIU Electronic Theses and Dissertations. 2474.
https://digitalcommons.fiu.edu/etd/2474

Included in

Computer and Systems Architecture Commons, Data Storage Systems Commons

DOI

10.25148/etd.FIDC000251

Rights Statement

In Copyright. URI: http://rightsstatements.org/vocab/InC/1.0/
This Item is protected by copyright and/or related rights. You are free to use this Item in any way that is permitted by the copyright and related rights legislation that applies to your use. For other uses you need to obtain permission from the rights-holder(s).

FIU Electronic Theses and Dissertations

Storage Management of Data-intensive Computing Systems
