CLC number: TP393
On-line Access: 2011-11-30
Received: 2011-04-14
Revision Accepted: 2011-09-15
Crosschecked: 2011-11-04
Jian-zong Wang, Peter Varman, Chang-sheng Xie. Optimizing storage performance in public cloud platforms[J]. Journal of Zhejiang University Science C, 2011, 12(12): 951-964.
@article{Wang2011,
title="Optimizing storage performance in public cloud platforms",
author="Jian-zong Wang and Peter Varman and Chang-sheng Xie",
journal="Journal of Zhejiang University Science C",
volume="12",
number="12",
pages="951-964",
year="2011",
publisher="Zhejiang University Press & Springer",
doi="10.1631/jzus.C1100097"
}
%0 Journal Article
%T Optimizing storage performance in public cloud platforms
%A Jian-zong Wang
%A Peter Varman
%A Chang-sheng Xie
%J Journal of Zhejiang University SCIENCE C
%V 12
%N 12
%P 951-964
%@ 1869-1951
%D 2011
%I Zhejiang University Press & Springer
%DOI 10.1631/jzus.C1100097
TY - JOUR
T1 - Optimizing storage performance in public cloud platforms
A1 - Jian-zong Wang
A1 - Peter Varman
A1 - Chang-sheng Xie
JO - Journal of Zhejiang University Science C
VL - 12
IS - 12
SP - 951
EP - 964
SN - 1869-1951
Y1 - 2011
PB - Zhejiang University Press & Springer
DO - 10.1631/jzus.C1100097
ER -
Abstract: Cloud computing is an elastic computing model in which users lease computing and storage resources on demand from a remote infrastructure. It is gaining popularity because of its low cost, high reliability, and wide availability. With the emergence of public cloud storage platforms from providers such as Amazon, Microsoft, and Google, individual applications and enterprise storage are increasingly being deployed on clouds. However, a serious impediment to wider deployment is the relative lack of effective data management services. Our experiments, as well as industry reports, show that performance and service-level agreements (SLAs) cannot be guaranteed when data is served over public clouds. The relatively slow access to persistent data and the large variability in cloud storage I/O performance can significantly degrade data-intensive applications. This paper addresses I/O performance fluctuation on public cloud platforms and proposes CloudMW, a middleware layer between cloud storage and clients that provides storage services with better performance and SLA satisfaction. CloudMW integrates data virtualization, data chunking, caching, and replication to achieve more stable and predictable performance and to permit flexible sharing of storage among virtual machines (VMs). Experimental results on Amazon Web Services (AWS) show that CloudMW improves performance stability and helps provide better SLAs and data sharing for cloud storage.