A new fuzzy optimal data replication method for data grid


Zeinab Ghilavizadeh, Seyed Javad Mirabedini and Ali Harounabadi


Many applications now work with very large data sets, which have become an important shared resource across scientific fields. These applications often handle terabytes or even petabytes of information, and many scientists use data that is geographically distributed around the world through advanced computing systems. The sheer volume of data and computation has created new problems in accessing, processing, and distributing data, and managing the data infrastructure has become very difficult given the large data volumes, the geographic dispersion, and the complicated calculations involved. Data grids address these problems. This paper introduces a new method for dynamic optimal data replication in data grids that reduces total job execution time and increases the locality of accesses by identifying and accounting for the factors that influence data replication. The proposed method consists of two main phases. The first phase covers file requests and the replication operation: we evaluate three factors influencing data replication and determine whether the requested file should be replicated or accessed remotely. In the second phase, the replacement phase, the method checks whether the destination has enough space to store the requested file. In this phase, the method also selects the replica with the lowest value for deletion, based on three replica factors, in order to improve system performance. Simulation results indicate that the proposed method performs better than the other replication methods provided in the OptorSim simulator.
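The two phases described above can be sketched roughly as follows. This is a minimal illustration, not the authors' algorithm: the abstract does not name the three factors or the fuzzy rules, so the `factors` tuple, the `value` function (a plain mean standing in for the fuzzy evaluation), the `threshold` parameter, and the `Replica` type are all hypothetical placeholders.

```python
from dataclasses import dataclass


@dataclass
class Replica:
    name: str
    size: int        # storage units (hypothetical)
    factors: tuple   # three scores in [0, 1]; placeholders for the paper's factors


def value(factors):
    """Hypothetical stand-in for the fuzzy evaluation: the mean of the scores."""
    return sum(factors) / len(factors)


def should_replicate(factors, threshold=0.5):
    """Phase 1: replicate locally if the combined score reaches the threshold,
    otherwise the file would be accessed remotely."""
    return value(factors) >= threshold


def make_room(stored, capacity, needed):
    """Phase 2: evict the lowest-value replicas until `needed` units are free.

    Assumes the stored replicas already fit within `capacity`.
    Returns (kept, evicted), or None if the file exceeds the total capacity.
    """
    if needed > capacity:
        return None
    kept = sorted(stored, key=lambda r: value(r.factors))  # lowest value first
    evicted = []
    free = capacity - sum(r.size for r in kept)
    while free < needed:
        victim = kept.pop(0)
        evicted.append(victim)
        free += victim.size
    return kept, evicted


stored = [
    Replica("a", 40, (0.9, 0.8, 0.7)),
    Replica("b", 30, (0.1, 0.2, 0.3)),
    Replica("c", 20, (0.5, 0.5, 0.5)),
]
result = make_room(stored, capacity=100, needed=35)
```

In this example the storage holds 90 of 100 units, so a 35-unit file does not fit until the lowest-value replica is removed. A real implementation would replace `value` with the paper's fuzzy inference over its actual factors.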


DOI: j.msl.2013.01.028

Keywords: Data Grid, Dynamic data replication, Data replacement, Replica, Fuzzy Logic

How to cite this paper:

Ghilavizadeh, Z., Mirabedini, S., & Harounabadi, A. (2013). A new fuzzy optimal data replication method for data grid. Management Science Letters, 3(3), 927-936.

