SP-Partitioner: A novel partition method to handle intermediate data skew in spark streaming

2018 ◽  
Vol 86 ◽  
pp. 1054-1063 ◽  
Author(s):  
Guipeng Liu ◽  
Xiaomin Zhu ◽  
Ji Wang ◽  
Deke Guo ◽  
Weidong Bao ◽  
...  
2020 ◽  
Vol 100 ◽  
pp. 102699
Author(s):  
Zhongming Fu ◽  
Zhuo Tang ◽  
Li Yang ◽  
Kenli Li ◽  
Keqin Li
Author(s):  
Chetana Tukkoji ◽  
Seetharam K

There is a growing need for ad-hoc analysis of extremely large data sets, especially at web-based companies where innovation critically depends on being able to analyze terabytes of data collected every day. Parallel database products offer a solution, but are usually prohibitively expensive at this scale. Moreover, many of the people who analyze this data are procedural programmers. The success of the more procedural map-reduce programming model, and of its scalable implementations on low-cost commodity hardware, is evidence of this. However, the map-reduce paradigm is too low-level and rigid, and leads to a great deal of custom user code that is hard to maintain and reuse. Map-reduce is an effective tool for parallel data processing, but one significant issue in practical map-reduce applications is data skew: an imbalance in the amount of data assigned to each task causes some tasks to take much longer to finish than others. We therefore propose a framework that solves the data skew problem on the reduce side of map-reduce applications. It uses an innovative sampling method that produces an accurate approximation of the distribution of the intermediate data by sampling only a small fraction of it, and it does not prevent the overlap between the map and reduce stages.
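The sampling-based approach described above can be sketched roughly as follows: sample a small fraction of the map-side records to approximate the intermediate key distribution, then assign heavy keys greedily to the least-loaded reducer instead of relying on plain hash partitioning. This is a minimal illustrative sketch, not the paper's actual implementation; the function names and the greedy assignment strategy are assumptions.

```python
import random
from collections import Counter

def estimate_key_distribution(records, sample_fraction=0.01, seed=42):
    """Approximate the intermediate key distribution by sampling only a
    small fraction of the (key, value) map outputs (hypothetical helper)."""
    rng = random.Random(seed)
    sample = [key for key, _ in records if rng.random() < sample_fraction]
    return Counter(sample)

def build_partition_table(key_counts, num_reducers):
    """Greedily assign the heaviest keys to the currently least-loaded
    reducer, so a skewed key does not pile extra work onto one task."""
    loads = [0] * num_reducers
    table = {}
    for key, count in key_counts.most_common():
        target = loads.index(min(loads))  # least-loaded reducer so far
        table[key] = target
        loads[target] += count
    return table

def partition(key, table, num_reducers):
    """Route sampled keys via the table; fall back to hashing for
    keys the sample never saw."""
    return table.get(key, hash(key) % num_reducers)
```

For example, with a skewed input where one key dominates, the heavy key ends up alone on one reducer while the lighter keys share another, which is the balancing effect plain hash partitioning cannot guarantee.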


2013 ◽  
Vol 34 (9) ◽  
pp. 2078-2084 ◽  
Author(s):  
Yun-fei Wang ◽  
Du-yan Bi ◽  
De-qin Shi ◽  
Tian-jun Huang ◽  
Di Liu
