High Performance Multidimensional Analysis and Data Mining

Author(s):  
S. Goil ◽  
A. Choudhary

Predictive modelling is a mathematical technique which uses Statistics for prediction, due to the rapid growth of data over the cloud system, data mining plays a significant role. Here, the term data mining is a way of extracting knowledge from huge data sources where it’s increasing the attention in the field of medical application. Specifically, to analyse and extract the knowledge from both known and unknown patterns for effective medical diagnosis, treatment, management, prognosis, monitoring and screening process. But the historical medical data might include noisy, missing, inconsistent, imbalanced and high dimensional data.. This kind of data inconvenience lead to severe bias in predictive modelling and decreased the data mining approach performances. The various pre-processing and machine learning methods and models such as Supervised Learning, Unsupervised Learning and Reinforcement Learning in recent literature has been proposed. Hence the present research focuses on review and analyses the various model, algorithm and machine learning technique for clinical predictive modelling to obtain high performance results from numerous medical data which relates to the patients of multiple diseases.


2018 ◽  
Vol 7 (4.6) ◽  
pp. 13
Author(s):  
Mekala Sandhya ◽  
Ashish Ladda ◽  
Dr. Uma N Dulhare ◽  
. . ◽  
. .

In this generation of Internet, information and data are growing continuously. Even though various Internet services and applications. The amount of information is increasing rapidly. Hundred billions even trillions of web indexes exist. Such large data brings people a mass of information and more difficulty discovering useful knowledge in these huge amounts of data at the same time. Cloud computing can provide infrastructure for large data. Cloud computing has two significant characteristics of distributed computing i.e. scalability, high availability. The scalability can seamlessly extend to large-scale clusters. Availability says that cloud computing can bear node errors. Node failures will not affect the program to run correctly. Cloud computing with data mining does significant data processing through high-performance machine. Mass data storage and distributed computing provide a new method for mass data mining and become an effective solution to the distributed storage and efficient computing in data mining. 


Author(s):  
Antonio Congiusta ◽  
Domenico Talia ◽  
Paolo Trunfio

Knowledge discovery is a compute and data intensive process that allows for finding patterns, trends, and models in large datasets. The Grid can be effectively exploited for deploying knowledge discovery applications because of the high-performance it can offer and its distributed infrastructure. For effective use of Grids in knowledge discovery, the development of middleware is critical to support data management, data transfer, data mining and knowledge representation. To such purpose, we designed the Knowledge Grid, a high-level environment providing for Grid-based knowledge discovery tools and services. Such services allow users to create and manage complex knowledge discovery applications, composed as workflows that integrate data sources and data mining tools provided as distributed Grid services. This chapter describes the Knowledge Grid architecture and describes how its components can be used to design and implement distributed knowledge discovery applications. Then, the chapter describes how the Knowledge Grid services can be made accessible using the Open Grid Services Architecture (OGSA) model.


Author(s):  
Aydın Buluç ◽  
John R Gilbert

This paper presents a scalable high-performance software library to be used for graph analysis and data mining. Large combinatorial graphs appear in many applications of high-performance computing, including computational biology, informatics, analytics, web search, dynamical systems, and sparse matrix methods. Graph computations are difficult to parallelize using traditional approaches due to their irregular nature and low operational intensity. Many graph computations, however, contain sufficient coarse-grained parallelism for thousands of processors, which can be uncovered by using the right primitives. We describe the parallel Combinatorial BLAS, which consists of a small but powerful set of linear algebra primitives specifically targeting graph and data mining applications. We provide an extensible library interface and some guiding principles for future development. The library is evaluated using two important graph algorithms, in terms of both performance and ease-of-use. The scalability and raw performance of the example applications, using the Combinatorial BLAS, are unprecedented on distributed memory clusters.


2013 ◽  
Vol 347-350 ◽  
pp. 2993-2997
Author(s):  
Yue Li ◽  
Ran Liu

With the popularity and development of the network, the support of the high-performance computer technology becomes increasingly important as the huge information storage and the convenience of Information retrieval function of the internet that attracts more and more people join the netizens team. Therefore, I proposed an Information Processing Platform based on the high performance data mining in order to improve the Internet mass information intelligence parallel processing functions and the integrated development of the systems information storage, management, integration, intelligence processing, data mining and utilization. The propose of this system is to provide certain references and guidance for the technology implementation and realization of the high performance and high efficiency network massive Information Processing Platform as on the one hand, I have analyzed the key technology of the implementation of the platform, on the other hand briefly introduced the implementation of the RDIDC.


Sign in / Sign up

Export Citation Format

Share Document