DATA PARALLEL COMPUTATION OF EUCLIDEAN DISTANCE TRANSFORMS

1992 ◽  
Vol 02 (04) ◽  
pp. 331-339 ◽  
Author(s):  
TERRY BOSSOMAIER ◽  
NATALINA ISIDORO ◽  
ADRIAN LOEFF

The Euclidean Distance Transform is an important, but computationally expensive, technique of computational geometry, with applications in many areas including image processing, graphics and pattern recognition. Since the data sets used are typically large, one might hope that parallel computers would be suitable for its determination. We show that existing parallel algorithms perform poorly on certain data sets and introduce new strategies. These achieve high speed on diverse data sets, but fail occasionally in pathological cases. We determine the maximum error in such cases and demonstrate that it is satisfactorily low. Although adequate efficiency is achievable on SIMD machines, we demonstrate that problems of this kind are data parallel yet best suited to MIMD architectures.
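
The brute-force definition that such algorithms parallelize is simple to state. A minimal NumPy sketch (names illustrative, not from the paper; real implementations, and the parallel strategies discussed above, avoid this quadratic cost):

```python
# Brute-force Euclidean Distance Transform: for each pixel, the distance
# to the nearest foreground pixel. O(pixels x features) -- for reference only.
import numpy as np

def brute_force_edt(mask: np.ndarray) -> np.ndarray:
    fg = np.argwhere(mask)                      # coordinates of feature pixels
    ys, xs = np.indices(mask.shape)
    grid = np.stack([ys, xs], axis=-1)          # (H, W, 2) pixel coordinates
    # Distances from every pixel to every feature pixel, then the minimum.
    d = np.linalg.norm(grid[:, :, None, :] - fg[None, None, :, :], axis=-1)
    return d.min(axis=2)

mask = np.zeros((8, 8), dtype=bool)
mask[2, 3] = mask[6, 6] = True
print(brute_force_edt(mask).round(2))
```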

2006 ◽  
Vol 22 (8) ◽  
pp. 1004-1010 ◽  
Author(s):  
Andrei Hutanu ◽  
Gabrielle Allen ◽  
Stephen D. Beck ◽  
Petr Holub ◽  
Hartmut Kaiser ◽  
...  

Sensors ◽  
2021 ◽  
Vol 21 (5) ◽  
pp. 1816
Author(s):  
Hailun Xie ◽  
Li Zhang ◽  
Chee Peng Lim ◽  
Yonghong Yu ◽  
Han Liu

In this research, we propose two Particle Swarm Optimisation (PSO) variants to undertake feature selection tasks. The aim is to overcome two major shortcomings of the original PSO model, i.e., premature convergence and weak exploitation around near-optimal solutions. The first proposed PSO variant incorporates four key operations: a modified PSO operation with rectified personal and global best signals, spiral-search-based local exploitation, Gaussian-distribution-based swarm leader enhancement, and mirroring and mutation operations for worst-solution improvement. The second proposed PSO model enhances the first through four new strategies: an adaptive exemplar breeding mechanism incorporating multiple optimal signals, nonlinear-function-oriented search coefficients, exponential and scattering schemes for the swarm leader, and worst-solution enhancement. In comparison with a set of 15 classical and advanced search methods, the proposed models demonstrate statistical superiority for discriminative feature selection across a total of 13 data sets.
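
For reference, a minimal binary-PSO feature-selection baseline of the kind the two variants build on (standard sigmoid velocity update only; the paper's spiral search, leader enhancement, and exemplar breeding operators are not reproduced here, and all names and the toy objective are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)

def fitness(mask, X, y):
    """Toy objective: reward class separation on the selected features.
    A real wrapper would substitute a classifier's cross-validated accuracy."""
    if not mask.any():
        return -np.inf
    Xs = X[:, mask]
    mu0, mu1 = Xs[y == 0].mean(0), Xs[y == 1].mean(0)
    return np.linalg.norm(mu0 - mu1) - 0.01 * mask.sum()  # penalize subset size

def binary_pso(X, y, n_particles=20, iters=50, w=0.7, c1=1.5, c2=1.5):
    d = X.shape[1]
    pos = rng.random((n_particles, d)) < 0.5        # boolean feature masks
    vel = rng.normal(0.0, 1.0, (n_particles, d))
    pbest = pos.copy()
    pbest_f = np.array([fitness(p, X, y) for p in pos])
    gbest = pbest[pbest_f.argmax()].copy()
    for _ in range(iters):
        r1, r2 = rng.random((2, n_particles, d))
        vel = (w * vel
               + c1 * r1 * (pbest.astype(float) - pos.astype(float))
               + c2 * r2 * (gbest.astype(float) - pos.astype(float)))
        # Sigmoid transfer: velocity -> probability of selecting each feature.
        pos = rng.random((n_particles, d)) < 1.0 / (1.0 + np.exp(-vel))
        f = np.array([fitness(p, X, y) for p in pos])
        improved = f > pbest_f
        pbest[improved] = pos[improved]
        pbest_f[improved] = f[improved]
        gbest = pbest[pbest_f.argmax()].copy()
    return gbest
```

With `X` an (n_samples, n_features) array and `y` binary labels, `binary_pso(X, y)` returns a boolean mask of selected features.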


2011 ◽  
Vol 84 (8) ◽  
Author(s):  
Tracy Holsclaw ◽  
Ujjaini Alam ◽  
Bruno Sansó ◽  
Herbie Lee ◽  
Katrin Heitmann ◽  
...  

2011 ◽  
Vol 314-316 ◽  
pp. 1717-1720
Author(s):  
Li Du ◽  
Wei Wang ◽  
Zhi Yong Song ◽  
Jie Xiong Ding

Thin-walled parts are widely used in aerospace engineering. Because of their complex behavior under load and their tight shape tolerances, they are difficult to manufacture on high-speed machines. To better understand the manufacturing process, the characteristics of aviation parts in high-speed machining are investigated. Error sources on parts are classified, and the maximum and dynamic errors are studied with respect to their main influence factors, such as cutting force and vibration. Finally, a practical cutting-test method is proposed that can be used to observe and control the dynamic accuracy of aviation parts and ensure effective manufacture.


2017 ◽  
Vol 6 (2) ◽  
pp. 12
Author(s):  
Abhith Pallegar

The objective of this paper is to elucidate how interconnected biological systems can be better mapped and understood using the rapidly growing area of Big Data. We can harness network efficiencies by analyzing diverse medical data, and we probe how to effectively lower the economic cost of finding cures for rare diseases. Most rare diseases are due to genetic abnormalities, and many forms of cancer develop due to genetic mutations. Finding cures for rare diseases requires us to understand the biology and biological processes of the human body. In this paper, we explore what the historical shift of focus from pharmacology to biotechnology means for accelerating biomedical solutions. With biotechnology playing a leading role in the field of medical research, we explore how network efficiencies can be harnessed by strengthening the existing knowledge base. Studying rare or orphan diseases provides rich observable statistical data that can be leveraged to find solutions. Network effects can be extracted by working with diverse data sets that enable us to generate the highest-quality medical knowledge with the fewest resources. This paper examines gene-manipulation technologies such as Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) that can prevent genetically rooted diseases. We further explore the role of the emerging field of Big Data in analyzing large quantities of medical data, given the rapid growth of computing power, and some of the network efficiencies gained from this endeavor.


PLoS ONE ◽  
2021 ◽  
Vol 16 (8) ◽  
pp. e0255307
Author(s):  
Fujun Wang ◽  
Xing Wang

Feature selection is an important task in big data analysis and information retrieval processing. It reduces the number of features by removing noisy and extraneous data. In this paper, a feature subset selection algorithm based on damping-oscillation theory and a support vector machine classifier is proposed, called the Maximum Kendall coefficient Maximum Euclidean Distance Improved Grey Wolf Optimization algorithm (MKMDIGWO). In MKMDIGWO, first, a filter model based on the Kendall coefficient and Euclidean distance is proposed, which measures the correlation and redundancy of the candidate feature subset. Second, the wrapper model is an improved grey wolf optimization algorithm whose position-update formula has been improved to achieve better results. Third, the filter and wrapper models are dynamically balanced via damping-oscillation theory so that an optimal feature subset is found. MKMDIGWO thus achieves both the efficiency of the filter model and the high precision of the wrapper model. Experimental results on five UCI public data sets and two microarray data sets demonstrate that the classification accuracy of MKMDIGWO is higher than that of four other state-of-the-art algorithms; its maximum ACC value is at least 0.5% higher than theirs on 10 data sets.
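
A hedged sketch of the filter-model idea described above: score a candidate feature subset by Kendall-tau relevance to the label (to be maximized) and by pairwise Euclidean distance between standardized feature vectors (to be maximized, i.e., low redundancy). The exact criterion and weighting in MKMDIGWO may differ; all names here are illustrative.

```python
import numpy as np
from scipy.stats import kendalltau
from scipy.spatial.distance import pdist

def filter_score(X, y, subset, alpha=0.5):
    """Higher is better: features relevant to y and mutually distant."""
    feats = X[:, subset]
    # Relevance: mean absolute Kendall correlation of each feature with y.
    relevance = np.mean([abs(kendalltau(feats[:, j], y)[0])
                         for j in range(feats.shape[1])])
    # Redundancy term: standardize columns so distances are scale-free,
    # then average the pairwise Euclidean distance between feature vectors.
    Z = (feats - feats.mean(0)) / (feats.std(0) + 1e-12)
    separation = pdist(Z.T).mean() if feats.shape[1] > 1 else 0.0
    return alpha * relevance + (1 - alpha) * separation
```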


2021 ◽  
Vol 19 (3) ◽  
pp. 628-641
Author(s):  
F Faridah ◽  
Sentagi Utami ◽  
Ressy Yanti ◽  
S Sunarno ◽  
Emilya Nurjani ◽  
...  

This paper discusses an analysis to obtain optimal thermal sensor placement based on indoor thermal characteristics. The method relies on Computational Fluid Dynamics (CFD) simulation, varying the outdoor climate and the indoor air-conditioning (AC) system. First, alternative sensor positions are shortlisted with regard to ease of installation and occupant safety. Using Standardized Euclidean Distance (SED) analysis of the distribution of thermal-parameter values in the activity zones, the best of these positions are then selected. On-site measurement validated the CFD model, with maximum root mean square errors (RMSE) between the two data sets of 0.8°C for temperature, 3.5% for relative humidity, and 0.08 m/s for air velocity, attributable to the significant effect of the building location. The SED analysis yields the sensor positions that most accurately and consistently represent, with the best coverage, the thermal conditions at the 1.1 m floor level. Actual sensors installed at the optimal positions were shown to detect the thermal variables at a height of 1.1 m, with SED validation values of 2.5±0.3, 2.2±0.6, and 2.0±1.1 for R15, R33, and R40, respectively.
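
For clarity, the Standardized Euclidean Distance weights each variable by the inverse of its variance, so temperature, humidity, and air velocity contribute on comparable scales. A minimal sketch with synthetic readings (values and names are illustrative assumptions, not the paper's data):

```python
import numpy as np
from scipy.spatial.distance import seuclidean

# Rows: observations of [temperature degC, RH %, air speed m/s] in one zone.
zone = np.array([[26.1, 62.0, 0.11],
                 [25.8, 60.5, 0.09],
                 [26.4, 63.2, 0.12]])
candidate = np.array([26.0, 61.0, 0.10])   # reading at a candidate position

var = zone.var(axis=0, ddof=1)             # per-variable variance for scaling
sed = seuclidean(candidate, zone.mean(axis=0), var)
print(f"SED between candidate and zone mean: {sed:.2f}")
```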


2020 ◽  
Author(s):  
Annika Tjuka ◽  
Robert Forkel ◽  
Johann-Mattis List

Psychologists and linguists have collected a great diversity of data on word and concept properties. In psychology, many studies accumulate norms and ratings, such as word frequencies or age of acquisition, often for a large number of words. Linguistics, on the other hand, provides valuable insights into relations between word meanings. We present a collection of such data sets for norms, ratings, and relations covering different languages: ‘NoRaRe.’ To enable comparison between the diverse data types, we established workflows that facilitate the expansion of the database. A web application allows convenient access to the data (https://digling.org/norare/). Furthermore, a software API ensures consistent data curation by providing tests to validate the data sets. The NoRaRe collection is linked to the database curated by the Concepticon project (https://concepticon.clld.org), which offers a reference catalog of unified concept sets. The link between words in the data sets and the Concepticon concept sets makes cross-linguistic comparison possible. In three case studies, we test the validity of our approach, the accuracy of our workflow, and the applicability of our database. The results indicate that the NoRaRe database is suitable for studying word properties across multiple languages. The data can be used by psychologists and linguists to benefit from the knowledge rooted in both research disciplines.


2014 ◽  
Vol 2 (1) ◽  
Author(s):  
Anne Dutfoy ◽  
Sylvie Parey ◽  
Nicolas Roche

In this paper, we provide a tutorial on multivariate extreme value methods, which make it possible to estimate the risk associated with rare events occurring jointly. We draw particular attention to issues related to extremal dependence, and we emphasize the asymptotic-independence feature. We apply multivariate extreme value theory to two data sets related to hydrology and meteorology: first, the joint flooding of two rivers, which puts at risk the facilities lying downstream of the confluence; then the joint occurrence of high-speed wind and low air temperatures, which might affect overhead lines.
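
A hedged sketch of the kind of analysis such a tutorial covers: fit univariate GEV margins to two annual-maximum series, then probe extremal dependence with the empirical coefficient chi(u) = P(V > u | U > u) on uniform margins, where chi tending to 0 as u approaches 1 suggests asymptotic independence. The data and names below are synthetic illustrations, not the paper's case-study rivers.

```python
import numpy as np
from scipy.stats import genextreme, rankdata

rng = np.random.default_rng(1)
n = 200
shared = rng.gumbel(size=n)                       # induces some dependence
flood_a = 100 + 10 * (shared + 0.5 * rng.gumbel(size=n))
flood_b = 80 + 8 * (shared + 0.5 * rng.gumbel(size=n))

# Marginal GEV fits (scipy's shape c is the negated xi convention).
for name, x in [("river A", flood_a), ("river B", flood_b)]:
    c, loc, scale = genextreme.fit(x)
    print(f"{name}: xi={-c:.2f}, loc={loc:.1f}, scale={scale:.1f}")

# Empirical tail dependence: rank-transform to uniform margins.
u = rankdata(flood_a) / (n + 1)
v = rankdata(flood_b) / (n + 1)
for q in (0.90, 0.95, 0.99):
    chi = np.mean(v[u > q] > q) if (u > q).any() else np.nan
    print(f"chi({q}) ~ {chi:.2f}")   # near 0 suggests asymptotic independence
```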


2005 ◽  
Vol 13 (4) ◽  
pp. 277-298 ◽  
Author(s):  
Rob Pike ◽  
Sean Dorward ◽  
Robert Griesemer ◽  
Sean Quinlan

Very large data sets often have a flat but regular structure and span multiple disks and machines. Examples include telephone call records, network logs, and web document repositories. These large data sets are not amenable to study using traditional database techniques, if only because they can be too large to fit in a single relational database. On the other hand, many of the analyses done on them can be expressed using simple, easily distributed computations: filtering, aggregation, extraction of statistics, and so on. We present a system for automating such analyses. A filtering phase, in which a query is expressed using a new procedural programming language, emits data to an aggregation phase. Both phases are distributed over hundreds or even thousands of computers. The results are then collated and saved to a file. The design – including the separation into two phases, the form of the programming language, and the properties of the aggregators – exploits the parallelism inherent in having data and computation distributed across many machines.
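
A minimal single-machine sketch of the two-phase pattern described above: a per-record filtering phase emits (key, value) pairs, and an aggregation phase combines them with a commutative, associative operator, so shards can be processed independently and merged in any order. Names are illustrative; this is not the paper's language or system.

```python
from collections import Counter
from concurrent.futures import ProcessPoolExecutor

def filter_phase(shard):
    """Filtering phase: emit one count per log record, keyed by country."""
    table = Counter()
    for record in shard:
        table[record["country"]] += 1           # emit country -> 1
    return table

def run(shards):
    """Aggregation phase: a sum aggregator merges partials in any order."""
    totals = Counter()
    with ProcessPoolExecutor() as pool:
        for partial in pool.map(filter_phase, shards):
            totals.update(partial)
    return totals

if __name__ == "__main__":
    shards = [[{"country": "us"}, {"country": "de"}],
              [{"country": "us"}, {"country": "fr"}]]
    print(run(shards))   # Counter({'us': 2, 'de': 1, 'fr': 1})
```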

