DATA PARALLEL COMPUTATION OF EUCLIDEAN DISTANCE TRANSFORMS

1992 ◽  
Vol 02 (04) ◽  
pp. 331-339 ◽  
Author(s):  
TERRY BOSSOMAIER ◽  
NATALINA ISIDORO ◽  
ADRIAN LOEFF

The Euclidean Distance Transform is an important, but computationally expensive, technique of computational geometry, with applications in many areas including image processing, graphics and pattern recognition. Since the data sets used are typically large, one might hope that parallel computers would be suitable for its determination. We show that existing parallel algorithms perform poorly on certain data sets and introduce new strategies. These achieve high speed on diverse data sets, but fail occasionally in pathological cases. We determine the maximum error in such cases and demonstrate that it is satisfactorily low. Although adequate efficiency is achievable on SIMD machines, we demonstrate that problems of this kind are data parallel yet best suited to MIMD architectures.
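
The brute-force definition that such algorithms parallelize is simple to state. A minimal NumPy sketch (names illustrative, not from the paper; real implementations, and the parallel strategies discussed above, avoid this quadratic cost):

```python
# Brute-force Euclidean Distance Transform: for each pixel, the distance
# to the nearest foreground pixel. O(pixels x features) -- for reference only.
import numpy as np

def brute_force_edt(mask: np.ndarray) -> np.ndarray:
    fg = np.argwhere(mask)                      # coordinates of feature pixels
    ys, xs = np.indices(mask.shape)
    grid = np.stack([ys, xs], axis=-1)          # (H, W, 2) pixel coordinates
    # Distances from every pixel to every feature pixel, then the minimum.
    d = np.linalg.norm(grid[:, :, None, :] - fg[None, None, :, :], axis=-1)
    return d.min(axis=2)

mask = np.zeros((8, 8), dtype=bool)
mask[2, 3] = mask[6, 6] = True
print(brute_force_edt(mask).round(2))
```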

2006 ◽  
Vol 22 (8) ◽  
pp. 1004-1010 ◽  
Author(s):  
Andrei Hutanu ◽  
Gabrielle Allen ◽  
Stephen D. Beck ◽  
Petr Holub ◽  
Hartmut Kaiser ◽  
...  

Sensors ◽  
2021 ◽  
Vol 21 (5) ◽  
pp. 1816
Author(s):  
Hailun Xie ◽  
Li Zhang ◽  
Chee Peng Lim ◽  
Yonghong Yu ◽  
Han Liu

In this research, we propose two Particle Swarm Optimisation (PSO) variants to undertake feature selection tasks. The aim is to overcome two major shortcomings of the original PSO model, i.e., premature convergence and weak exploitation around near-optimal solutions. The first proposed PSO variant incorporates four key operations: a modified PSO operation with rectified personal and global best signals, spiral-search-based local exploitation, Gaussian-distribution-based swarm leader enhancement, and mirroring and mutation operations for worst-solution improvement. The second proposed PSO model enhances the first through four new strategies: an adaptive exemplar breeding mechanism incorporating multiple optimal signals, nonlinear-function-oriented search coefficients, exponential and scattering schemes for the swarm leader, and worst-solution enhancement. In comparison with a set of 15 classical and advanced search methods, the proposed models demonstrate statistical superiority for discriminative feature selection across a total of 13 data sets.
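
For reference, a minimal binary-PSO feature-selection baseline of the kind the two variants build on (standard sigmoid velocity update only; the paper's spiral search, leader enhancement, and exemplar breeding operators are not reproduced here, and all names and the toy objective are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)

def fitness(mask, X, y):
    """Toy objective: reward class separation on the selected features.
    A real wrapper would substitute a classifier's cross-validated accuracy."""
    if not mask.any():
        return -np.inf
    Xs = X[:, mask]
    mu0, mu1 = Xs[y == 0].mean(0), Xs[y == 1].mean(0)
    return np.linalg.norm(mu0 - mu1) - 0.01 * mask.sum()  # penalize subset size

def binary_pso(X, y, n_particles=20, iters=50, w=0.7, c1=1.5, c2=1.5):
    d = X.shape[1]
    pos = rng.random((n_particles, d)) < 0.5        # boolean feature masks
    vel = rng.normal(0.0, 1.0, (n_particles, d))
    pbest = pos.copy()
    pbest_f = np.array([fitness(p, X, y) for p in pos])
    gbest = pbest[pbest_f.argmax()].copy()
    for _ in range(iters):
        r1, r2 = rng.random((2, n_particles, d))
        vel = (w * vel
               + c1 * r1 * (pbest.astype(float) - pos.astype(float))
               + c2 * r2 * (gbest.astype(float) - pos.astype(float)))
        # Sigmoid transfer: velocity -> probability of selecting each feature.
        pos = rng.random((n_particles, d)) < 1.0 / (1.0 + np.exp(-vel))
        f = np.array([fitness(p, X, y) for p in pos])
        improved = f > pbest_f
        pbest[improved] = pos[improved]
        pbest_f[improved] = f[improved]
        gbest = pbest[pbest_f.argmax()].copy()
    return gbest
```

With `X` an (n_samples, n_features) array and `y` binary labels, `binary_pso(X, y)` returns a boolean mask of selected features.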


2011 ◽  
Vol 84 (8) ◽  
Author(s):  
Tracy Holsclaw ◽  
Ujjaini Alam ◽  
Bruno Sansó ◽  
Herbie Lee ◽  
Katrin Heitmann ◽  
...  

2011 ◽  
Vol 314-316 ◽  
pp. 1717-1720
Author(s):  
Li Du ◽  
Wei Wang ◽  
Zhi Yong Song ◽  
Jie Xiong Ding

Thin-walled parts are widely used in aerospace engineering. Because of their complex behavior under load and their tight shape tolerances, they are difficult to manufacture on high-speed machines. To better understand the manufacturing process, the characteristics of aviation parts in high-speed machining are investigated. Error sources on parts are classified, and the maximum and dynamic errors are studied with respect to their main influence factors, such as cutting force and vibration. Finally, a practical cutting-test method is proposed that can be used to observe and control the dynamic accuracy of aviation parts and ensure effective manufacture.


2017 ◽  
Vol 6 (2) ◽  
pp. 12
Author(s):  
Abhith Pallegar

The objective of this paper is to elucidate how interconnected biological systems can be better mapped and understood using the rapidly growing area of Big Data. We can harness network efficiencies by analyzing diverse medical data, and we probe how to effectively lower the economic cost of finding cures for rare diseases. Most rare diseases are due to genetic abnormalities, and many forms of cancer develop due to genetic mutations. Finding cures for rare diseases requires us to understand the biology and biological processes of the human body. In this paper, we explore what the historical shift of focus from pharmacology to biotechnology means for accelerating biomedical solutions. With biotechnology playing a leading role in the field of medical research, we explore how network efficiencies can be harnessed by strengthening the existing knowledge base. Studying rare or orphan diseases provides rich observable statistical data that can be leveraged to find solutions. Network effects can be extracted by working with diverse data sets that enable us to generate the highest-quality medical knowledge with the fewest resources. This paper examines gene-manipulation technologies such as Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) that can prevent genetically rooted diseases. We further explore the role of the emerging field of Big Data in analyzing large quantities of medical data, given the rapid growth of computing power, and some of the network efficiencies gained from this endeavor.


PLoS ONE ◽  
2021 ◽  
Vol 16 (8) ◽  
pp. e0255307
Author(s):  
Fujun Wang ◽  
Xing Wang

Feature selection is an important task in big data analysis and information retrieval processing. It reduces the number of features by removing noisy and extraneous data. In this paper, a feature subset selection algorithm based on damping-oscillation theory and a support vector machine classifier is proposed, called the Maximum Kendall coefficient Maximum Euclidean Distance Improved Grey Wolf Optimization algorithm (MKMDIGWO). In MKMDIGWO, first, a filter model based on the Kendall coefficient and Euclidean distance is proposed, which measures the correlation and redundancy of the candidate feature subset. Second, the wrapper model is an improved grey wolf optimization algorithm whose position-update formula has been improved to achieve better results. Third, the filter and wrapper models are dynamically balanced via damping-oscillation theory so that an optimal feature subset is found. MKMDIGWO thus achieves both the efficiency of the filter model and the high precision of the wrapper model. Experimental results on five UCI public data sets and two microarray data sets demonstrate that the classification accuracy of MKMDIGWO is higher than that of four other state-of-the-art algorithms; its maximum ACC value is at least 0.5% higher than theirs on 10 data sets.
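
A hedged sketch of the filter-model idea described above: score a candidate feature subset by Kendall-tau relevance to the label (to be maximized) and by pairwise Euclidean distance between standardized feature vectors (to be maximized, i.e., low redundancy). The exact criterion and weighting in MKMDIGWO may differ; all names here are illustrative.

```python
import numpy as np
from scipy.stats import kendalltau
from scipy.spatial.distance import pdist

def filter_score(X, y, subset, alpha=0.5):
    """Higher is better: features relevant to y and mutually distant."""
    feats = X[:, subset]
    # Relevance: mean absolute Kendall correlation of each feature with y.
    relevance = np.mean([abs(kendalltau(feats[:, j], y)[0])
                         for j in range(feats.shape[1])])
    # Redundancy term: standardize columns so distances are scale-free,
    # then average the pairwise Euclidean distance between feature vectors.
    Z = (feats - feats.mean(0)) / (feats.std(0) + 1e-12)
    separation = pdist(Z.T).mean() if feats.shape[1] > 1 else 0.0
    return alpha * relevance + (1 - alpha) * separation
```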


2021 ◽  
Vol 19 (3) ◽  
pp. 628-641
Author(s):  
F Faridah ◽  
Sentagi Utami ◽  
Ressy Yanti ◽  
S Sunarno ◽  
Emilya Nurjani ◽  
...  

This paper discusses an analysis to obtain optimal thermal sensor placement based on indoor thermal characteristics. The method relies on Computational Fluid Dynamics (CFD) simulation, varying the outdoor climate and the indoor air-conditioning (AC) system. First, alternative sensor positions are shortlisted with regard to ease of installation and occupant safety. Using Standardized Euclidean Distance (SED) analysis of the distribution of thermal-parameter values in the activity zones, the best of these positions are then selected. On-site measurement validated the CFD model, with maximum root mean square errors (RMSE) between the two data sets of 0.8°C for temperature, 3.5% for relative humidity, and 0.08 m/s for air velocity, attributable to the significant effect of the building location. The SED analysis yields the sensor positions that most accurately and consistently represent, with the best coverage, the thermal conditions at the 1.1 m floor level. Actual sensors installed at the optimal positions were shown to detect the thermal variables at a height of 1.1 m, with SED validation values of 2.5±0.3, 2.2±0.6, and 2.0±1.1 for R15, R33, and R40, respectively.
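
For clarity, the Standardized Euclidean Distance weights each variable by the inverse of its variance, so temperature, humidity, and air velocity contribute on comparable scales. A minimal sketch with synthetic readings (values and names are illustrative assumptions, not the paper's data):

```python
import numpy as np
from scipy.spatial.distance import seuclidean

# Rows: observations of [temperature degC, RH %, air speed m/s] in one zone.
zone = np.array([[26.1, 62.0, 0.11],
                 [25.8, 60.5, 0.09],
                 [26.4, 63.2, 0.12]])
candidate = np.array([26.0, 61.0, 0.10])   # reading at a candidate position

var = zone.var(axis=0, ddof=1)             # per-variable variance for scaling
sed = seuclidean(candidate, zone.mean(axis=0), var)
print(f"SED between candidate and zone mean: {sed:.2f}")
```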


2020 ◽  
Author(s):  
Annika Tjuka ◽  
Robert Forkel ◽  
Johann-Mattis List

Psychologists and linguists have collected a great diversity of data on word and concept properties. In psychology, many studies accumulate norms and ratings, such as word frequencies or age of acquisition, often for a large number of words. Linguistics, on the other hand, provides valuable insights into relations between word meanings. We present a collection of such data sets for norms, ratings, and relations covering different languages: ‘NoRaRe.’ To enable comparison between the diverse data types, we established workflows that facilitate the expansion of the database. A web application allows convenient access to the data (https://digling.org/norare/). Furthermore, a software API ensures consistent data curation by providing tests to validate the data sets. The NoRaRe collection is linked to the database curated by the Concepticon project (https://concepticon.clld.org), which offers a reference catalog of unified concept sets. The link between words in the data sets and the Concepticon concept sets makes cross-linguistic comparison possible. In three case studies, we test the validity of our approach, the accuracy of our workflow, and the applicability of our database. The results indicate that the NoRaRe database is suitable for studying word properties across multiple languages. The data can be used by psychologists and linguists to benefit from the knowledge rooted in both research disciplines.


2014 ◽  
Vol 2 (1) ◽  
Author(s):  
Anne Dutfoy ◽  
Sylvie Parey ◽  
Nicolas Roche

In this paper, we provide a tutorial on multivariate extreme value methods, which make it possible to estimate the risk associated with rare events occurring jointly. We draw particular attention to issues related to extremal dependence, and we emphasize the asymptotic-independence feature. We apply multivariate extreme value theory to two data sets related to hydrology and meteorology: first, the joint flooding of two rivers, which puts at risk the facilities lying downstream of the confluence; then the joint occurrence of high-speed wind and low air temperatures, which might affect overhead lines.
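
A hedged sketch of the kind of analysis such a tutorial covers: fit univariate GEV margins to two annual-maximum series, then probe extremal dependence with the empirical coefficient chi(u) = P(V > u | U > u) on uniform margins, where chi tending to 0 as u approaches 1 suggests asymptotic independence. The data and names below are synthetic illustrations, not the paper's case-study rivers.

```python
import numpy as np
from scipy.stats import genextreme, rankdata

rng = np.random.default_rng(1)
n = 200
shared = rng.gumbel(size=n)                       # induces some dependence
flood_a = 100 + 10 * (shared + 0.5 * rng.gumbel(size=n))
flood_b = 80 + 8 * (shared + 0.5 * rng.gumbel(size=n))

# Marginal GEV fits (scipy's shape c is the negated xi convention).
for name, x in [("river A", flood_a), ("river B", flood_b)]:
    c, loc, scale = genextreme.fit(x)
    print(f"{name}: xi={-c:.2f}, loc={loc:.1f}, scale={scale:.1f}")

# Empirical tail dependence: rank-transform to uniform margins.
u = rankdata(flood_a) / (n + 1)
v = rankdata(flood_b) / (n + 1)
for q in (0.90, 0.95, 0.99):
    chi = np.mean(v[u > q] > q) if (u > q).any() else np.nan
    print(f"chi({q}) ~ {chi:.2f}")   # near 0 suggests asymptotic independence
```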


2005 ◽  
Vol 13 (4) ◽  
pp. 277-298 ◽  
Author(s):  
Rob Pike ◽  
Sean Dorward ◽  
Robert Griesemer ◽  
Sean Quinlan

Very large data sets often have a flat but regular structure and span multiple disks and machines. Examples include telephone call records, network logs, and web document repositories. These large data sets are not amenable to study using traditional database techniques, if only because they can be too large to fit in a single relational database. On the other hand, many of the analyses done on them can be expressed using simple, easily distributed computations: filtering, aggregation, extraction of statistics, and so on. We present a system for automating such analyses. A filtering phase, in which a query is expressed using a new procedural programming language, emits data to an aggregation phase. Both phases are distributed over hundreds or even thousands of computers. The results are then collated and saved to a file. The design – including the separation into two phases, the form of the programming language, and the properties of the aggregators – exploits the parallelism inherent in having data and computation distributed across many machines.
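
A minimal single-machine sketch of the two-phase pattern described above: a per-record filtering phase emits (key, value) pairs, and an aggregation phase combines them with a commutative, associative operator, so shards can be processed independently and merged in any order. Names are illustrative; this is not the paper's language or system.

```python
from collections import Counter
from concurrent.futures import ProcessPoolExecutor

def filter_phase(shard):
    """Filtering phase: emit one count per log record, keyed by country."""
    table = Counter()
    for record in shard:
        table[record["country"]] += 1           # emit country -> 1
    return table

def run(shards):
    """Aggregation phase: a sum aggregator merges partials in any order."""
    totals = Counter()
    with ProcessPoolExecutor() as pool:
        for partial in pool.map(filter_phase, shards):
            totals.update(partial)
    return totals

if __name__ == "__main__":
    shards = [[{"country": "us"}, {"country": "de"}],
              [{"country": "us"}, {"country": "fr"}]]
    print(run(shards))   # Counter({'us': 2, 'de': 1, 'fr': 1})
```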

