Efficient Approaches to the Mixture Distance Problem

Justie Su-Tzu Juan; Yi-Ching Chen; Chen-Hui Lin; Shu-Chuan Chen

doi:10.3390/a13120314

Efficient Approaches to the Mixture Distance Problem

Algorithms ◽

10.3390/a13120314 ◽

2020 ◽

Vol 13 (12) ◽

pp. 314

Author(s):

Justie Su-Tzu Juan ◽

Yi-Ching Chen ◽

Chen-Hui Lin ◽

Shu-Chuan Chen

Keyword(s):

Model Building ◽

Internal Node ◽

Path Difference ◽

Biological Species ◽

Binary Sequences ◽

Computational Time ◽

Distance Metric ◽

Weighted Tree ◽

Tree Comparison ◽

Distance Problem

The ancestral mixture model, proposed by Chen and Lindsay in 2006, is an important model for building a hierarchical tree from high-dimensional binary sequences. A mixture tree created from ancestral mixture models is a phylogenetic (or evolutionary) tree: it captures the inferred evolutionary relationships among various biological species and also records the times at which the species mutate. Tree comparison metrics, which measure the similarity between trees, are essential in bioinformatics. To our knowledge, however, no approach to comparing two mixture trees has been reported. In this paper, we propose a new metric, the mixture distance metric, to measure the similarity of two mixture trees. It uniquely accounts for the evolutionary times in the trees. If a mixture tree, which contains the mutation time of each internal node, is converted into a weighted tree, the mixture distance metric is very close to the weighted path difference distance metric. Because the converted mixture tree is a special weighted tree, we were able to design more efficient algorithms for computing the new metric. We developed two algorithms to compute the mixture distance between two mixture trees: one requires $O(n^2)$ time, and the other requires $O(n h_1 h_2)$ time with $O(n)$ preprocessing time, where $n$ denotes the number of leaves in the two mixture trees and $h_1$ and $h_2$ denote the heights of the two trees.
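For orientation, the weighted path difference distance that the abstract compares against can be sketched naively as follows. This is an illustration only, not either of the authors' mixture distance algorithms: the tree encoding (a dict mapping each child to a `(parent, edge_weight)` pair), the function names, and the choice of the Euclidean norm over all leaf pairs are assumptions made for this sketch.

```python
from itertools import combinations
from math import sqrt

def dist_to_ancestors(parent, leaf):
    """Map the leaf and each of its ancestors to the weighted distance from the leaf."""
    dists = {leaf: 0.0}
    node, total = leaf, 0.0
    while node in parent:              # the root has no entry in `parent`
        node, weight = parent[node]
        total += weight
        dists[node] = total
    return dists

def leaf_distance(parent, a, b):
    """Weighted path length between leaves a and b via their lowest common ancestor."""
    up_a = dist_to_ancestors(parent, a)
    node, total = b, 0.0
    while node not in up_a:            # climb from b until we meet a's ancestor chain
        node, weight = parent[node]
        total += weight
    return up_a[node] + total

def path_difference_distance(parent1, parent2, leaves):
    """Euclidean norm of the per-pair differences in leaf-to-leaf path length."""
    total = 0.0
    for a, b in combinations(sorted(leaves), 2):
        diff = leaf_distance(parent1, a, b) - leaf_distance(parent2, a, b)
        total += diff * diff
    return sqrt(total)

# Two hypothetical three-leaf trees that group the leaves differently.
tree1 = {"A": ("X", 1.0), "B": ("X", 1.0), "X": ("R", 1.0), "C": ("R", 2.0)}
tree2 = {"A": ("X", 1.0), "C": ("X", 1.0), "X": ("R", 1.0), "B": ("R", 2.0)}
print(path_difference_distance(tree1, tree2, ["A", "B", "C"]))
```

The sketch visits every leaf pair and climbs the tree for each one, so it is slower than the bounds quoted in the abstract; it is meant only to make the quantity being compared concrete.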

A Robust Forgery Detection Method for Copy–Move and Splicing Attacks in Images

Electronics ◽

10.3390/electronics9091500 ◽

2020 ◽

Vol 9 (9) ◽

pp. 1500

Author(s):

Mohammad Manzurul Islam ◽

Gour Karmakar ◽

Joarder Kamruzzaman ◽

Manzur Murshed

Keyword(s):

Model Building ◽

Performance Metrics ◽

Detection Method ◽

Color Image ◽

Computational Time ◽

Support Vector ◽

Detection Accuracy ◽

Forgery Detection ◽

Image Forgery ◽

The Mean

Internet of Things (IoT) image sensors, social media, and smartphones generate huge volumes of digital images every day. The easy availability and usability of photo editing tools have made forgery attacks, primarily splicing and copy–move attacks, effortless, causing cybercrimes to be on the rise. While several models have been proposed in the literature for detecting these attacks, the robustness of those models has not been investigated when (i) only a small number of tampered images are available for model building or (ii) images from IoT sensors are distorted by image rotation or scaling caused by unwanted or unexpected changes in the sensors’ physical set-up. Moreover, further improvement in detection accuracy is needed for real-world security management systems. To address these limitations, this paper proposes an innovative image forgery detection method based on Discrete Cosine Transformation (DCT) and Local Binary Pattern (LBP), together with a new feature extraction method using the mean operator. First, images are divided into non-overlapping fixed-size blocks, and 2D block DCT is applied to capture changes due to image forgery. Then LBP is applied to the magnitude of the DCT array to enhance forgery artifacts. Finally, the mean value of a particular cell across all LBP blocks is computed, which yields a fixed number of features and makes the method more computationally efficient. Using a Support Vector Machine (SVM), the proposed method has been extensively tested on four well-known, publicly available grayscale and color image forgery datasets, and additionally on an IoT-based image forgery dataset that we built. Experimental results reveal the superiority of the proposed method over recent state-of-the-art methods in terms of widely used performance metrics and computational time, and demonstrate robustness against low availability of forged training samples.
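The three feature-extraction steps described above (block DCT, LBP on the DCT magnitude, per-cell mean across blocks) can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the 8×8 block size, the basic 8-neighbour LBP variant without padding, the `>=` comparison, and all function names are choices made for this sketch, and the SVM stage is omitted.

```python
import math

def dct2(block):
    """Orthonormal 2D DCT-II of a square block given as a list of lists."""
    n = len(block)
    def scale(k):
        return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
    # cos_table[u][x] = cos((2x + 1) * u * pi / (2n)), reused for rows and columns
    cos_table = [[math.cos((2 * x + 1) * u * math.pi / (2 * n)) for x in range(n)]
                 for u in range(n)]
    return [[scale(u) * scale(v) * sum(block[x][y] * cos_table[u][x] * cos_table[v][y]
                                       for x in range(n) for y in range(n))
             for v in range(n)] for u in range(n)]

def lbp(grid):
    """Basic 8-neighbour LBP codes for the interior cells of a square grid."""
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]
    n = len(grid)
    codes = []
    for r in range(1, n - 1):
        row = []
        for c in range(1, n - 1):
            code = 0
            for bit, (dr, dc) in enumerate(offsets):
                if grid[r + dr][c + dc] >= grid[r][c]:
                    code |= 1 << bit
            row.append(code)
        codes.append(row)
    return codes

def mean_lbp_features(image, block_size=8):
    """Mean LBP code of each cell position across all non-overlapping blocks."""
    rows, cols = len(image), len(image[0])
    lbp_blocks = []
    for r0 in range(0, rows - block_size + 1, block_size):
        for c0 in range(0, cols - block_size + 1, block_size):
            block = [row[c0:c0 + block_size] for row in image[r0:r0 + block_size]]
            magnitude = [[abs(v) for v in row] for row in dct2(block)]
            lbp_blocks.append(lbp(magnitude))
    side = block_size - 2              # LBP is computed on interior cells only
    count = len(lbp_blocks)            # assumes the image holds at least one block
    return [sum(b[r][c] for b in lbp_blocks) / count
            for r in range(side) for c in range(side)]
```

Because the mean is taken per cell position, the feature vector length depends only on the block size (36 values for 8×8 blocks in this sketch), not on the image size, which matches the "fixed number of features" property mentioned in the abstract.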

Aggregation of Radial Distribution System Bus with Volt-Var Control

Energies ◽

10.3390/en14175390 ◽

2021 ◽

Vol 14 (17) ◽

pp. 5390

Author(s):

Hiroshi Kikusato ◽

Taha Selim Ustun ◽

Dai Orihara ◽

Jun Hashimoto ◽

Kenji Otani

Keyword(s):

Distribution System ◽

Model Building ◽

Historical Data ◽

Low Voltage ◽

Computational Time ◽

Radial Distribution System ◽

High Penetration ◽

Reduction Methods ◽

Aggregated Model ◽

The Impact

The high penetration of distributed energy resources (DERs) encourages DERs to implement grid-supporting functions, such as volt-var control. Quasi-static time-series (QSTS) simulation is an essential technique for evaluating the impact of active DERs on the grid. However, the increasing complexity of the circuit model places a heavy computational burden on QSTS simulation. Although circuit reduction methods have been proposed, few can appropriately handle a distribution system (DS) with multiple voltage control devices, such as DERs implementing volt-var control. To address these remaining issues, this paper proposes an offline bus aggregation method for DS with volt-var control. The method determines the volt-var curve for the aggregated bus on the basis of historical data to reduce error in the aggregated model, and its offline process avoids the computational convergence issue that arises in an online process. The effectiveness of the proposed method is validated in simulation using a Japanese low-voltage DS model. The simulation results show that the proposed method can reduce both the voltage error and the computational time. Furthermore, the versatility of the proposed method is verified: its performance does not depend heavily on how the historical data for model building are selected.

Deep-learning inversion: A next-generation seismic velocity model building method

Geophysics ◽

10.1190/geo2018-0249.1 ◽

2019 ◽

Vol 84 (4) ◽

pp. R583-R599 ◽

Cited By ~ 38

Author(s):

Fangshu Yang ◽

Jianwei Ma

Keyword(s):

Deep Learning ◽

Seismic Data ◽

Model Building ◽

Seismic Velocity ◽

Velocity Model ◽

Computational Time ◽

Learning Methods ◽

Velocity Models ◽

Velocity Model Building ◽

Training Stage

Seismic velocity is one of the most important parameters used in seismic exploration. Accurate velocity models are the key prerequisites for reverse time migration and other high-resolution seismic imaging techniques. Such velocity information has traditionally been derived by tomography or full-waveform inversion (FWI), which are time consuming and computationally expensive, and they rely heavily on human interaction and quality control. We have investigated a novel method based on the supervised deep fully convolutional neural network for velocity-model building directly from raw seismograms. Unlike the conventional inversion method based on physical models, supervised deep-learning methods are based on big-data training rather than prior-knowledge assumptions. During the training stage, the network establishes a nonlinear projection from the multishot seismic data to the corresponding velocity models. During the prediction stage, the trained network can be used to estimate the velocity models from the new input seismic data. One key characteristic of the deep-learning method is that it can automatically extract multilayer useful features without the need for human-curated activities and an initial velocity setup. The data-driven method usually requires more time during the training stage, and actual predictions take less time, with only seconds needed. Therefore, the computational time of geophysical inversions, including real-time inversions, can be dramatically reduced once a good generalized network is built. By using numerical experiments on synthetic models, the promising performance of our proposed method is shown in comparison with conventional FWI even when the input data are in more realistic scenarios. We have also evaluated deep-learning methods, the training data set, the lack of low frequencies, and the advantages and disadvantages of our method.

Selection of Moment Vectors in Protein Sequence Comparison Under Binary Representation

10.21203/rs.3.rs-1028526/v1 ◽

2021 ◽

Author(s):

Jayanta Pal ◽

Soumen Ghosh ◽

Bansibadan Maji ◽

Dilip Kumar Bhattacharya

Keyword(s):

Genome Sequence ◽

Protein Sequence ◽

Sequence Comparison ◽

Phylogenetic Trees ◽

Binary Sequences ◽

Computational Time ◽

Binary Representation ◽

Distance Matrices ◽

Protein Sequence Comparison ◽

Selection Of

Similarity/dissimilarity study of protein and genome sequences remains a challenging task, and the selection of techniques and descriptors plays an important role in computational biology. Genome sequence comparison is generally preferred to protein sequence comparison because a protein sequence involves 20 amino acids, compared with only 4 nucleotides in a genome sequence. It is therefore important to find a suitable representation that is both time and space efficient and equally applicable to protein sequences of equal and unequal lengths. In the binary form of representation, the Fourier transform of a protein sequence reduces to the transformation of 20 simple binary sequences in the Fourier domain, and for each such sequence Parseval's identity gives a very simple computable form of the power spectrum. This gives rise to readily acceptable forms of moments of different degrees. Such moments, when properly normalized, show a monotonically decreasing trend as the degree of the moment increases, so it is better to use moments of smaller degrees only. In this paper, descriptors are taken as 20-component vectors, where each component corresponds to a general second-order moment of one of the 20 simple binary sequences. Distance matrices are then obtained by using Euclidean distance as the distance measure between each pair of sequences. Phylogenetic trees are obtained from the distance matrices using the UPGMA algorithm. The datasets used for the similarity/dissimilarity study are 9 ND4, 16 ND5, 9 ND6, 24 TF proteins and 12 Baculovirus proteins. The phylogenetic trees produced by the present method are on par with those produced by earlier methods adopted by other authors and with their known biological references. Furthermore, the method takes less computational time and is equally applicable to sequences of equal and unequal lengths.
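The role of Parseval's identity in the binary representation can be illustrated with a short sketch. The amino-acid ordering, the naive DFT, the function names, and the example sequence are assumptions for illustration; the paper's normalized moments are not defined in the abstract and are not reproduced here. For a 0/1 indicator sequence of length N, the identity implies that the total power spectrum equals N times the number of occurrences of that amino acid, so the total power can be obtained by counting.

```python
import cmath

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"   # assumed ordering of the 20 amino acids

def indicator_sequences(protein):
    """Split a protein string into 20 binary (0/1) indicator sequences."""
    return {aa: [1 if residue == aa else 0 for residue in protein]
            for aa in AMINO_ACIDS}

def power_spectrum(signal):
    """Power spectrum |X_k|^2 of a real sequence via a naive O(N^2) DFT."""
    n = len(signal)
    spectrum = []
    for k in range(n):
        coeff = sum(x * cmath.exp(-2j * cmath.pi * k * t / n)
                    for t, x in enumerate(signal))
        spectrum.append(abs(coeff) ** 2)
    return spectrum

# Hypothetical short sequence; compare the spectral sum with the counting shortcut.
protein = "MKVLAAGIVK"
for aa, bits in indicator_sequences(protein).items():
    if any(bits):
        print(aa, round(sum(power_spectrum(bits)), 6), len(protein) * sum(bits))
```

The two printed numbers on each line should agree up to floating-point rounding, which is the sense in which the power spectrum takes "a very simple computable form" for binary sequences.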

Detecting SIM Box Fraud by Using Support Vector Machine and Artificial Neural Network

Jurnal Teknologi ◽

10.11113/jt.v74.2649 ◽

2015 ◽

Vol 74 (1) ◽

Author(s):

Roselina Sallehuddin ◽

Subariah Ibrahim ◽

Azlan Mohd Zain ◽

Abdikarim Hussein Elmi

Keyword(s):

Neural Network ◽

Artificial Neural Network ◽

Support Vector Machine ◽

Model Building ◽

Computational Time ◽

Support Vector ◽

ANN Model ◽

SVM Model ◽

Artificial Neural ◽

Artificial Neural Network Ann

Fraud in communication has been increasing dramatically due to new technologies and the global superhighways of communication, resulting in loss of revenue and quality of service for telecommunication providers, especially in Africa and Asia. One of the dominant types of fraud is SIM box bypass fraud, whereby SIM cards are used to channel national and multinational calls away from mobile operators and deliver them as local calls. It is therefore important to find techniques that can detect this type of fraud efficiently. In this paper, two classification techniques, Artificial Neural Network (ANN) and Support Vector Machine (SVM), were developed to detect this type of fraud. The classification uses nine selected features of data extracted from Customer Database Record. The performance of ANN is compared with that of SVM to find which model performs better. The experiments show that the SVM model gives higher classification accuracy (99.06%) than the ANN model (98.71%). Besides better accuracy, SVM also requires less computational time than ANN, since it takes less time in model building and training.

Velocity-independent estimation of kinematic attributes in vertical transverse isotropy media using local slopes and predictive painting

Geophysics ◽

10.1190/geo2015-0638.1 ◽

2016 ◽

Vol 81 (5) ◽

pp. U73-U85 ◽

Cited By ~ 9

Author(s):

M. Javad Khoshnavaz ◽

Andrej Bóna ◽

Milovan Urosevic

Keyword(s):

Model Building ◽

Seismic Velocity ◽

Transverse Isotropy ◽

Velocity Model ◽

Velocity Estimation ◽

Computational Time ◽

Sufficient Information ◽

Zero Offset ◽

Vertical Transverse Isotropy ◽

VTI Media

A good seismic velocity model is required for many routine seismic imaging techniques. Velocity model building from seismic data is often labor intensive and time consuming. The process becomes more complicated when nonhyperbolic traveltime estimations are taken into account. An alternative to the conventional time-domain imaging algorithms is to use techniques based on the local event slopes, which contain sufficient information about the traveltime moveout for velocity estimation and characterization of the subsurface geologic structures. Given the local slopes, there is no need for prior knowledge of a velocity model. That is why the term “velocity independent” is commonly used for such techniques. We improved upon and simplified the previous versions of velocity-independent nonhyperbolic approximations for horizontally layered vertical transverse isotropy (VTI) media by removing one order of differentiation with respect to offset from the imaging kinematic attributes. These kinematic attributes are derived in terms of the local event slopes and zero-offset two-way traveltime (TWTT). We proposed the use of predictive painting, which keeps all the attributes curvature independent, to estimate the zero-offset TWTT. The theoretical contents and performance of the proposed approach were evaluated on synthetic and field data examples. We also studied the accuracy of moveout attributes for shifted hyperbola, rational, three-parameter, and acceleration approximations on a synthetic example. Our results show that, regardless of the approximation type, the NMO velocity estimate has higher accuracy than the nonhyperbolicity attribute. Computational time and accuracy of the inversion of kinematic attributes in VTI media using our approach were compared with routine/conventional multiparameter semblance inversion and with the previous velocity-independent inversion techniques.

Hybrid Genetic Clustering by Using FCM and Geodesic Distance for Complex Distributed Data

Applied Mechanics and Materials ◽

10.4028/www.scientific.net/amm.263-266.2597 ◽

2012 ◽

Vol 263-266 ◽

pp. 2597-2601 ◽

Cited By ~ 1

Author(s):

Yong Sheng Yang ◽

Gang Li ◽

Yong Sheng Zhu ◽

You Yun Zhang

Keyword(s):

Clustering Algorithm ◽

Geodesic Distance ◽

Computational Time ◽

Distributed Data ◽

Distance Metric ◽

Fitness Evaluation ◽

Benchmark Datasets ◽

Fuzzy C Means Clustering ◽

FCM Clustering ◽

Genetic Clustering

To efficiently find hidden clusters in datasets with complex distributed data, a hybrid genetic clustering algorithm inspired by complementary strategies was developed. It is based on the geodesic distance metric and combined with the Fuzzy C-Means (FCM) clustering algorithm. First, instead of Euclidean distance, the new approach employs a geodesic-distance-based dissimilarity metric in all fitness evaluations. Then, with the help of FCM clustering, sub-clusters with spherical distribution are partitioned effectively. Next, a genetic-algorithm-based clustering using the geodesic distance metric, named GCGD, is adopted to cluster the clustering centers obtained from FCM clustering. Finally, the final results are obtained from the above two clustering results. Experimental results on eight benchmark dataset clustering problems show the effectiveness of the algorithm as a clustering technique. Compared with conventional GCGD, the hybrid clustering noticeably decreases the computational time while retaining a high ratio of correct clustering.

Symmetric Multistep Methods Revisited: II. Numerical Experiments

International Astronomical Union Colloquium ◽

10.1017/s0252921100031602 ◽

1999 ◽

Vol 173 ◽

pp. 309-314 ◽

Cited By ~ 3

Author(s):

T. Fukushima

Keyword(s):

Multistep Methods ◽

Computational Time ◽

Integration Time ◽

Implicit Methods ◽

Symmetric Methods ◽

Celestial Bodies ◽

The Stability ◽

Predictor Corrector ◽

Symmetric Multistep Methods ◽

Integration Errors

By using the stability condition and general formulas developed by Fukushima (1998 = Paper I), we discovered that, just as in the case of the explicit symmetric multistep methods (Quinlan and Tremaine, 1990), when integrating orbital motions of celestial bodies, the implicit symmetric multistep methods used in the predictor-corrector manner lead to integration errors in position which grow linearly with the integration time if the stepsizes adopted are sufficiently small and if the number of corrections is sufficiently large, say two or three. We also confirmed that the symmetric methods (explicit or implicit) produce stepsize-dependent instabilities/resonances, which were discovered by A. Toomre in 1991 and confirmed by G.D. Quinlan for some high-order explicit methods. Although the implicit methods require at least twice the computational time of the explicit symmetric ones for the same stepsize, they seem to be preferable since they reduce these undesirable features significantly.
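As a point of reference, the simplest explicit symmetric multistep scheme, the second-order Störmer two-step formula x_{n+1} = 2x_n − x_{n−1} + h²a(x_n), can be sketched as below. It is not one of the high-order explicit or implicit predictor-corrector methods studied in the paper; the harmonic-oscillator test problem (a stand-in for an orbital motion), the exact second starting value, and the function names are assumptions made for this illustration.

```python
import math

def stormer_two_step(accel, x0, x1, h, n_steps):
    """Integrate x'' = accel(x) with the symmetric two-step Stormer formula.

    x0 and x1 are the positions at t = 0 and t = h; the returned list holds
    the positions at t = 0, h, 2h, ..., n_steps * h.
    """
    xs = [x0, x1]
    for _ in range(n_steps - 1):
        x_prev, x_curr = xs[-2], xs[-1]
        xs.append(2.0 * x_curr - x_prev + h * h * accel(x_curr))
    return xs

def max_position_error(h, n_steps):
    """Largest deviation from the exact solution cos(t) of x'' = -x, x(0) = 1, x'(0) = 0."""
    xs = stormer_two_step(lambda x: -x, 1.0, math.cos(h), h, n_steps)
    return max(abs(x - math.cos(i * h)) for i, x in enumerate(xs))

# Compare the worst position error over roughly 10 and 20 oscillation periods.
print(max_position_error(0.01, 6283), max_position_error(0.01, 12566))
```

Running the comparison for different integration lengths gives a feel for how the position error of a symmetric scheme accumulates with integration time at a fixed stepsize.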

A TEM study of electron tunneling in biological macromolecules

Proceedings, annual meeting, Electron Microscopy Society of America ◽

10.1017/s0424820100125634 ◽

1987 ◽

Vol 45 ◽

pp. 140-141

Author(s):

J. A. Panitz

Keyword(s):

Stark Effect ◽

Electron Tunneling ◽

Imaging Modality ◽

Tunneling Current ◽

Biological Species ◽

Scanning Tunneling ◽

Biological Macromolecules ◽

Flat Surfaces ◽

Virus Particles ◽

STM Images

Tunneling is a ubiquitous phenomenon. Alpha particle disintegration, the Stark effect, superconductivity in thin films, field-emission, and field-ionization are examples of electron tunneling phenomena. In the scanning tunneling microscope (STM) electron tunneling is used as an imaging modality. STM images of flat surfaces show structure at the atomic level. However, STM images of large biological species deposited onto flat surfaces are disappointing. For example, unstained virus particles imaged in the STM do not resemble their TEM counterparts. It is not clear how an STM image of a biological species is formed. Most biological species are large compared to the nominal electrode separation of ∼1 nm that is required for electron tunneling. To form an image of a biological species, the tunneling electrodes must be separated by a distance that would normally be too large for a tunneling current to be observed.

Observation of surface morphology by reflection electron holography

Proceedings, annual meeting, Electron Microscopy Society of America ◽

10.1017/s0424820100154652 ◽

1989 ◽

Vol 47 ◽

pp. 536-537

Author(s):

N. Osakabe ◽

J. Endo ◽

T. Matsuda ◽

A. Tonomura

Keyword(s):

Phase Shift ◽

Surface Morphology ◽

High Sensitivity ◽

High Energy ◽

Path Difference ◽

Transmission Mode ◽

Electron Holography ◽

Dynamical Process ◽

High Vertical Resolution ◽

Amplification Technique

Progress in microscopy such as STM and TEM-TED has revealed surface structures at atomic dimensions. REM has been used for the observation of surface dynamical processes and surface morphology. Recently developed reflection electron holography, which employs REM optics to measure the phase shift of reflected electrons, has proved effective for the observation of surface morphology with a high vertical resolution of ≃ 0.01 Å. The key to the high sensitivity of the method is best shown by comparing the phase shift generated by surface topography with that in transmission mode. In transmission mode, the phase shift arises from the difference in refractive index between vacuum and material, $V_0/2E \simeq 10^{-4}$, as shown in Fig. 1(a). In reflection mode, by contrast, a geometrical path difference is created (Fig. 1(b)), which is measured interferometrically using a high-energy electron beam of wavelength ≃ 0.01 Å. Together with the phase amplification technique, the vertical resolution is expected to be ≤ 0.01 Å in an ideal case.
