Fitting cross-classification table data to models when observations are subject to classification error

Psychometrika ◽  
1977 ◽  
Vol 42 (2) ◽  
pp. 199-206 ◽  
Author(s):  
Hoben Thomas


Author(s):  
Shinya Kikuchi ◽  
Jongho Rhee

Trip-production rates presented in cross-classification tables are essential data for the planner's understanding of the travel characteristics of a region. Trip rates obtained from surveys, however, often show a pattern that is inconsistent with the analyst's expectation that, for example, the number of trips generated increases with household size and auto ownership. This monotonic pattern may not appear in the trip rates obtained directly from the survey. In such cases, analysts commonly adjust the irregularities manually. The way in which the values are adjusted affects the credibility of the trip table and, ultimately, the forecast travel demand. This paper presents a method that adjusts the values of the trip table systematically, using fuzzy linear programming. The objective is to keep each adjusted value as close to the observed value as possible. The constraints require the adjusted values to follow the analyst's general expectations about the pattern of values in the table, and the number of trips estimated from the adjusted trip table to match the actual number of trips surveyed. An application example using real-world data is given.
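The adjustment described in this abstract can be sketched as a small fuzzy linear program. This is my own illustrative formulation with invented numbers, not the authors' exact model: a 2x2 table of trip rates, indexed by household size and auto ownership, is adjusted so that rates increase along both dimensions and total trips match the survey, while a satisfaction level `lam` in [0, 1] bounds the deviation from the observed rates (deviation at most `d * (1 - lam)`; maximize `lam`).

```python
# Fuzzy-LP adjustment of a 2x2 trip-rate table (illustrative values only).
import numpy as np
from scipy.optimize import linprog

r = np.array([2.0, 3.5, 3.0, 3.2])   # observed rates: cells (1,0), (1,1), (2,0), (2,1)
h = np.array([100, 80, 120, 150])    # surveyed households per cell
T = float(h @ r)                     # actual surveyed trip total
d = 1.0                              # maximum tolerated deviation at lam = 0

# Decision vector z = [x0, x1, x2, x3, lam]; linprog minimizes, so use -lam.
c = [0, 0, 0, 0, -1.0]

A_ub, b_ub = [], []
# Monotonicity: rates increase with household size and with auto ownership.
for i, j in [(0, 1), (0, 2), (1, 3), (2, 3)]:   # enforce x_i <= x_j
    row = [0.0] * 5
    row[i], row[j] = 1.0, -1.0
    A_ub.append(row); b_ub.append(0.0)
# Fuzzy closeness: |x_k - r_k| <= d * (1 - lam), written as two linear rows.
for k in range(4):
    up = [0.0] * 5; up[k], up[4] = 1.0, d
    A_ub.append(up); b_ub.append(r[k] + d)      # x_k + d*lam <= r_k + d
    lo = [0.0] * 5; lo[k], lo[4] = -1.0, d
    A_ub.append(lo); b_ub.append(d - r[k])      # -x_k + d*lam <= d - r_k
# Trips implied by the adjusted rates must equal the surveyed total.
A_eq, b_eq = [list(h) + [0.0]], [T]

res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq,
              bounds=[(0, None)] * 4 + [(0, 1)], method="highs")
x, lam = res.x[:4], res.x[4]
```

Here the irregular cell (r[1] = 3.5 exceeds r[3] = 3.2) forces some deviation, and the LP finds the smoothing that keeps the table monotone and the total exact while maximizing the satisfaction level.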


2007 ◽  
Vol 37 (1) ◽  
pp. 1-22 ◽  
Author(s):  
Leo A. Goodman

Consider an m-way cross-classification table (for m = 3, 4, …) of m dichotomous variables that describes (1) the 2^m possible response patterns to a set of m questions (where the response to each question is binary), and (2) the number of individuals whose responses to the m questions can be described by a particular response pattern, for each of the 2^m possible response patterns. Consider the situation where the data in the cross-classification table are analyzed using a particular latent class model having T latent classes (for T = 2, 3, …), and where this model fits the data well. With this latent class model, it is possible to estimate, for an individual who has a particular response pattern, the conditional probability that this individual is in a particular latent class, for each of the T latent classes. In this article, the following question is considered: For an individual who has a particular response pattern, can we use the corresponding estimated conditional probabilities to assign this individual to one of the T latent classes? Two different assignment procedures are considered here, and for each of these procedures, two different criteria are introduced to help assess when the assignment procedure is satisfactory and when it is not. In addition, we describe here the particular framework and context in which the two assignment procedures, and the two criteria, are considered. For illustrative purposes, the latent class analysis of a classic set of data, a four-way cross-classification of some survey data obtained in a two-wave panel study, is discussed; and the two criteria introduced herein are applied in this analysis to each of the two assignment procedures.
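The posterior probabilities that drive such an assignment follow from Bayes' rule. A minimal sketch, with made-up parameter values (T = 2 classes, m = 4 binary questions): given a fitted model's class proportions and item-response probabilities, compute each response pattern's posterior class probabilities and assign by the modal (largest) one.

```python
# Modal latent-class assignment from posterior probabilities (toy parameters).
from itertools import product

pi = [0.6, 0.4]                      # hypothetical class proportions, T = 2
# p[t][j] = P(positive response on question j | class t), m = 4 questions
p = [[0.9, 0.8, 0.85, 0.7],
     [0.2, 0.3, 0.25, 0.4]]

def posterior(y):
    """P(class t | response pattern y) for a binary pattern y, via Bayes' rule."""
    joint = []
    for t in range(len(pi)):
        lik = pi[t]
        for j, yj in enumerate(y):
            lik *= p[t][j] if yj else 1 - p[t][j]
        joint.append(lik)
    total = sum(joint)
    return [v / total for v in joint]

# Modal assignment for every one of the 2**m response patterns.
assignment = {y: max(range(len(pi)), key=lambda t: posterior(y)[t])
              for y in product([0, 1], repeat=4)}
```

With these parameters the all-positive pattern lands in class 0 and the all-negative pattern in class 1; the criteria discussed in the article concern how trustworthy such modal assignments are when the posteriors are less extreme.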


Cybersecurity ◽  
2021 ◽  
Vol 4 (1) ◽  
Author(s):  
Shushan Arakelyan ◽  
Sima Arasteh ◽  
Christophe Hauser ◽  
Erik Kline ◽  
Aram Galstyan

Tackling binary program analysis problems has traditionally implied manually defining rules and heuristics, a tedious and time-consuming task for human analysts. In order to improve automation and scalability, we propose an alternative direction based on distributed representations of binary programs with applicability to a number of downstream tasks. We introduce Bin2vec, a new approach leveraging Graph Convolutional Networks (GCN) along with computational program graphs in order to learn a high-dimensional representation of binary executable programs. We demonstrate the versatility of this approach by using our representations to solve two semantically different binary analysis tasks: functional algorithm classification and vulnerability discovery. We compare the proposed approach to our own strong baseline as well as published results, and demonstrate improvement over state-of-the-art methods for both tasks. We evaluated Bin2vec on 49,191 binaries for the functional algorithm classification task, and on 30 different CWE-IDs including at least 100 CVE entries each for the vulnerability discovery task. We set a new state-of-the-art result by reducing the classification error by 40% compared to the source-code based inst2vec approach, while working on binary code. For almost every vulnerability class in our dataset, our prediction accuracy is over 80% (and over 90% in multiple classes).
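The core operation such a model rests on can be sketched in a few lines. This is an illustrative GCN propagation step, not the authors' Bin2vec code: node features on a small program graph are updated as H' = ReLU(D^-1/2 (A + I) D^-1/2 · H · W), and a whole-program embedding is read out by mean-pooling the node states.

```python
# One GCN layer over a toy program graph, with mean-pooled graph readout.
import numpy as np

rng = np.random.default_rng(0)

# Toy "program graph": 4 nodes (e.g. basic blocks), undirected edges.
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 1],
              [0, 1, 0, 1],
              [0, 1, 1, 0]], dtype=float)
H = rng.normal(size=(4, 8))          # initial node features (e.g. instruction stats)
W = rng.normal(size=(8, 16))         # learnable layer weights

A_hat = A + np.eye(4)                          # add self-loops
d_inv_sqrt = 1.0 / np.sqrt(A_hat.sum(axis=1))
A_norm = A_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]  # symmetric normalisation

H1 = np.maximum(A_norm @ H @ W, 0.0)  # one GCN layer with ReLU activation
embedding = H1.mean(axis=0)           # graph-level program representation
```

Stacking several such layers lets information flow along control- and data-flow edges before the pooled embedding is fed to a task-specific classifier.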


Author(s):  
Leijin Long ◽  
Feng He ◽  
Hongjiang Liu

In order to monitor the high-level landslides that frequently occur in the Jinsha River area of Southwest China, and to protect the lives and property of people in these mountainous areas, satellite remote-sensing image data are combined with various landslide-inducing factors and transformed into landslide influence factors, which provide the data basis for establishing a landslide detection model. Then, based on the deep belief network (DBN) and convolutional neural network (CNN) algorithms, two landslide detection models, DBN and convolutional neural-deep belief network (CDN), are established to monitor high-level landslides along the Jinsha River. The influence of the model parameters on the landslide detection results is analyzed, and the accuracy of the DBN and CDN models in dealing with actual landslide problems is compared. The results show that the overall error of the DBN is minimized when the number of neurons is 100, and the classification error is minimized when the number of learning layers is 3. The detection accuracies of DBN and CDN are 97.56% and 97.63%, respectively, indicating that both models are feasible for detecting landslides from remote-sensing images. This exploration provides a reference for the study of high-level landslide disasters along the Jinsha River.
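The data-preparation step described above can be sketched as follows. This is a minimal sketch with invented factor names and random values: several influence-factor rasters derived from remote-sensing imagery are stacked into one per-pixel feature array that either a DBN-style or a CNN-style detector can consume.

```python
# Stacking hypothetical landslide influence-factor rasters into model inputs.
import numpy as np

h, w = 64, 64                         # raster size of a study-area tile
slope = np.random.rand(h, w)          # hypothetical influence-factor rasters,
elevation = np.random.rand(h, w)      # each normalised to [0, 1]
ndvi = np.random.rand(h, w)

# Stack factors channel-wise: shape (h, w, n_factors).
features = np.stack([slope, elevation, ndvi], axis=-1)

# Flatten to per-pixel samples for a fully connected DBN-style input,
# or keep the (h, w, channels) layout for a CNN-style input.
dbn_input = features.reshape(-1, features.shape[-1])
```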


2021 ◽  
Vol 11 (9) ◽  
pp. 4292
Author(s):  
Mónica Y. Moreno-Revelo ◽  
Lorena Guachi-Guachi ◽  
Juan Bernardo Gómez-Mendoza ◽  
Javier Revelo-Fuelagán ◽  
Diego H. Peluffo-Ordóñez

Automatic crop identification and monitoring is a key element in enhancing food production processes as well as diminishing the related environmental impact. Although several efficient deep learning techniques have emerged in the field of multispectral imagery analysis, the crop classification problem still needs more accurate solutions. This work introduces a competitive methodology for crop classification from multispectral satellite imagery, mainly using an enhanced 2D convolutional neural network (2D-CNN) designed at a smaller-scale architecture, as well as a novel post-processing step. The proposed methodology contains four steps: image stacking, patch extraction, classification model design (based on a 2D-CNN architecture), and post-processing. First, the images are stacked to increase the number of features. Second, the input images are split into patches and fed into the 2D-CNN model. Then, the 2D-CNN model is constructed within a small-scale framework, and properly trained to recognize 10 different types of crops. Finally, a post-processing step is performed in order to reduce the classification error caused by lower-spatial-resolution images. Experiments were carried out on the Campo Verde database, which consists of a set of satellite images captured by the Landsat and Sentinel satellites over the municipality of Campo Verde, Brazil. Compared with the best results reported in the literature (an overall accuracy of about 81%, an F1 score of 75.89%, and an average accuracy of 73.35%), the proposed methodology achieves a competitive overall accuracy of 81.20%, an F1 score of 75.89%, and an average accuracy of 88.72% when classifying 10 different crops, while ensuring an adequate trade-off between the number of multiply-accumulate operations (MACs) and accuracy.
Furthermore, given its ability to effectively classify patches from two image sequences, this methodology may prove appealing for other real-world applications, such as the classification of urban materials.
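Two of the four steps above can be sketched concretely. This is my own simplified version, not the authors' implementation: splitting a stacked multispectral image into fixed-size patches, and a majority-vote post-processing pass that smooths a per-pixel class map to reduce errors from lower-spatial-resolution inputs.

```python
# Patch extraction and majority-vote label smoothing (simplified sketch).
import numpy as np
from collections import Counter

def extract_patches(img, size):
    """Split an (H, W, C) image into non-overlapping (size, size, C) patches."""
    H, W = img.shape[:2]
    return [img[i:i + size, j:j + size]
            for i in range(0, H - size + 1, size)
            for j in range(0, W - size + 1, size)]

def majority_smooth(label_map, k=3):
    """Replace each pixel's class with the majority class in its k x k window."""
    H, W = label_map.shape
    out = label_map.copy()
    r = k // 2
    for i in range(H):
        for j in range(W):
            window = label_map[max(0, i - r):i + r + 1, max(0, j - r):j + r + 1]
            out[i, j] = Counter(window.ravel().tolist()).most_common(1)[0][0]
    return out

stacked = np.zeros((12, 12, 6))                  # e.g. stacked Landsat + Sentinel bands
patches = extract_patches(stacked, 4)            # 9 non-overlapping 4 x 4 x 6 patches
noisy = np.zeros((6, 6), dtype=int)
noisy[2, 2] = 7                                  # one isolated misclassified pixel
smoothed = majority_smooth(noisy)                # the outlier is voted away
```

In the real pipeline the smoothing would operate on the mosaic of per-patch predictions; the principle, suppressing isolated disagreements with the local majority, is the same.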


2008 ◽  
Vol 238 (1) ◽  
pp. 170-177 ◽  
Author(s):  
D.K. Chandraker ◽  
P.K. Vijayan ◽  
D. Saha ◽  
R.K. Sinha
