GPS: Working Backward with Data

2020 ◽  
Vol 113 (3) ◽  
pp. 257-259
Author(s):  
Matt Enlow ◽  
S. Asli Özgün-Koca

This month's Growing Problem Solvers focuses on data analysis across all grades, beginning with visual representations of categorical data and moving to measures of central tendency, using a “working backwards” approach.
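One possible reading of "working backwards" with a measure of central tendency is to start from a target mean and recover the data value that produces it. The sketch below is only an illustration under that assumption; the function name `missing_value_for_mean` and the sample scores are hypothetical and are not taken from the article.

```python
from statistics import mean


def missing_value_for_mean(known_values, target_mean):
    """Return the single extra value that brings the mean to target_mean.

    Hypothetical helper: it rearranges mean = total / n into
    total = mean * n and subtracts the values already known.
    """
    n = len(known_values) + 1           # count after adding the unknown value
    return target_mean * n - sum(known_values)


# Made-up example: three known scores and a desired mean of 85.
scores = [70, 80, 90]
needed = missing_value_for_mean(scores, 85)

# Check by working forward again: the completed data set has the target mean.
assert mean(scores + [needed]) == 85
print(needed)
```

Working forward from the completed list confirms the result, which is the kind of check a student could be asked to perform after reasoning backward.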

2020 ◽  
Vol 13 (5) ◽  
pp. 1020-1030
Author(s):  
Pradeep S. ◽  
Jagadish S. Kallimani

Background: With the advent of data analysis and machine learning, there is growing impetus to analyze historical data and build models from it. The data come in numerous forms and shapes and bring an abundance of challenges. The most sought-after form of data for analysis is numerical data; with the plethora of algorithms and tools available, such data are quite manageable. Another form is categorical data, which is subdivided into ordinal (ordered) and nominal (unordered, name-based) data. Categorical data can also be broadly classified as sequential or non-sequential. Sequential data are easier to preprocess using algorithms.
Objective: This paper addresses the challenge of applying machine learning algorithms to categorical data of a non-sequential nature.
Methods: Applying several data analysis algorithms directly to such data yields biased results, which makes it impossible to generate a reliable predictive model. We address this problem by walking through a handful of techniques that, during our research, helped us deal with a large categorical dataset of a non-sequential nature. In subsequent sections, we discuss the possible implementable solutions and the shortfalls of these techniques.
Results: The methods are applied to sample datasets available in the public domain, and the results with respect to classification accuracy are satisfactory.
Conclusion: The best preprocessing technique we observed in our research is one-hot encoding, which breaks the categorical features down into binary features that can be fed into an algorithm to predict the outcome. The example we took is not abstract; it is a real-time production services dataset with many complex variations of categorical features. Our future work includes creating a robust model on such data and deploying it in industry-standard applications.
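The one-hot encoding named in the conclusion can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the function `one_hot_encode` and the sample service labels are hypothetical, and a production pipeline would more likely rely on a library encoder.

```python
def one_hot_encode(values):
    """Map each categorical value to a binary indicator vector.

    Hypothetical helper: columns follow the sorted order of the distinct
    categories, and each row has exactly one 1.
    """
    categories = sorted(set(values))
    column = {category: i for i, category in enumerate(categories)}

    rows = []
    for value in values:
        row = [0] * len(categories)
        row[column[value]] = 1          # mark the column for this category
        rows.append(row)
    return categories, rows


# Made-up nominal feature; the labels carry no inherent order.
services = ["web", "db", "cache", "db"]
categories, encoded = one_hot_encode(services)

print(categories)
for label, row in zip(services, encoded):
    print(label, row)
```

Because every category gets its own column, the encoding does not impose an artificial numeric order on nominal values, which is one plausible source of the bias the abstract mentions when such data are fed to algorithms directly.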


1986 ◽  
Vol 62 (2) ◽  
pp. 192 ◽  
Author(s):  
Joel L. Horowitz ◽  
Neil Wrigley

Author(s):  
Martina Zámková ◽  
Martin Prokop ◽  
Radek Stolín

Our paper explores the factors that influence consumers who buy organic food. Analysis of these factors enabled us to sort the consumers into groups based on gender, age, education, and other identifiers. Further research then revealed more detailed shopping preferences for each of those groups. The findings generated recommendations for producers and vendors of organic produce on how best to target their marketing at different groups of consumers and thereby increase sales of organic produce and of food made from it. Because the data are categorical, contingency tables and correspondence maps served as the best tools for representation and processing. Data analysis showed that organic produce is purchased most frequently by respondents aged 45 and over, who also tend to spend more money on this range of products. At the same time, these are the respondents who struggle most to recognize organic produce and who have often never seen any advertisement for it. Respondents aged 25 and under tend to purchase organic produce least frequently; they also often do not care about its origin. Almost the same applies to families with multiple children. However, young respondents often grow their own organic produce. A not insignificant percentage of consumers still consider organic produce expensive and do not believe in its quality. The organic products that respondents purchase most frequently are fruits and vegetables, milk, and dairy products.
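A contingency table of the kind mentioned above simply counts how often each pair of categories occurs together. The sketch below uses only the Python standard library; the function `contingency_table` is hypothetical, and the six survey responses are made up for illustration and are not the study's data.

```python
from collections import Counter


def contingency_table(pairs):
    """Cross-tabulate (row_category, column_category) pairs into counts.

    Hypothetical helper: returns sorted row labels, sorted column labels,
    and a nested list of co-occurrence counts.
    """
    counts = Counter(pairs)
    row_labels = sorted({row for row, _ in pairs})
    col_labels = sorted({col for _, col in pairs})
    table = [[counts[(row, col)] for col in col_labels] for row in row_labels]
    return row_labels, col_labels, table


# Invented responses: (age group, how often organic produce is bought).
responses = [
    ("25 or under", "rarely"),
    ("45+", "often"),
    ("45+", "often"),
    ("26-44", "often"),
    ("25 or under", "rarely"),
    ("26-44", "rarely"),
]

rows, cols, table = contingency_table(responses)
print("age group".ljust(12), *cols)
for label, counts_row in zip(rows, table):
    print(label.ljust(12), *counts_row)
```

A correspondence map would then be derived from such a table, typically with a dedicated statistics package; that step is outside this sketch.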

