Economic Predictions With Big Data: The Illusion of Sparsity

Domenico Giannone; Michele Lenza; Giorgio E. Primiceri

doi:10.3982/ecta17842

Economic Predictions With Big Data: The Illusion of Sparsity

Econometrica ◽

10.3982/ecta17842 ◽

2021 ◽

Vol 89 (5) ◽

pp. 2409-2437 ◽

Cited By ~ 1

Author(s):

Domenico Giannone ◽

Michele Lenza ◽

Giorgio E. Primiceri

Keyword(s):

Big Data ◽

Variable Selection ◽

Posterior Distribution ◽

Predictive Models ◽

Sparse Model

We compare sparse and dense representations of predictive models in macroeconomics, microeconomics, and finance. To deal with a large number of possible predictors, we specify a prior that allows for both variable selection and shrinkage. The posterior distribution does not typically concentrate on a single sparse model, but on a wide set of models that often include many predictors.
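
A prior that allows for both variable selection and shrinkage can be illustrated with a spike-and-slab construction. The sketch below is a minimal illustration under that assumption, since the abstract does not spell out the exact prior: each coefficient is nonzero with probability `q` (selection) and, when nonzero, is drawn from a Gaussian whose scale `gamma` controls shrinkage. The function name and the parameters `q`, `gamma`, and `sigma` are hypothetical names chosen for this example.

```python
import numpy as np

def draw_spike_and_slab(n_predictors, q, gamma, sigma=1.0, rng=None):
    """Draw one coefficient vector from an illustrative spike-and-slab prior.

    q     : prior probability that a predictor is included (assumed name)
    gamma : scale of the Gaussian slab relative to sigma (assumed name)
    sigma : residual standard deviation (assumed name)
    """
    rng = np.random.default_rng() if rng is None else rng
    # Selection: each predictor is included independently with probability q.
    included = rng.random(n_predictors) < q
    # Shrinkage: included coefficients come from a zero-mean Gaussian slab.
    slab = rng.normal(0.0, sigma * gamma, size=n_predictors)
    # Excluded coefficients are exactly zero (the spike).
    return np.where(included, slab, 0.0), included
```

A small `q` with a large `gamma` describes a sparse model with a few sizeable coefficients, while `q` near 1 with a small `gamma` describes a dense model in which many predictors each matter a little.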

Big Data in Sports: Predictive Models for Basketball Player's Performance

10.33774/miir-2021-h4x62 ◽

2021 ◽

Author(s):

Dae-Jin Lee ◽

Garritt L. Page

Keyword(s):

Big Data ◽

Predictive Models

A simple new approach to variable selection in regression, with application to genetic fine-mapping

10.1101/501114 ◽

2018 ◽

Cited By ~ 33

Author(s):

Gao Wang ◽

Abhishek Sarkar ◽

Peter Carbonetto ◽

Matthew Stephens

Keyword(s):

Variable Selection ◽

Fine Mapping ◽

Posterior Distribution ◽

Zero Element ◽

Variational Approximation ◽

New Approach ◽

Stepwise Selection ◽

Fitting Procedure ◽

Highly Correlated ◽

Credible Set

We introduce a simple new approach to variable selection in linear regression, with a particular focus on quantifying uncertainty in which variables should be selected. The approach is based on a new model, the “Sum of Single Effects” (SuSiE) model, which comes from writing the sparse vector of regression coefficients as a sum of “single-effect” vectors, each with one non-zero element. We also introduce a corresponding new fitting procedure, Iterative Bayesian Stepwise Selection (IBSS), which is a Bayesian analogue of stepwise selection methods. IBSS shares the computational simplicity and speed of traditional stepwise methods, but instead of selecting a single variable at each step, IBSS computes a distribution on variables that captures uncertainty in which variable to select. We provide a formal justification of this intuitive algorithm by showing that it optimizes a variational approximation to the posterior distribution under the SuSiE model. Further, this approximate posterior distribution naturally yields convenient novel summaries of uncertainty in variable selection, providing a Credible Set of variables for each selection. Our methods are particularly well-suited to settings where variables are highly correlated and detectable effects are sparse, both of which are characteristics of genetic fine-mapping applications. We demonstrate through numerical experiments that our methods outperform existing methods for this task, and illustrate their application to fine-mapping genetic variants influencing alternative splicing in human cell lines. We also discuss the potential and challenges for applying these methods to generic variable selection problems.
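
The idea of replacing each stepwise selection with a distribution over variables can be sketched in a few lines. This is a simplified illustration of an IBSS-style loop, not the authors' implementation: it assumes a known residual variance `sigma2`, a fixed prior effect variance `sigma0_2`, a uniform prior over which variable carries each effect, and centered data with no intercept. All function and parameter names are hypothetical.

```python
import numpy as np

def single_effect_regression(X, r, sigma2, sigma0_2):
    """Fit a regression in which exactly one variable has a nonzero effect."""
    xtx = np.einsum("ij,ij->j", X, X)      # x_j' x_j for every column
    betahat = X.T @ r / xtx                # univariate least-squares estimates
    s2 = sigma2 / xtx                      # their sampling variances
    # Log Bayes factor of "variable j is the effect" against the null.
    log_bf = (0.5 * np.log(s2 / (s2 + sigma0_2))
              + 0.5 * betahat**2 / s2 * sigma0_2 / (s2 + sigma0_2))
    w = np.exp(log_bf - log_bf.max())      # subtract the max for stability
    alpha = w / w.sum()                    # posterior over which variable it is
    post_var = 1.0 / (1.0 / s2 + 1.0 / sigma0_2)
    mu = post_var / s2 * betahat           # posterior mean effect if selected
    return alpha, mu

def ibss(X, y, n_effects=3, sigma2=1.0, sigma0_2=1.0, n_iter=50):
    """Cycle over single effects, refitting each one on the others' residual."""
    _, p = X.shape
    alpha = np.full((n_effects, p), 1.0 / p)
    mu = np.zeros((n_effects, p))
    for _ in range(n_iter):
        for l in range(n_effects):
            # Expected coefficients contributed by all effects except l.
            b_others = (alpha * mu).sum(axis=0) - alpha[l] * mu[l]
            r = y - X @ b_others
            alpha[l], mu[l] = single_effect_regression(X, r, sigma2, sigma0_2)
    return alpha, mu

def credible_set(alpha_l, coverage=0.95):
    """Smallest set of variables whose inclusion probabilities reach coverage."""
    order = np.argsort(alpha_l)[::-1]
    k = np.searchsorted(np.cumsum(alpha_l[order]), coverage) + 1
    return order[:k]
```

Each row of `alpha` is a distribution over variables for one effect, so when several highly correlated variables explain the same signal, the probability is spread across them and the credible set grows accordingly.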

Predictive Models on Early Detection of Mental Health Problems using Big Data and Artifical Intelligence

International Journal for Research in Applied Science and Engineering Technology ◽

10.22214/ijraset.2019.5555 ◽

2019 ◽

Vol 7 (5) ◽

pp. 3377-3383

Author(s):

Dhruvesh Shah

Keyword(s):

Mental Health ◽

Big Data ◽

Early Detection ◽

Predictive Models ◽

Mental Health Problems ◽

Health Problems ◽

Artificial Intelligence

Computational Systems Biology Perspective on Tuberculosis in Big Data Era

Web Services ◽

10.4018/978-1-5225-7501-6.ch115 ◽

2019 ◽

pp. 2230-2254

Author(s):

Amandeep Kaur Kahlon ◽

Ashok Sharma

Keyword(s):

Big Data ◽

Systems Biology ◽

Data Management ◽

High Throughput ◽

Predictive Models ◽

System Biology ◽

Prediction Models ◽

Computational Systems Biology ◽

Future Directions ◽

Computational Systems

The major concern of this chapter is to understand the need for systems biology in prediction models for studying tuberculosis infection in the big data era. The overall complexity of biological phenomena, such as biochemical, biophysical, and other molecular processes within the pathogen, as well as their interaction with the host, is studied through systems biology approaches. First, consideration is given to the necessity of prediction models integrating systems biology approaches, and then to their replacement and refinement using high-throughput data. Various ongoing projects, consortia, databases, and research groups involved in tuberculosis eradication are also discussed. This chapter provides a brief account of TB predictive models and their importance in systems biology for studying tuberculosis and host-pathogen interactions. This chapter also addresses big data resources and applications, data management, limitations, challenges, solutions, and future directions.

Proposal of Analytical Model for Business Problems Solving in Big Data Environment

Web Services ◽

10.4018/978-1-5225-7501-6.ch034 ◽

2019 ◽

pp. 618-638

Author(s):

Goran Klepac ◽

Kristi L. Berg

Keyword(s):

Data Mining ◽

Big Data ◽

Predictive Models ◽

Analytical Approach ◽

Fraud Detection ◽

Analytical Techniques ◽

Data Sources ◽

Business Decisions ◽

Mining Projects ◽

Structured Approach

This chapter proposes a new analytical approach that consolidates the traditional analytical approach to problems such as churn detection, fraud detection, predictive modeling, and segmentation modeling with data sources and analytical techniques from the big data area. The solutions presented offer a structured approach to integrating different concepts into one, which helps analysts as well as managers exploit the potential of different areas in a systematic way. By using this concept, companies have the opportunity to introduce big data potential into everyday data mining projects. As the chapter shows, neglecting the potential of big data often leads to incomplete analytical results, which imply incomplete information for business decisions and can lead to bad business decisions. The chapter also provides suggestions on how to recognize useful data sources from the big data area and how to analyze them along with traditional data sources to obtain higher-quality information for business decisions.

Big data biology-based predictive models Via DNA-metagenomics binning for WMD events applications

2015 IEEE International Symposium on Technologies for Homeland Security (HST) ◽

10.1109/ths.2015.7225313 ◽

2015 ◽

Cited By ~ 1

Author(s):

Helal Saghir ◽

Dalila B. Megherbi

Keyword(s):

Big Data ◽

Predictive Models

SP-0006: Incorporating radiomic parameters into predictive models: methods for variable selection

Radiotherapy and Oncology ◽

10.1016/s0167-8140(18)30317-7 ◽

2018 ◽

Vol 127 ◽

pp. S2

Author(s):

G. Defraene

Keyword(s):

Variable Selection ◽

Predictive Models

PCN258 APPLYING BIG DATA ANALYTICS TO DEVELOP PREDICTIVE MODELS FOR ESTIMATING HER2+ BC INCIDENCE AND PREVALENCE IN BRAZIL

Value in Health ◽

10.1016/j.jval.2019.09.453 ◽

2019 ◽

Vol 22 ◽

pp. S485-S486

Author(s):

R. Rol ◽

M. França ◽

E. Bello

Keyword(s):

Big Data ◽

Predictive Models ◽

Data Analytics ◽

Big Data Analytics

Oncology providers, patients, and caregivers reflect: Applying "big data" for better supportive outcomes.

Journal of Clinical Oncology ◽

10.1200/jco.2017.35.31_suppl.59 ◽

2017 ◽

Vol 35 (31_suppl) ◽

pp. 59-59

Author(s):

Marie C. Haverfield ◽

Adam Singer ◽

Karl Lorenz

Keyword(s):

Big Data ◽

Predictive Models ◽

Chronic Illnesses ◽

Communication Strategy ◽

Prognostic Information ◽

Provider Group ◽

Oncology Practice ◽

Practice Methods ◽

Oncology Providers ◽

Family Centered

Background: The development of “big data” methods offers an opportunity to more precisely predict patient outcomes. We explored physicians’, patients’, and caregivers’ perspectives about the use of predictive models in oncology practice.
Methods: We conducted 12 patient, 12 provider, and 12 caregiver interviews (N = 36) from Stanford University outpatient oncology clinics. We queried participants about patient and family-centered applications of predictive models for prognosis, cost, and novel patient and family-centered outcomes. Two trained coders iteratively examined transcripts for consistent topics and used the constant comparative method to establish themes and sub-themes.
Results: Several overlapping themes emerged: 1) Outcomes of Interest, [provider] “Predictive information about side effects or adverse effects of treatment would be helpful”; 2) Barriers to Using Predictions, [patient] “If it seems too sort of set in stone, without…you know, everything has grey areas”; 3) Benefits to Using Predictions, [provider] “Some people…their cancer may be cured, but they live with these really horrible chronic illnesses and some people would say, ‘I would have rather have just died from my disease than be in this shape’”; and 4) Communication Strategy, [provider] “I’m not even sure if I would bring up the models…I would kind of fall back on what I normally discuss with patients”. A theme specific to the provider group was 5) Accuracy of Model Information, [provider] “It’s hard to know whether to use in the clinical setting just the results of the model or whether you would really want to go down to the root level and actually access the raw data”. A theme specific to the patient and caregiver groups was 6) Privacy, [caregiver] “I would want to be able to have the patient authorize that”.
Conclusions: There is consistency between provider strategies to communicate prognostic information and patients’ perceptions of how they would like prognostic information to be communicated to them. While providers are concerned with accuracy of predictive models, patients and caregivers are more concerned with privacy.

Big Data Governance in Agile and Data-Driven Software Development

Big Data Governance and Perspectives in Knowledge Management - Advances in Knowledge Acquisition, Transfer, and Management ◽

10.4018/978-1-5225-7077-6.ch008 ◽

2019 ◽

pp. 179-199

Author(s):

Lili Aunimo ◽

Ari V. Alamäki ◽

Harri Ketamo

Keyword(s):

Big Data ◽

Software Development ◽

Predictive Models ◽

Data Privacy ◽

Business Case ◽

Business Environment ◽

Data Driven ◽

Data Governance ◽

Governance Framework ◽

A Company

Constructing a big data governance framework is important when a company performs data-driven software development. The most important aspects of big data governance are data privacy, security, availability, usability, and integrity. In this chapter, the authors present a business case where a framework for big data governance has been built. The business case is about the development and continuous improvement of a new mobile application that is targeted at consumers. In this context, big data is used in product development, in building predictive models related to the users, and for personalization of the product. The main contribution of the study is a novel big data governance framework, together with the finding that a proper framework for big data governance is useful when building and maintaining trustworthy, value-adding, big data-driven predictive models in an authentic business environment.