Data-Driven Optimization for Commodity Procurement Under Price Uncertainty

Author(s):  
Christian Mandl ◽  
Stefan Minner

Problem definition: We study a practice-motivated multiperiod stochastic commodity procurement problem under price uncertainty with forward and spot purchase options. Existing approaches are based on parametric price models, which inevitably involve price model misspecification and generalization error. Academic/practical relevance: We propose a nonparametric, data-driven approach (DDA) that is consistent with the optimal procurement policy structure but does not require the a priori specification and estimation of stochastic price processes. In addition to historical prices, DDA can leverage real-time feature data, such as economic indicators, in solving the problem. Methodology: This paper provides a framework for prescriptive analytics in dynamic commodity procurement, with optimal purchase policies learned directly from data as functions of features, via mixed integer linear programming (MILP) under cost minimization objectives. Hence, DDA focuses on optimal decisions rather than optimal predictions. Furthermore, we combine optimization with regularization from machine learning (ML) to extract decision-relevant signal from noise. Results: Based on numerical experiments and empirical data, we show that feature data has significant value for commodity procurement when procurement policy parameters are learned as functions of features. However, overfitting deteriorates the performance of data-driven solutions, which calls for ML extensions to improve out-of-sample generalization. Compared with an internal best-practice benchmark, DDA generates average savings of 9.1 million euros per annum (4.33%) over 10 years of backtesting. Managerial implications: A practical benefit of DDA is that it yields simple but optimally structured decision rules that are easy to interpret and easy to operationalize. Furthermore, DDA is generalizable and applicable to many other procurement settings.
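The paper's MILP formulation is not reproduced in the abstract, so the following is only a minimal sketch of the core idea: learning a feature-dependent purchase policy by directly minimizing historical procurement cost, with a coarse grid search standing in for the paper's MILP. All data and names (`feature`, `forward`, `spot`, `policy_cost`) are hypothetical.

```python
import numpy as np

# Hypothetical training data, one row per period: a feature (e.g. an economic
# indicator) observed before the buy decision, the forward price quoted now,
# and the spot price realized later. None of this is the paper's data.
rng = np.random.default_rng(0)
n = 200
feature = rng.normal(0.0, 1.0, n)
spot = 50 + 5 * feature + rng.normal(0.0, 3.0, n)   # feature is informative
forward = 50 + rng.normal(0.0, 3.0, n)

def policy_cost(b0, b1):
    """Average historical cost of the rule: buy forward iff
    forward <= b0 + b1 * feature, otherwise buy later on the spot market."""
    buy_forward = forward <= b0 + b1 * feature
    return np.where(buy_forward, forward, spot).mean()

# Learn the threshold coefficients by direct cost minimization; a grid
# search stands in for the paper's MILP.
b0_grid = np.linspace(40.0, 60.0, 41)
b1_grid = np.linspace(-10.0, 10.0, 41)
cost, b0, b1 = min((policy_cost(b0, b1), b0, b1)
                   for b0 in b0_grid for b1 in b1_grid)
print(f"in-sample cost {cost:.2f} with threshold {b0:.1f} + {b1:.2f} * feature")
```

Because the learned slope exploits the feature's correlation with the future spot price, the rule buys forward mainly when the feature signals expensive spot purchases later; constraining or regularizing the coefficients would mirror the paper's ML extensions against overfitting.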

Author(s):  
Laure Fournier ◽  
Lena Costaridou ◽  
Luc Bidaut ◽  
Nicolas Michoux ◽  
Frederic E. Lecouvet ◽  
...  

Abstract Existing quantitative imaging biomarkers (QIBs) are associated with known biological tissue characteristics and follow a well-understood path of technical, biological and clinical validation before incorporation into clinical trials. In radiomics, novel data-driven processes extract numerous visually imperceptible statistical features from the imaging data with no a priori assumptions about their correlation with biological processes. The selection of relevant features (the radiomic signature) and their incorporation into clinical trials therefore require additional considerations to ensure meaningful imaging endpoints. Moreover, given the number of radiomic features tested, power calculations would yield sample sizes impossible to achieve within clinical trials. This article examines how the process of standardising and validating data-driven imaging biomarkers differs from that used for biomarkers based on biological associations. Radiomic signatures are best developed initially on datasets that represent diversity of acquisition protocols as well as diversity of disease and of normal findings, rather than within clinical trials with standardised and optimised protocols, as the latter would risk the selected radiomic features being linked to the imaging process rather than the pathology. Normalisation through discretisation and feature harmonisation are essential pre-processing steps. Biological correlation may be performed after the technical and clinical validity of a radiomic signature is established, but is not mandatory. Feature selection may be part of discovery within a radiomics-specific trial or represent exploratory endpoints within an established trial; a previously validated radiomic signature may even be used as a primary/secondary endpoint, particularly if associations are demonstrated with the specific biological processes and pathways targeted within clinical trials.
Key Points
• Data-driven processes like radiomics risk false discoveries due to the high dimensionality of the dataset relative to sample size, making adequate diversity of the data, cross-validation and external validation essential to mitigate the risks of spurious associations and overfitting.
• Use of radiomic signatures within clinical trials requires multistep standardisation of the image acquisition, image analysis and data mining processes.
• Biological correlation may be established after clinical validation but is not mandatory.
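As a concrete illustration of the discretisation step mentioned above, here is a minimal sketch of fixed-bin-width grey-level discretisation, one common radiomics pre-processing choice; the bin width of 25 and the toy voxel values are arbitrary assumptions, not recommendations from the article.

```python
import numpy as np

def discretise_fixed_bin_width(roi_intensities, bin_width=25.0):
    """Map ROI voxel intensities to discrete grey levels using a fixed bin
    width, a standard radiomics pre-processing step. Grey level 1 is
    assigned to the minimum intensity observed in the ROI."""
    x = np.asarray(roi_intensities, dtype=float)
    return np.floor((x - x.min()) / bin_width).astype(int) + 1

# Toy voxel intensities (hypothetical values, e.g. CT Hounsfield units).
voxels = np.array([12.0, 40.5, 63.0, 88.2, 120.9])
print(discretise_fixed_bin_width(voxels))  # -> [1 2 3 4 5]
```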


Geosciences ◽  
2021 ◽  
Vol 11 (2) ◽  
pp. 99 ◽  
Author(s):  
Yueqi Gu ◽  
Orhun Aydin ◽  
Jacqueline Sosa

Post-earthquake relief zone planning is a multidisciplinary optimization problem, which requires delineating zones that seek to minimize the loss of life and property. In this study, we offer an end-to-end workflow to define relief zone suitability and equitable relief service zones for Los Angeles (LA) County. In particular, we address the impact of a tsunami, given LA's high spatial complexity: its population clusters along the coastline, and it sits atop a complicated inland fault system. We design data-driven earthquake relief zones with a wide variety of inputs, including geological features, population, and public safety. Data-driven zones were generated by solving the p-median problem with the Teitz–Bart algorithm, without any a priori knowledge of optimal relief zones. We define metrics to determine the optimal number of relief zones as part of the proposed workflow. Finally, we measure the impacts of a tsunami in LA County by comparing data-driven relief zone maps for cases with and without a tsunami. Our results show that the impact of the tsunami on the relief zones can extend up to 160 km inland from the study area.
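The Teitz–Bart algorithm named above is a classic vertex-substitution heuristic for the p-median problem. The sketch below is a minimal generic implementation on a toy one-dimensional distance matrix, not the authors' workflow; the node data and loop details are illustrative assumptions.

```python
import numpy as np

def teitz_bart(dist, p, seed=0):
    """Vertex-substitution heuristic for the p-median problem.
    dist[i, j] = distance from demand node i to candidate facility j."""
    rng = np.random.default_rng(seed)
    n = dist.shape[1]
    medians = set(rng.choice(n, size=p, replace=False).tolist())

    def total_cost(meds):
        # Each demand node is served by its nearest selected median.
        return dist[:, sorted(meds)].min(axis=1).sum()

    improved = True
    while improved:
        improved = False
        for candidate in range(n):
            if candidate in medians:
                continue
            # Try swapping the candidate in for each current median.
            for m in sorted(medians):
                trial = (medians - {m}) | {candidate}
                if total_cost(trial) < total_cost(medians):
                    medians = trial
                    improved = True
                    break
    return sorted(medians), total_cost(medians)

# Toy example: 6 demand points on a line, 2 relief zones.
pts = np.array([0.0, 1.0, 2.0, 10.0, 11.0, 12.0])
dist = np.abs(pts[:, None] - pts[None, :])
print(teitz_bart(dist, p=2))  # expect one median near each point cluster
```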


SIMULATION ◽  
2020 ◽  
Vol 96 (8) ◽  
pp. 641-653
Author(s):  
Jonathan Larson ◽  
Paul Isihara ◽  
Gabriel Flores ◽  
Edwin Townsend ◽  
Danilo R. Diedrichs ◽  
...  

The United Nations Office for the Coordination of Humanitarian Affairs has asserted that risks in the deployment of unmanned aerial vehicles (UAVs) within disaster response must be reduced by careful development of best-practice standards before implementing such systems. With recent humanitarian field tests of cargo UAVs indicating that implementation may soon become reality, a priori assessment of a smart-navigated (autonomous) UAV disaster cargo fleet via simulation modeling and analysis is vital to the best-practice development process. Logistical problems with ground transport of relief supplies in Puerto Rico after Hurricane Maria (2017) pose a compelling use scenario for UAV disaster cargo delivery. In this context, we introduce a General Purpose Assessment Model (GPAM) that can estimate the potential effectiveness of a cargo UAV fleet for any given response region. We evaluate this model against the following standards: (i) realistic specifications; (ii) stable output across various realistic specifications; and (iii) support of humanitarian goals. To this end, we discuss data from humanitarian cargo delivery field tests and feedback from practitioners, perform sensitivity analyses, and demonstrate the advantage of using humanitarian rather than geographic distance in making fleet delivery assignments. We conclude with several major challenges faced by those who wish to implement smart-navigated UAV cargo fleets in disaster response, and with the need for further GPAM development. This paper proposes the GPAM as a useful simulation tool to encourage and guide steps toward humanitarian use of UAVs for cargo delivery. The model's flexibility can allow organizations to quickly and effectively determine how best to respond to disasters.
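The distinction between humanitarian and geographic distance can be made concrete with a toy dispatch rule. The sketch below assumes a hypothetical urgency-discounted distance; the weighting function, site names and numbers are invented for illustration and are not the GPAM's actual metric.

```python
# Hypothetical sketch: assign a UAV to the site minimizing a "humanitarian
# distance" (geographic distance discounted by need urgency) rather than raw
# geographic distance. Sites, urgencies and the formula are all invented.
sites = {
    "clinic_A": {"km": 40.0, "urgency": 5.0},   # urgency: higher = more acute
    "shelter_B": {"km": 15.0, "urgency": 1.0},
    "village_C": {"km": 30.0, "urgency": 2.5},
}

def humanitarian_distance(site, alpha=1.0):
    # Divide by urgency so acute-need sites look "closer" to the dispatcher.
    return site["km"] / (1.0 + alpha * site["urgency"])

by_geo = min(sites, key=lambda s: sites[s]["km"])
by_hum = min(sites, key=lambda s: humanitarian_distance(sites[s]))
print(by_geo, by_hum)  # shelter_B vs clinic_A: the urgent site wins under the weighted metric
```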


2021 ◽  
Vol 0 (0) ◽  
Author(s):  
Anton Ochoa Bique ◽  
Leonardo K. K. Maia ◽  
Ignacio E. Grossmann ◽  
Edwin Zondervan

Abstract A strategy for the design of a hydrogen supply chain (HSC) network in Germany that incorporates uncertainty in hydrogen demand is proposed. A univariate sensitivity analysis shows that uncertainty in hydrogen demand has a very strong impact on overall system costs. We therefore consider a scenario tree for a stochastic mixed integer linear programming model that incorporates the uncertainty in hydrogen demand. The model covers two configurations, which are analyzed and compared according to production type: water electrolysis versus steam methane reforming. Each configuration has a cost minimization target. The concept of the value of the stochastic solution (VSS) is used to evaluate the stochastic optimization results and compare them to their deterministic counterparts. The VSS of each configuration shows significant benefits of a stochastic optimization approach for the model presented in this study, corresponding to savings of up to 26% of infrastructure investments.
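For readers unfamiliar with the VSS, the sketch below computes VSS = EEV − RP on a toy two-stage capacity problem; the cost parameters and demand scenarios are invented and have nothing to do with the paper's HSC model.

```python
import numpy as np

# Toy two-stage problem (illustrative only, not the paper's HSC model):
# choose hydrogen production capacity x at unit cost c; unmet demand in a
# scenario is covered externally at penalty q per unit.
c, q = 1.0, 4.0
demand = np.array([50.0, 100.0, 200.0])   # demand scenarios
prob = np.array([0.3, 0.4, 0.3])          # scenario probabilities

def expected_cost(x):
    return c * x + q * np.sum(prob * np.maximum(demand - x, 0.0))

xs = np.linspace(0.0, 200.0, 2001)
rp = min(expected_cost(x) for x in xs)     # here-and-now stochastic solution
x_ev = float(np.sum(prob * demand))        # capacity sized for mean demand
eev = expected_cost(x_ev)                  # expected cost of the mean-value plan
print(f"RP={rp:.1f}  EEV={eev:.1f}  VSS={eev - rp:.1f}")  # VSS = EEV - RP
```

A positive VSS quantifies how much the stochastic solution saves over simply planning for the average scenario, which is the comparison the abstract reports at up to 26%.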


Energy ◽  
2017 ◽  
Vol 122 ◽  
pp. 182-193 ◽  
Author(s):  
Ali Esmaeily ◽  
Abdollah Ahmadi ◽  
Fatima Raeisi ◽  
Mohammad Reza Ahmadi ◽  
Ali Esmaeel Nezhad ◽  
...  

2021 ◽  
Author(s):  
Geza Halasz ◽  
Michela Sperti ◽  
Matteo Villani ◽  
Umberto Michelucci ◽  
Piergiuseppe Agostoni ◽  
...  

BACKGROUND: Several models have been developed to predict mortality in patients with COVID-19 pneumonia, but only a few have demonstrated sufficient discriminatory capacity. Machine-learning algorithms represent a novel approach to data-driven prediction of clinical outcomes, with advantages over statistical modelling.
OBJECTIVE: To develop the Piacenza score, a machine-learning-based score, to predict 30-day mortality in patients with COVID-19 pneumonia.
METHODS: The study comprised 852 patients with COVID-19 pneumonia admitted to the Guglielmo da Saliceto Hospital (Italy) from February to November 2020. The patients' medical history and demographic and clinical data were collected in electronic health records. The overall patient dataset was randomly split into derivation and test cohorts. The score was obtained through a Naïve Bayes classifier and externally validated on 86 patients admitted to Centro Cardiologico Monzino (Italy) in February 2020. Using a forward-search algorithm, six features were identified: age, mean corpuscular haemoglobin concentration, PaO2/FiO2 ratio, temperature, previous stroke and gender. The Brier index was used to evaluate the ability of the model to stratify and predict observed outcomes. A user-friendly website (https://covid.7hc.tech) was designed and developed to enable fast and easy use of the tool by the end user (i.e., the physician). Regarding the customization properties of the Piacenza score, we added a personalized version of the algorithm to the website, which enables an optimized computation of the mortality risk score for a single patient when some variables used by the Piacenza score are not available. In this case, the Naïve Bayes classifier is retrained on the same derivation cohort but using a different set of patient characteristics. We also compared the Piacenza score with the 4C score and with a Naïve Bayes algorithm with 14 features chosen a priori.
RESULTS: The Piacenza score showed an AUC of 0.78 (95% CI 0.74-0.84, Brier score 0.19) in the internal validation cohort and 0.79 (95% CI 0.68-0.89, Brier score 0.16) in the external validation cohort, showing accuracy comparable to the 4C score and to the Naïve Bayes model with a priori chosen features, which achieved AUCs of 0.78 (95% CI 0.73-0.83, Brier score 0.26) and 0.80 (95% CI 0.75-0.86, Brier score 0.17), respectively.
CONCLUSIONS: A personalized machine-learning-based score with purely data-driven feature selection is feasible and effective for predicting mortality in patients with COVID-19 pneumonia.
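A minimal sketch of the forward-search feature selection described above, pairing scikit-learn's GaussianNB with the Brier score on synthetic data; this is an assumed reconstruction of the general technique, not the Piacenza study's code.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.metrics import brier_score_loss
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

# Synthetic stand-in for the derivation cohort (not the Piacenza data).
X, y = make_classification(n_samples=800, n_features=14, n_informative=6,
                           random_state=0)
X_tr, X_va, y_tr, y_va = train_test_split(X, y, test_size=0.3, random_state=0)

selected, remaining = [], list(range(X.shape[1]))
best_brier = np.inf
while remaining:
    # Greedily add the feature that most improves the validation Brier score.
    scores = []
    for f in remaining:
        cols = selected + [f]
        prob = GaussianNB().fit(X_tr[:, cols], y_tr).predict_proba(X_va[:, cols])[:, 1]
        scores.append((brier_score_loss(y_va, prob), f))
    score, f = min(scores)
    if score >= best_brier:    # stop when no candidate improves the score
        break
    best_brier, selected = score, selected + [f]
    remaining.remove(f)

print(f"selected features: {selected}, Brier score: {best_brier:.3f}")
```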


Author(s):  
Michelle Blom ◽  
Slava Shekh ◽  
Don Gossink ◽  
Tim Miller ◽  
Adrian R Pearce

Future defense logistics will be heavily reliant on autonomous vehicles for the transportation of supplies. We consider a dynamic logistics problem in which multiple supply item types are transported between suppliers and consuming (sink) locations, and autonomous vehicles (road-, sea-, and air-based) decide where to collect and deliver supplies in a decentralized manner. Sink nodes consume dynamically varying demands whose timing and size are not known a priori. Network arcs and vehicles experience failures at times, and for durations, that are not known a priori. These dynamic events are caused by an adversary seeking to disrupt the network. We design domain-dependent planning algorithms for these vehicles whose primary objective is to minimize the likelihood of stockout events (where insufficient resource is present at a sink to meet demand); cost minimization is a secondary objective. The performance of these algorithms across varying scenarios, with and without restrictions on communication between vehicles and network locations, is evaluated using agent-based simulation. We show that stockpiling-based strategies, where quantities of resource are amassed at strategic locations, are most effective on large land-based networks with multiple supply item types, with simpler "shuttling"-based approaches being sufficient otherwise.
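The stockpiling-versus-shuttling trade-off can be illustrated with a drastically reduced simulation: one supplier, one sink, random demand, and an arc that randomly fails. All parameters below are invented; the paper's agent-based simulation is far richer than this sketch.

```python
import numpy as np

rng = np.random.default_rng(1)
T = 10_000
demand = rng.poisson(4.0, T)        # demand timing/size unknown in advance
arc_up = rng.random(T) > 0.2        # the supply arc fails 20% of the time

def shuttling():
    """Just-in-time delivery: each period's demand is flown in on request,
    so any period with demand and a failed arc is a stockout."""
    return sum(1 for t in range(T) if demand[t] > 0 and not arc_up[t]) / T

def stockpiling(buffer_target):
    """Keep a buffer at the sink and restock to the target whenever the
    arc is up; stockouts occur only when a failure streak drains it."""
    stock, stockouts = buffer_target, 0
    for t in range(T):
        if demand[t] > stock:
            stockouts += 1
        stock = max(stock - demand[t], 0)
        if arc_up[t]:
            stock = buffer_target
    return stockouts / T

print(f"shuttling stockout rate:  {shuttling():.3f}")
print(f"stockpiling (buffer=20):  {stockpiling(20):.3f}")
```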


Author(s):  
Karen L. Pedersen ◽  
Terri Hayes ◽  
Tim Copeland

This case chronicles the beginnings of an enrollment management transformation currently underway at The Extended Campuses of Northern Arizona University. After three-plus years of flat enrollments, the organization executed a phased plan to alter the university's enrollment trajectory. A complete reorganization, an intentional effort to operationalize enrollment marketing best practice, and the establishment of a data-driven organization comprise the foundations of the first phase of the plan. While specific to Northern Arizona University, the case also highlights six foundations for initiating any enrollment management transformational journey.


2019 ◽  
Vol 29 ◽  
Author(s):  
S. de Vos ◽  
S. Patten ◽  
E. C. Wit ◽  
E. H. Bos ◽  
K. J. Wardenaar ◽  
...  

Abstract
Aims: The mechanisms underlying both depressive and anxiety disorders remain poorly understood. One reason for this is the lack of a valid, evidence-based system to classify persons into specific subtypes based on their depressive and/or anxiety symptomatology. To do this without a priori assumptions, non-parametric statistical methods seem the optimal choice. Moreover, to define subtypes according to their symptom profiles and the inter-relations between symptoms, network models may be very useful. This study aimed to evaluate the potential usefulness of this approach.
Methods: A large community sample from the Canadian general population (N = 254 443) was divided into data-driven clusters using non-parametric k-means clustering. Participants were clustered according to their (co)variation around the grand mean on each item of the Kessler Psychological Distress Scale (K10). Next, to evaluate cluster differences, semi-parametric network models were fitted in each cluster, and node centrality indices and network density measures were compared.
Results: A five-cluster model was obtained from the cluster analyses. Network density varied across clusters and was highest for the cluster of people with the lowest K10 severity ratings. In three of the cluster networks, depressive symptoms (e.g. feeling depressed, restless, hopeless) had the highest centrality. In the remaining two clusters, symptom networks were characterised by a higher prominence of somatic symptoms (e.g. restlessness, nervousness).
Conclusion: Finding data-driven subtypes based on psychological distress using non-parametric methods can be a fruitful approach, yielding clusters of persons that differ in illness severity as well as in the structure and strength of inter-symptom relationships.
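A simplified stand-in for the analysis pipeline: cluster synthetic K10-like item scores with k-means, then compare a crude per-cluster network-density proxy. The study used a non-parametric k-means variant and semi-parametric network models; plain k-means and mean absolute Pearson correlations are substituted here purely for illustration.

```python
import numpy as np
from sklearn.cluster import KMeans

# Synthetic stand-in for K10 item scores (10 items on a 1-5 scale).
rng = np.random.default_rng(0)
severity = rng.choice([1.0, 2.0, 3.0], size=3000)   # latent severity per person
items = np.clip(rng.normal(severity[:, None], 0.8, (3000, 10)).round(), 1, 5)

# Cluster respondents on their item profiles.
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(items)

for k in range(3):
    sub = items[labels == k]
    corr = np.corrcoef(sub, rowvar=False)
    # Crude "network density" proxy: mean |correlation| off the diagonal.
    # The study fitted semi-parametric network models instead.
    off = corr[~np.eye(10, dtype=bool)]
    print(f"cluster {k}: n={len(sub)}, mean item score={sub.mean():.2f}, "
          f"density~{np.abs(off).mean():.3f}")
```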


2019 ◽  
Vol 11 (23) ◽  
pp. 6784
Author(s):  
Suyang Zhou ◽  
Di He ◽  
Zhiyang Zhang ◽  
Zhi Wu ◽  
Wei Gu ◽  
...  

Intra-day control and scheduling of energy systems require high-speed computation and strong robustness. Conventional mathematically driven approaches usually require substantial computational resources and have difficulty handling system uncertainties. This paper proposes two data-driven approaches for the operational scheduling of a hydrogen-penetrated energy system (HPES). The two approaches learn from historical optimization results calculated using mixed integer linear programming (MILP) and conditional value at risk (CVaR), respectively. An intra-day rolling optimization mechanism is introduced to evaluate the proposed data-driven scheduling approaches, the MILP data-driven approach and the CVaR data-driven approach, against forecasted renewable generation and load demands. Results show that the two data-driven approaches reduce intra-day operational costs relative to the MILP-based method by 1.17% and 0.93%, respectively. In addition, the combined cooling and heating plant (CCHP) changes its operational state and power output less frequently under the MILP data-driven approach than under the mathematically driven approaches.
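The core idea of learning historical optimizer output can be sketched as supervised imitation: train a regressor that maps forecast features to the schedule an offline MILP would have produced, so intra-day decisions need only a fast model evaluation. Everything below (the features, the rule generating the "MILP labels", the model choice) is a hypothetical stand-in so the example runs without a solver.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

# Synthetic stand-in: features are (PV forecast, load forecast); the target is
# the electrolyser setpoint an offline MILP would have chosen. A simple rule
# generates the "MILP labels" here in place of actual solver output.
rng = np.random.default_rng(0)
pv, load = rng.uniform(0, 100, 5000), rng.uniform(20, 120, 5000)
X = np.column_stack([pv, load])
y_milp = np.clip(pv - load, 0, None) * 0.9          # surplus power to hydrogen

X_tr, X_te, y_tr, y_te = train_test_split(X, y_milp, random_state=0)
model = RandomForestRegressor(n_estimators=200, random_state=0).fit(X_tr, y_tr)

# Intra-day use: one forward pass replaces re-solving the MILP each interval.
print(f"test R^2: {model.score(X_te, y_te):.3f}")
print(f"setpoint for PV=80, load=30: {model.predict([[80.0, 30.0]])[0]:.1f} kW")
```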

