Uncovering the structure of public procurement transactions

2019 ◽  
Vol 21 (3) ◽  
pp. 351-384 ◽  
Author(s):  
Mircea Popa

Abstract
Close ties between government authorities and private firms are often the object of suspicion, but a systematic understanding of when they arise is still missing. This article uses machine learning tools to analyze a large dataset of public contracts from across Europe in order to identify the conditions under which close connections, defined in terms of both repeated interaction and geographical dispersion, appear. Previous theoretical results suggest that close ties should emerge as an enforcement mechanism in settings characterized by weak outside enforcement, such as those involving corruption. Results from random forest models support this hypothesis and also identify other structural determinants of the outcome. The most striking finding is that even after accounting for numerous potential confounders, major differences in average diversity levels between countries persist; these differences map onto an indicator of governance quality and corruption, but not onto income per capita. These findings point to the centrality of the structure of interactions between private and public actors for understanding governance outcomes.
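The abstract's use of random forests to surface structural determinants can be sketched with scikit-learn's feature-importance ranking. This is a minimal illustration on synthetic data; the feature names (`contract_value`, `sector`, `region`) are invented stand-ins, not the article's actual variables.

```python
# Hypothetical sketch: rank candidate determinants of an outcome with a
# random forest's impurity-based feature importances (synthetic data).
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
n = 500
X = rng.normal(size=(n, 3))                 # invented stand-in features
y = 2.0 * X[:, 0] + 0.1 * rng.normal(size=n)  # outcome driven by feature 0

model = RandomForestRegressor(n_estimators=200, random_state=0).fit(X, y)
ranking = sorted(
    zip(["contract_value", "sector", "region"], model.feature_importances_),
    key=lambda t: -t[1],
)
print(ranking)
```

Impurity-based importances sum to 1 across features, so the ranking reads as relative shares of explained heterogeneity rather than causal effects.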

Atmosphere ◽  
2021 ◽  
Vol 12 (1) ◽  
pp. 109
Author(s):  
Ashima Malik ◽  
Megha Rajam Rao ◽  
Nandini Puppala ◽  
Prathusha Koouri ◽  
Venkata Anil Kumar Thota ◽  
...  

Over the years, rampant wildfires have plagued the state of California, creating economic and environmental loss. In 2018, wildfires caused nearly 800 million dollars in economic losses and claimed more than 100 lives in California, and over 1.6 million acres of land burned, causing extensive environmental damage. Although researchers have recently introduced machine learning models and algorithms for predicting wildfire risk, those results focused on particular perspectives and were restricted to a limited number of data parameters. In this paper, we propose two data-driven machine learning approaches based on random forest models to predict wildfire risk in areas near Monticello and Winters, California. This study demonstrates how the models were developed and applied with comprehensive data parameters, such as powerlines, terrain, and vegetation, from different perspectives that improved the spatial and temporal accuracy of predicting wildfire risk, including fire ignition. The combined model uses the spatial and temporal parameters as a single combined dataset to train and predict fire risk, whereas the ensemble model was fed separate parameter sets whose outputs were later stacked to work as a single model. Our experiments show that the combined model produced better accuracy than the ensemble of random forest models trained on separate spatial data. The models were validated with Receiver Operating Characteristic (ROC) curves, learning curves, and evaluation metrics such as accuracy, confusion matrices, and classification reports. The models achieved an accuracy of 92% in predicting wildfire risk, including ignition, by utilizing regional spatial and temporal data along with standard data parameters in Northern California.
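The combined-versus-ensemble comparison described above can be sketched as follows. This is a toy version on synthetic data, not the paper's dataset: one forest trained on concatenated spatial and temporal features against two forests trained per feature group whose predicted probabilities are averaged, both scored by ROC AUC.

```python
# Illustrative sketch (synthetic stand-in data): combined random forest
# vs. an ensemble of per-feature-group forests, compared via ROC AUC.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(1)
n = 1000
X_spatial = rng.normal(size=(n, 4))    # stand-ins for terrain, vegetation, ...
X_temporal = rng.normal(size=(n, 3))   # stand-ins for time-varying drivers
logit = X_spatial[:, 0] + X_temporal[:, 0]
y = (logit + 0.5 * rng.normal(size=n) > 0).astype(int)

Xs_tr, Xs_te, Xt_tr, Xt_te, y_tr, y_te = train_test_split(
    X_spatial, X_temporal, y, random_state=0)

# Combined model: one forest over all parameters at once.
combined = RandomForestClassifier(n_estimators=100, random_state=0)
combined.fit(np.hstack([Xs_tr, Xt_tr]), y_tr)
auc_combined = roc_auc_score(
    y_te, combined.predict_proba(np.hstack([Xs_te, Xt_te]))[:, 1])

# Ensemble: separate forests whose probabilities are averaged.
rf_s = RandomForestClassifier(n_estimators=100, random_state=0).fit(Xs_tr, y_tr)
rf_t = RandomForestClassifier(n_estimators=100, random_state=0).fit(Xt_tr, y_tr)
proba = (rf_s.predict_proba(Xs_te)[:, 1] + rf_t.predict_proba(Xt_te)[:, 1]) / 2
auc_ensemble = roc_auc_score(y_te, proba)
print(auc_combined, auc_ensemble)
```

The combined model can exploit interactions across feature groups that the averaged per-group forests never see, which is one plausible reason the paper's combined model scored higher.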


2012 ◽  
Vol 8 (2) ◽  
pp. 44-63 ◽  
Author(s):  
Baoxun Xu ◽  
Joshua Zhexue Huang ◽  
Graham Williams ◽  
Qiang Wang ◽  
Yunming Ye

The selection of feature subspaces for growing decision trees is a key step in building random forest models. However, the common approach of randomly sampling a few features for each subspace is not suitable for high-dimensional data consisting of thousands of features, because such data often contain many features that are uninformative for classification, and random sampling often fails to include informative features in the selected subspaces. Consequently, the classification performance of the random forest model is significantly affected. In this paper, the authors propose an improved random forest method that uses a novel feature weighting method for subspace selection and therefore enhances classification performance on high-dimensional data. A series of experiments on 9 real-life high-dimensional datasets demonstrated that, using a subspace size chosen as a function of the total number of features M in the dataset, our random forest model significantly outperforms existing random forest models.
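The core idea of weighted subspace selection can be sketched in a few lines. This is a simplified stand-in, not the authors' exact weighting scheme: it scores features by mutual information with the class label and then samples the per-tree subspace with probability proportional to those scores, so informative features are far more likely to be drawn than under uniform sampling.

```python
# Minimal sketch of weighted feature-subspace sampling (synthetic data;
# the paper's actual weighting method differs).
import numpy as np
from sklearn.feature_selection import mutual_info_classif

rng = np.random.default_rng(2)
n, M = 300, 50
X = rng.normal(size=(n, M))
y = (X[:, 0] + X[:, 1] > 0).astype(int)   # only 2 of 50 features informative

weights = mutual_info_classif(X, y, random_state=0)
# Small floor so every feature keeps a nonzero sampling probability.
probs = (weights + 1e-6) / (weights + 1e-6).sum()

subspace_size = int(np.sqrt(M))  # a common default size; not the paper's rule
subspace = rng.choice(M, size=subspace_size, replace=False, p=probs)
print(sorted(subspace))
```

In a full implementation this weighted draw would replace the uniform draw inside each tree's node-splitting step, which is where the standard random forest loses informative features in very high dimensions.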


2018 ◽  
Author(s):  
Marion Pfeifer ◽  
Michael JW Boyle ◽  
Stuart Dunning ◽  
Pieter Olivier

Tropical landscapes are changing rapidly due to changes in land use and land management. Predicting and monitoring the impacts of land use change on species, for conservation or food security purposes, requires habitat quality metrics that are consistent, can be mapped using above-ground sensor data, and are relevant for species performance. Here, we focus on ground surface temperature (Thermal_ground) and ground vegetation greenness (NDVI_down) as potentially suitable metrics of habitat quality. We measure both across habitats differing in tree cover (natural grassland to forest edges to forests and tree plantations) in the human-modified coastal forested landscapes of KwaZulu-Natal, South Africa. We show that both habitat quality metrics decline linearly as a function of increasing canopy closure (FCover, %) and canopy leaf area index (LAI). Opening canopies by about 20% or reducing canopy leaf area by 1% would result in ground temperatures increasing by more than 1°C and ground vegetation greenness increasing by 0.2 and 0.14, respectively. Upscaling LAI and FCover from Landsat imagery using random forest models allowed us to map Thermal_ground and NDVI_down via the linear relationships. However, map accuracy was constrained by the predictive capacity of the random forest models predicting canopy attributes and of the linear models linking canopy attributes to the habitat quality metrics. Accounting for micro-scale variation in temperature is essential for improving biodiversity impact predictions. Our upscaling approach suggests that mapping ground surface temperature based on radiation and vegetation properties may be possible, and that canopy cover maps could provide a useful tool for mapping habitat quality metrics that matter to species. However, we need to increase the spatial and temporal sampling of surface temperature to improve and validate the upscaled models.
We also need to link surface temperature maps to demographic traits of species of different threat status or function, in landscapes with different disturbance and management histories, to test for generality in these relationships. The derived understanding could then be exploited for targeted landscape restoration that benefits biodiversity conservation and food security sustainably at the landscape scale.
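The two-stage upscaling described above can be sketched as follows: a random forest predicts a canopy attribute (here LAI) from imagery bands, and a separately fitted linear relation converts predicted LAI into ground surface temperature. All data, band counts, and coefficients below are invented for illustration, not the study's values.

```python
# Hedged sketch of the upscaling chain: bands -> RF -> LAI -> linear -> temp.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(3)
n = 400
bands = rng.uniform(size=(n, 6))                  # stand-in Landsat bands
lai = 3.0 * bands[:, 0] + rng.normal(0, 0.1, n)   # synthetic canopy LAI
temp_ground = 25.0 - 1.2 * lai + rng.normal(0, 0.2, n)  # synthetic field data

# Stage 1: upscale LAI from imagery with a random forest.
rf = RandomForestRegressor(n_estimators=100, random_state=0).fit(bands, lai)
lai_pred = rf.predict(bands)

# Stage 2: linear relation (fitted on field data) maps LAI to temperature.
lin = LinearRegression().fit(lai.reshape(-1, 1), temp_ground)
temp_map = lin.predict(lai_pred.reshape(-1, 1))
print(lin.coef_[0])
```

Note how the map's error compounds exactly as the abstract says: any bias in `lai_pred` from stage 1 propagates through the stage-2 linear relation.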


2020 ◽  
Author(s):  
Cameron Brown ◽  
Diego Maldonado ◽  
Antony Vassileiou ◽  
Blair Johnston ◽  
Alastair Florence

Population balance modelling is a valuable tool that facilitates the optimization and understanding of crystallization processes. However, using this tool requires prior knowledge of the crystallization kinetics, specifically crystal growth and nucleation. The majority of approaches for obtaining proper estimates of kinetic parameters require experimental data. Over time, a vast literature on the estimation of kinetic parameters and population balances has been published. Given the availability of such data, this work built a database with information on solute, solvent, kinetic expression, parameters, crystallization method, and seeding. Correlations were assessed and cluster structures identified by hierarchical clustering analysis. The final database contains 336 kinetic parameter records from 185 different sources. The data were analysed using the kinetic parameters of the most common expressions, and clusters were then identified for each kinetic model. With these clusters, classification random forest models were built using solute descriptors, seeding, solvent, and crystallization method as classifiers. The random forest models had an overall classification accuracy higher than 70%, making them useful for providing rough estimates of kinetic parameters, although the method has some limitations.
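The pipeline above (hierarchical clustering of kinetic parameters, then a random forest that assigns new systems to a cluster from their descriptors) can be sketched on invented data. The descriptor columns here are hypothetical stand-ins, not the database's actual fields.

```python
# Illustrative sketch (synthetic data): Ward hierarchical clustering of
# kinetic-parameter records, then random forest classification of the
# resulting cluster labels from system descriptors.
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(4)
# Two synthetic groups of (log-scale) kinetic parameters.
params = np.vstack([rng.normal(0, 0.3, (40, 2)),
                    rng.normal(3, 0.3, (40, 2))])
labels = fcluster(linkage(params, method="ward"), t=2, criterion="maxclust")

# Descriptors (e.g. a solute property, a seeded flag) correlated with groups.
descriptors = np.column_stack([params[:, 0] + rng.normal(0, 0.2, 80),
                               rng.integers(0, 2, 80)])
clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(descriptors, labels)
print(clf.score(descriptors, labels))
```

Predicting a cluster rather than an exact parameter value is what makes the output a "rough estimate": the assigned cluster narrows the plausible kinetic-parameter range without pinpointing it.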


2019 ◽  
Author(s):  
Karen-Inge Karstoft ◽  
Ioannis Tsamardinos ◽  
Kasper Eskelund ◽  
Søren Bo Andersen ◽  
Lars Ravnborg Nissen

BACKGROUND Posttraumatic stress disorder (PTSD) is a relatively common consequence of deployment to war zones. Early postdeployment screening with the aim of identifying those at risk for PTSD in the years following deployment would help deliver interventions to those in need, but such screening has so far proved unsuccessful. OBJECTIVE This study aimed to test the applicability of automated model selection and the ability of automated machine learning prediction models to transfer across cohorts and predict screening-level PTSD 2.5 years and 6.5 years after deployment. METHODS Automated machine learning was applied to data routinely collected 6-8 months after return from deployment from 3 different cohorts of Danish soldiers deployed to Afghanistan in 2009 (cohort 1, N=287 or N=261 depending on the timing of the outcome assessment), 2010 (cohort 2, N=352), and 2013 (cohort 3, N=232). RESULTS Models transferred well between cohorts. For screening-level PTSD 2.5 and 6.5 years after deployment, random forest models provided the highest accuracy as measured by area under the receiver operating characteristic curve (AUC): 2.5 years, AUC=0.77, 95% CI 0.71-0.83; 6.5 years, AUC=0.78, 95% CI 0.73-0.83. Linear models performed equally well. Military rank, hyperarousal symptoms, and total level of PTSD symptoms were highly predictive. CONCLUSIONS Automated machine learning provided validated models that can be readily implemented in future deployment cohorts in the Danish Defense with the aim of targeting postdeployment support interventions to those at highest risk for developing PTSD, provided the cohorts are deployed on similar missions.
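The cross-cohort transfer check at the heart of the study can be sketched as fitting on one cohort and scoring AUC on another. The data below are synthetic stand-ins; only the cohort sizes and the idea of predictors such as military rank and hyperarousal symptoms come from the abstract.

```python
# Sketch of cross-cohort transfer (synthetic data): train a random forest
# on "cohort 1" and evaluate ROC AUC on "cohort 2".
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(5)

def make_cohort(n, gen):
    # Stand-ins for predictors such as rank, hyperarousal, total PTSD score.
    X = gen.normal(size=(n, 5))
    y = (X[:, 0] + X[:, 1] + gen.normal(0, 1, n) > 0).astype(int)
    return X, y

X1, y1 = make_cohort(287, rng)   # "cohort 1"
X2, y2 = make_cohort(352, rng)   # "cohort 2"

model = RandomForestClassifier(n_estimators=200, random_state=0).fit(X1, y1)
auc = roc_auc_score(y2, model.predict_proba(X2)[:, 1])
print(round(auc, 2))
```

Transfer holds here because both synthetic cohorts share one data-generating process, which mirrors the abstract's caveat that the models generalize "provided the cohorts are deployed on similar missions."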


2021 ◽  
Vol ahead-of-print (ahead-of-print) ◽  
Author(s):  
Anne-Maria Holma ◽  
Anu Bask ◽  
Antti Laakso ◽  
Dan Andersson

Purpose This paper aims to develop a framework for switching a service supplier in a supply network. Design/methodology/approach The study builds on existing literature in the fields of purchasing and supply management, public procurement (PP) and the Industrial Marketing and Purchasing approach, as well as on an illustrative case, from the PP context, of a supplier switch in a service delivery process. Findings During a switching process, the buyer must simultaneously manage the ending of a relationship with the incumbent supplier and the beginning of a relationship with a new supplier. Collaboration with the focal suppliers to develop a service process with standardized components prevents disruptions in the service processes and reduces the impact of the switch on the wider network. Research limitations/implications The conceptualization suggested in this paper needs to be further explored in different empirical contexts to assess its practical adequacy. Practical implications Practitioners responsible for service procurement can use the findings to develop collaboration with suppliers, both in service process development and in the switching process. Furthermore, the authors highlight the importance of ending competencies and the development of an exit plan to conduct a "beautiful exit." Originality/value The paper integrates relationship initiation and ending studies, as well as procurement process models, to develop a refined switching process framework. Many PPs rely on short-term relationships due to the legal obligation to frequently invite suppliers to tender; thus, understanding the supplier switching process is important for both private and public sector actors.

