A Multinetwork and Machine Learning Examination of Structure and Content in the United States Code

2021
Vol 8
Author(s):  
Keith Carlson ◽  
Faraz Dadgostari ◽  
Michael A. Livermore ◽  
Daniel N. Rockmore

This paper introduces a novel linked structure-content representation of federal statutory law in the United States and analyzes and quantifies its structure using tools and concepts drawn from network analysis and complexity studies. The organizational component of our representation is based on the explicit hierarchical organization within the United States Code (USC) as well as an embedded cross-reference citation network. We couple this structure with a layer of content-based similarity derived from the application of a “topic model” to the USC. The resulting representation is the first that explicitly models the USC as a “multinetwork” or “multilayered network” incorporating hierarchical structure, cross-references, and content. We report several novel descriptive statistics of this multinetwork. These include the results of this first application of the machine learning technique of topic modeling to the USC, as well as multiple measures articulating the relationships between the organizational and content network layers. We find a high degree of assortativity of “titles” (the highest level of the hierarchy within the USC) with related topics. We also present a link prediction task and show that machine learning techniques are able to recover information about structure from content. Success in this prediction task has a natural interpretation as indicating a form of mutual information. We connect the relational findings between organization and content to a measure of “ease of search” in this large hyperlinked document, which has implications for the ways in which the structure of the USC supports (or doesn’t support) broad, useful access to the law. The measures developed in this paper have the potential to enable comparative work in the study of statutory networks that ranges across time and geography.
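As a rough illustration of the kind of linked structure-content representation described above, the sketch below builds a toy multilayer graph: a content layer derived from a topic model over a few placeholder section texts, and an organizational layer of hierarchy and cross-reference edges. The section texts, topic count, and similarity threshold are invented for the example and are not drawn from the paper.

```python
# Minimal sketch (not the authors' pipeline): build a content-similarity layer
# for a handful of placeholder "USC sections" with a topic model, alongside an
# organizational hierarchy/citation layer, using scikit-learn and networkx.
import networkx as nx
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Placeholder section texts; in the paper this would be the full USC corpus.
sections = {
    "26 USC 61":   "gross income defined compensation for services interest rents",
    "26 USC 151":  "allowance of deductions for personal exemptions taxable income",
    "42 USC 7412": "hazardous air pollutants emission standards source categories",
    "42 USC 7521": "motor vehicle emission standards administrator regulations",
}
ids, texts = list(sections), list(sections.values())

# Content layer: topic model, then cosine similarity between topic mixtures.
counts = CountVectorizer().fit_transform(texts)
topics = LatentDirichletAllocation(n_components=2, random_state=0).fit_transform(counts)
sim = cosine_similarity(topics)

content_layer = nx.Graph()
for i in range(len(ids)):
    for j in range(i + 1, len(ids)):
        if sim[i, j] > 0.5:  # arbitrary threshold for the sketch
            content_layer.add_edge(ids[i], ids[j], weight=float(sim[i, j]))

# Organizational layer: hierarchy edges (title -> section) plus a toy cross-reference.
org_layer = nx.DiGraph()
org_layer.add_edges_from([("Title 26", "26 USC 61"), ("Title 26", "26 USC 151"),
                          ("Title 42", "42 USC 7412"), ("Title 42", "42 USC 7521"),
                          ("42 USC 7521", "42 USC 7412")])

print("content edges:", content_layer.edges(data=True))
print("organizational edges:", list(org_layer.edges()))
```

A link prediction task of the kind described in the abstract would then ask how well edges in the organizational layer can be recovered from similarities in the content layer.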

Author(s):  
Scott Wark ◽  
Thao Phan

Between 2016 and 2020, Facebook allowed advertisers in the United States to target their advertisements using three broad “ethnic affinity” categories: “African American,” “U.S.-Hispanic,” and “Asian American.” This paper uses the life and death of these “ethnic affinity” categories to argue that they exemplify a novel mode of racialisation made possible by machine learning techniques. These categories worked by analysing users’ preferences and behaviour: they were supposed to capture an “affinity” for a broad demographic group, rather than registering membership of that group. That is, they were supposed to allow advertisers to “personalise” content for users depending on behaviourally determined affinities. We argue that, in effect, Facebook’s ethnic affinity categories were supposed to operationalise a “post-racial” mode of categorising users. But the paradox of personalisation is that in order to apprehend users as individuals, platforms must first assemble them into groups based on their likenesses with other individuals. This article uses an analysis of these categories to argue that even in the absence of data on a user’s race—even after the demise of the categories themselves—users can still be subject to techniques of inclusion or exclusion for discriminatory ends. The inductive machine learning techniques that platforms like Facebook employ to classify users generate “proxies,” like racialised preferences or language use, as racialising substitutes. This article concludes by arguing that Facebook’s ethnic affinity categories in fact typify novel modes of racialisation today.


2019
Author(s):  
Sing-Chun Wang ◽  
Yuxuan Wang

Abstract. Occurrences of devastating wildfires have been on the rise in the United States over the past decades. While the environmental controls, including weather, climate, and fuels, are known to play important roles in controlling wildfires, the interrelationships between fires and the environmental controls are highly complex and may not be well represented by traditional parametric regressions. Here we develop a model integrating multiple machine learning algorithms to predict gridded monthly wildfire burned area during 2002–2015 over the South Central United States and identify the relative importance of the environmental drivers of the burned area for both the winter-spring and summer fire seasons of that region. The developed model is able to alleviate the issue of unevenly distributed burned area data and achieves cross-validation (CV) R2 values of 0.42 and 0.40 for the two fire seasons. For the total burned area over the study domain, the model can explain 50 % and 79 % of interannual total burned area for the winter-spring and summer fire seasons, respectively. The prediction model ranks relative humidity (RH) anomalies and preceding months’ drought severity as the top two most important predictors of gridded burned area for both fire seasons. Sensitivity experiments with the model show that the effect of climate change, represented by a group of climate-anomaly variables, contributes the most to the burned area for both fire seasons. Antecedent fuel amount and conditions are found to outweigh weather effects for the burned area in the winter-spring fire season, while current-month fire weather is more important for the summer fire season, likely due to the controlling effect of weather on fuel moisture in this season. The developed model allows us to predict gridded burned area and to assess specific fire management strategies for the different fire mechanisms in the two seasons.
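The following sketch illustrates the general workflow described above (cross-validated prediction of burned area and ranking of predictor importance) with a gradient-boosted regressor; the variables, their relationships, and the data are synthetic stand-ins, not the paper's model or dataset.

```python
# Minimal sketch of the general approach (not the authors' exact model): a
# gradient-boosted regressor for burned area with cross-validated R2 and
# feature importances; the data here are synthetic placeholders.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n = 500
X = np.column_stack([
    rng.normal(size=n),   # RH anomaly
    rng.normal(size=n),   # preceding-month drought severity
    rng.normal(size=n),   # temperature anomaly
    rng.normal(size=n),   # antecedent fuel amount
])
# Synthetic response loosely tied to the first two predictors.
y = 1.5 * X[:, 0] + 1.0 * X[:, 1] + 0.3 * rng.normal(size=n)

model = GradientBoostingRegressor(random_state=0)
cv_r2 = cross_val_score(model, X, y, cv=5, scoring="r2")
print("CV R2:", cv_r2.mean())

importances = model.fit(X, y).feature_importances_
for name, imp in zip(["RH anomaly", "drought", "temperature", "fuel"], importances):
    print(f"{name}: {imp:.2f}")
```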


Author(s):  
Momen R. Mousa ◽  
Saleh R. Mousa ◽  
Marwa Hassan ◽  
Paul Carlson ◽  
Ibrahim A. Elnaml

Waterborne paint is the most common marking material used throughout the United States. Because of budget constraints, most transportation agencies repaint their markings on a fixed schedule, which is questionable in terms of efficiency and economy. To overcome this problem, state agencies could evaluate marking performance by utilizing measured retroreflectivity of waterborne paints applied in the National Transportation Product Evaluation Program (NTPEP) or by using retroreflectivity degradation models developed in previous studies. Generally, both options lack accuracy because of the high dimensionality and multi-collinearity of retroreflectivity data. Therefore, the objective of this study was to employ an advanced machine learning algorithm to develop performance prediction models for waterborne paints considering the variables that are believed to affect their performance. To achieve this objective, a total of 17,952 skip and wheel retroreflectivity measurements were collected from 10 test decks included in the NTPEP. Based on these data, two CatBoost models were developed with an acceptable level of accuracy, which can predict the skip and wheel retroreflectivity of waterborne paints for up to 3 years using only the initial measured retroreflectivity and the anticipated project conditions over the intended prediction horizon, such as line color, traffic, and air temperature. These models could be used by transportation agencies throughout the United States to (1) compare different products and select the best product for a specific project, and (2) determine the expected service life of a specific product based on a specified threshold retroreflectivity in order to plan for future restriping activities.
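A minimal CatBoost sketch of the kind of model described above follows; the training rows, column names, and values are hypothetical placeholders, not the NTPEP data or the study's fitted models.

```python
# Minimal CatBoost sketch under assumed inputs (not the NTPEP dataset): predict
# retroreflectivity from initial retroreflectivity plus project conditions.
import pandas as pd
from catboost import CatBoostRegressor

# Hypothetical training rows: initial retroreflectivity, months in service,
# line color, traffic, mean air temperature -> measured retroreflectivity.
train = pd.DataFrame({
    "initial_retro": [320, 310, 295, 340, 305, 330],
    "months": [6, 12, 24, 6, 18, 36],
    "line_color": ["white", "yellow", "white", "yellow", "white", "yellow"],
    "traffic_aadt": [12000, 8000, 15000, 5000, 20000, 9000],
    "air_temp_c": [18, 22, 15, 25, 12, 20],
})
y = [260, 220, 160, 300, 180, 140]

model = CatBoostRegressor(iterations=200, depth=4, verbose=False, random_seed=0)
model.fit(train, y, cat_features=["line_color"])

# Predict retroreflectivity for a new project three years out.
new = pd.DataFrame({"initial_retro": [325], "months": [36], "line_color": ["white"],
                    "traffic_aadt": [10000], "air_temp_c": [19]})
print(model.predict(new))
```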


2021
Vol 14 (5)
pp. 472
Author(s):  
Tyler C. Beck ◽  
Kyle R. Beck ◽  
Jordan Morningstar ◽  
Menny M. Benjamin ◽  
Russell A. Norris

Roughly 2.8% of annual hospitalizations in the United States are a result of adverse drug interactions, representing more than 245,000 hospitalizations. Drug–drug interactions commonly arise from major cytochrome P450 (CYP) inhibition. Various approaches are routinely employed to reduce the incidence of adverse interactions, such as altering drug dosing schemes and/or minimizing the number of drugs prescribed; however, a reduction in the number of medications often cannot be achieved without impacting therapeutic outcomes. Nearly 80% of drugs fail in development due to pharmacokinetic issues, underscoring the importance of examining cytochrome interactions during preclinical drug design. In this review, we examined the physicochemical and structural properties of small molecule inhibitors of CYPs 3A4, 2D6, 2C19, 2C9, and 1A2. Although CYP inhibitors tend to have distinct physicochemical properties and structural features, these descriptors alone are insufficient to predict major cytochrome inhibition probability and affinity. Machine learning-based in silico approaches may be employed as a more robust and accurate way of predicting CYP inhibition. These various approaches are highlighted in the review.
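As a rough illustration of what a machine learning-based in silico predictor might look like, the sketch below trains a random-forest classifier on a few invented molecular descriptors; the descriptor values, labels, and feature set are assumptions made for the example, not data or models from the review.

```python
# Minimal sketch of a descriptor-based in silico CYP-inhibition classifier
# (illustrative only; descriptor values and labels below are invented).
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

# Columns: molecular weight, cLogP, topological polar surface area, aromatic rings.
X = np.array([
    [528.9, 4.3, 104.0, 3],
    [230.3, 1.2,  63.6, 1],
    [406.5, 3.8,  78.9, 2],
    [151.2, 0.9,  49.3, 1],
    [475.6, 5.1,  92.5, 3],
    [180.2, 1.5,  60.4, 1],
    [390.9, 3.2,  70.1, 2],
    [210.2, 0.4,  85.0, 1],
])
y = np.array([1, 0, 1, 0, 1, 0, 1, 0])  # 1 = CYP3A4 inhibitor (hypothetical labels)

clf = RandomForestClassifier(n_estimators=200, random_state=0)
print("CV accuracy:", cross_val_score(clf, X, y, cv=4).mean())
```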


2021
pp. 1-4
Author(s):  
Mathieu D'Aquin ◽  
Stefan Dietze

The 29th ACM International Conference on Information and Knowledge Management (CIKM) was held online from the 19th to the 23rd of October 2020. CIKM is an annual computer science conference focused on research at the intersection of information retrieval, machine learning, and databases, as well as semantic and knowledge-based technologies. Since it was first held in the United States in 1992, 28 conferences have been hosted in 9 countries around the world.


2021
Author(s):  
Satya Katragadda ◽  
Ravi Teja Bhupatiraju ◽  
Vijay Raghavan ◽  
Ziad Ashkar ◽  
Raju Gottumukkala

Abstract. Background: Travel patterns of humans play a major part in the spread of infectious diseases. This was evident in the geographical spread of COVID-19 in the United States. However, the impact of this mobility and the transmission of the virus due to local travel, compared to the population traveling across state boundaries, is unknown. This study evaluates the impact of local vs. visitor mobility in understanding the growth in the number of cases for infectious disease outbreaks. Methods: We use two different mobility metrics, namely the local risk and the visitor risk, extracted from trip data generated from anonymized mobile phone data across all 50 states in the United States. We analyzed the impact on infection spread of using local trips alone and of the infection risk potential generated from visitors' trips from various other states. We used the Diebold-Mariano test to compare across three machine learning models. Finally, we compared the performance of models including visitor mobility for all three waves in the United States and across all 50 states. Results: We observe that visitor mobility impacts case growth and that including visitor mobility in forecasting the number of COVID-19 cases improves prediction accuracy by 34. Using the Diebold-Mariano test, we found the performance improvement from including visitor mobility to be statistically significant. We also observe that the significance was much higher during the first peak (March to June 2020). Conclusion: With cases present everywhere (i.e., both local and visitor), visitor mobility (even within the country) is shown to have a significant impact on the growth in the number of cases. While it is not possible to account for other factors, such as the impact of interventions and differences between local and visitor mobility, we find that these observations can be used to plan both reopening and limits on visitors from regions with a high number of cases.
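The Diebold-Mariano test mentioned above compares the forecast losses of two competing models. A minimal sketch under squared-error loss is given below, with synthetic error series standing in for forecasts made with and without a visitor-mobility feature; none of the numbers correspond to the study's results.

```python
# Minimal sketch of a Diebold-Mariano comparison between two forecast error
# series (synthetic); tests whether one model's losses are significantly lower.
import numpy as np
from scipy import stats

def diebold_mariano(e1, e2, h=1):
    """DM statistic for squared-error loss; h is the forecast horizon."""
    d = e1**2 - e2**2                      # loss differential
    n = len(d)
    d_bar = d.mean()
    # Long-run variance with h-1 autocovariance lags.
    gamma = [np.cov(d[:n-k], d[k:])[0, 1] if k else d.var(ddof=0) for k in range(h)]
    var_d = (gamma[0] + 2 * sum(gamma[1:])) / n
    dm = d_bar / np.sqrt(var_d)
    p = 2 * (1 - stats.norm.cdf(abs(dm)))
    return dm, p

rng = np.random.default_rng(0)
errors_without_visitors = rng.normal(0, 1.3, size=200)  # larger errors
errors_with_visitors = rng.normal(0, 1.0, size=200)     # smaller errors
dm, p = diebold_mariano(errors_without_visitors, errors_with_visitors)
print(f"DM = {dm:.2f}, p = {p:.4f}")
```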


2021
Vol 11 (23)
pp. 11227
Author(s):  
Arnold Kamis ◽  
Yudan Ding ◽  
Zhenzhen Qu ◽  
Chenchen Zhang

The purpose of this paper is to model the cases of COVID-19 in the United States from 13 March 2020 to 31 May 2020. Our novel contribution is that we have obtained highly accurate models focused on two different regimes, lockdown and reopen, modeling each regime separately. The predictor variables include aggregated individual movement as well as state population density, health rank, climate temperature, and political color. We apply a variety of machine learning methods to each regime: Multiple Regression, Ridge Regression, Elastic Net Regression, Generalized Additive Model, Gradient Boosted Machine, Regression Tree, Neural Network, and Random Forest. We discover that Gradient Boosted Machines are the most accurate in both regimes. The best models achieve a variance explained of 95.2% in the lockdown regime and 99.2% in the reopen regime. We describe the influence of the predictor variables as they change from regime to regime. Notably, we identify individual person movement, as tracked by GPS data, to be an important predictor variable. We conclude that government lockdowns are an extremely important de-densification strategy. Implications and questions for future research are discussed.
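A compact sketch of the regime-split idea described above follows: fit one gradient-boosted model per regime and report variance explained on held-out data. The features, effect sizes, and sample sizes are synthetic stand-ins, not the study's data or fitted models.

```python
# Minimal sketch of per-regime modeling (synthetic data, not the paper's):
# a separate gradient-boosted model for the lockdown and reopen periods,
# each scored by variance explained (R2) on a held-out split.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(1)

def fit_regime(n, slope):
    # Features: movement index, population density, mean temperature.
    X = rng.normal(size=(n, 3))
    y = slope * X[:, 0] + 0.5 * X[:, 1] + 0.1 * rng.normal(size=n)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
    model = GradientBoostingRegressor(random_state=0).fit(X_tr, y_tr)
    return r2_score(y_te, model.predict(X_te))

print("lockdown R2:", fit_regime(400, slope=2.0))
print("reopen   R2:", fit_regime(400, slope=1.0))
```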


2017
Author(s):  
Joe Rexwinkle

Arthritis is one of the leading causes of disability in the United States and the second most expensive to treat according to the CDC. One of the key difficulties in diagnosing and treating arthritis, in particular osteoarthritis, is that the mechanisms for progression of the disease are poorly characterized. Mechanical engineer Joe Rexwinkle, working with Dr. Ferris Pfeiffer and the Thompson Lab for Regenerative Orthopaedics, aimed to shed some light on the links between cartilage biology and the degradation seen in osteoarthritis. The study began with obtaining cartilage samples from six patients undergoing total knee replacements and collecting information on several biomarkers with known relevance to osteoarthritis. Specifically, the concentrations of several proteins which may be determined in a standard hospital lab were analyzed. The samples were then tested to determine their mechanical properties, since the progression of osteoarthritis is always accompanied by the physical degradation of the tissue. Machine learning techniques, which are gaining increasing popularity in the field of orthopaedic research, were then used to model the relationships between these biomarkers and the mechanical state of the tissue. These models were found to be highly accurate in characterizing the mechanical state of the tissue, even when limited only to the protein concentrations that one could find in a standard hospital lab. This study has not yet produced a tool which may be used in a hospital setting, considering the low number of patients included in this study, but it does reveal promising early results in using machine learning to characterize osteoarthritis, a task which has thus far eluded the orthopaedic research community.
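As an illustration of how such a small-sample biomarker-to-mechanics model might be evaluated, the sketch below uses leave-one-out cross-validation with a random-forest regressor; the protein concentrations and stiffness values are invented for the example and do not come from the study.

```python
# Minimal sketch (invented values): with only a handful of cartilage samples,
# leave-one-out cross-validation is a natural way to evaluate a biomarker model.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import LeaveOneOut

# Rows: hypothetical protein concentrations for six cartilage samples.
X = np.array([
    [2.1, 0.8, 5.5],
    [1.4, 1.2, 4.0],
    [3.0, 0.5, 6.1],
    [2.6, 0.9, 5.0],
    [1.1, 1.5, 3.2],
    [2.9, 0.7, 6.4],
])
y = np.array([0.62, 0.41, 0.80, 0.66, 0.30, 0.85])  # e.g. a stiffness measure (MPa)

errors = []
for train_idx, test_idx in LeaveOneOut().split(X):
    model = RandomForestRegressor(n_estimators=100, random_state=0)
    model.fit(X[train_idx], y[train_idx])
    errors.append(abs(model.predict(X[test_idx])[0] - y[test_idx][0]))
print("mean absolute LOO error:", np.mean(errors))
```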

