Understanding Software-2.0

ACM Transactions on Software Engineering and Methodology ◽

10.1145/3453478 ◽

2021 ◽

Vol 30 (4) ◽

pp. 1-42

Author(s):

Malinda Dilhara ◽

Ameya Ketkar ◽

Danny Dig

Keyword(s):

Machine Learning ◽

Longitudinal Study ◽

Empirical Study ◽

Large Scale ◽

Research Directions ◽

Common Practices ◽

Increasing Trend ◽

Usage Patterns ◽

New Research ◽

Shed Light

Enabled by a rich ecosystem of Machine Learning (ML) libraries, programming using learned models, i.e., Software-2.0, has gained substantial adoption. However, we do not know what challenges developers encounter when they use ML libraries. With this knowledge gap, researchers miss opportunities to contribute to new research directions, tool builders do not invest resources where automation is most needed, library designers cannot make informed decisions when releasing ML library versions, and developers fail to use common practices when using ML libraries. We present the first large-scale quantitative and qualitative empirical study to shed light on how developers in Software-2.0 use ML libraries, and how this evolution affects their code. Particularly, using static analysis we perform a longitudinal study of 3,340 top-rated open-source projects with 46,110 contributors. To further understand the challenges of ML library evolution, we survey 109 developers who introduce and evolve ML libraries. Using this rich dataset we reveal several novel findings. Among others, we found an increasing trend of using ML libraries: The ratio of new Python projects that use ML libraries increased from 2% in 2013 to 50% in 2018. We identify several usage patterns including the following: (i) 36% of the projects use multiple ML libraries to implement various stages of the ML workflows, (ii) developers update ML libraries more often than the traditional libraries, (iii) strict upgrades are the most popular for ML libraries among other update kinds, (iv) ML library updates often result in cascading library updates, and (v) ML libraries are often downgraded (22.04% of cases). We also observed unique challenges when evolving and maintaining Software-2.0 such as (i) binary incompatibility of trained ML models and (ii) benchmarking ML models. Finally, we present actionable implications of our findings for researchers, tool builders, developers, educators, library vendors, and hardware vendors.

Connecting Histopathology Imaging and Proteomics in Kidney Cancer through Machine Learning

Journal of Clinical Medicine ◽

10.3390/jcm8101535 ◽

2019 ◽

Vol 8 (10) ◽

pp. 1535 ◽

Cited By ~ 7

Author(s):

Francisco Azuaje ◽

Sang-Yoon Kim ◽

Daniel Perez Hernandez ◽

Gunnar Dittmar

Keyword(s):

Machine Learning ◽

Large Scale ◽

Diagnostic Value ◽

Classification Model ◽

Clinical Approach ◽

Proteomics Data ◽

Cell Renal Cell Carcinoma ◽

Molecular Features ◽

Genes Encoding ◽

New Research

Proteomics data encode molecular features of diagnostic value and accurately reflect key underlying biological mechanisms in cancers. Histopathology imaging is a well-established clinical approach to cancer diagnosis. The predictive relationship between large-scale proteomics and H&E-stained histopathology images remains largely uncharacterized. Here we investigate such associations through the application of machine learning, including deep neural networks, to proteomics and histology imaging datasets generated by the Clinical Proteomic Tumor Analysis Consortium (CPTAC) from clear cell renal cell carcinoma patients. We report robust correlations between a set of diagnostic proteins and predictions generated by an imaging-based classification model. Proteins significantly correlated with the histology-based predictions are significantly implicated in immune responses, extracellular matrix reorganization, and metabolism. Moreover, we showed that the genes encoding these proteins also reliably recapitulate the biological associations with imaging-derived predictions based on strong gene–protein expression correlations. Our findings offer novel insights into the integrative modeling of histology and omics data through machine learning, as well as the methodological basis for new research opportunities in this and other cancer types.

Principles of Sampling in Educational Research in Higher Education

SOCIETY INTEGRATION EDUCATION Proceedings of the International Scientific Conference ◽

10.17770/sie2015vol1.310 ◽

2015 ◽

Vol 1 ◽

pp. 25

Author(s):

Andreas Ahrens ◽

Jelena Zascerinska

Keyword(s):

Higher Education ◽

Empirical Study ◽

Educational Research ◽

Scientific Literature ◽

Research Question ◽

Interpretive Research ◽

Research Paradigm ◽

Research Directions ◽

European Higher Education ◽

New Research

Innovation and creativity in European society are fostered via a dynamic and flexible European higher education based on the integration between education and research at all levels (Communiqué, 2009). The synergy between education and research is efficiently driven via educational research. Sampling as an element of educational research has a two-fold role: sample size is inter-connected with statistical analysis of the data and generalisation. Against this background, little attention has been given to principles of sampling in educational research. The research question is as follows: what principles form sampling in educational research? The aim of the research is to analyse scientific literature and work out principles of sampling in educational research underpinning elaboration of a new research question for further studies in educational research. The present research involves a process of analysing the meaning of the key concept “principle”. In the empirical study, explorative research was employed. The interpretive research paradigm was used. The empirical study involved six experts from different countries in February 2013 – July 2014. The findings of the research allow conclusions to be drawn on the elaborated principles of sampling in educational research. Directions of further research are proposed.

Advanced Materials Based on Nanosized Hydroxyapatite

Molecules ◽

10.3390/molecules26113190 ◽

2021 ◽

Vol 26 (11) ◽

pp. 3190

Author(s):

Ramón Rial ◽

Michael González-Durruthy ◽

Zhen Liu ◽

Juan M. Ruso

Keyword(s):

Machine Learning ◽

Advanced Materials ◽

Computational Techniques ◽

New Materials ◽

Research Directions ◽

Advantages And Disadvantages ◽

Technological Advances ◽

Nanosized Hydroxyapatite ◽

New Research ◽

The Impact

The development of new materials based on hydroxyapatite has undergone a great evolution in recent decades due to technological advances and the development of computational techniques. The focus of this review is the various attempts to improve new hydroxyapatite-based materials. First, we comment on the most used processing routes, highlighting their advantages and disadvantages. We then focus on other routes, less common due to their specificity and/or recent development. We also include a block dedicated to the impact of computational techniques in the development of these new systems, including QSAR, DFT, Finite Elements, or Machine Learning. In the following part we focus on the most innovative applications of these materials, ranging from medicine to new disciplines such as catalysis, environment, filtration, or energy. The review concludes with an outlook for possible new research directions.

Emergency Logistics in a Large-Scale Disaster Context: Achievements and Challenges

International Journal of Environmental Research and Public Health ◽

10.3390/ijerph16050779 ◽

2019 ◽

Vol 16 (5) ◽

pp. 779 ◽

Cited By ~ 6

Author(s):

Yiping Jiang ◽

Yufei Yuan

Keyword(s):

Emergency Response ◽

Operations Research ◽

Large Scale ◽

Future Research ◽

Research Directions ◽

Business Operations ◽

Emergency Logistics ◽

Key Characteristics ◽

Future Research Directions ◽

New Research

There is growing research interest in emergency logistics within the operations research (OR) community. Unlike normal business operations, emergency response for large-scale disasters is very complex and poses many challenges. Research on emergency logistics is still in its infancy. Understanding the challenges and new research directions is very important. In this paper, we present a literature review of emergency logistics in the context of large-scale disasters. The main contributions of our study include three aspects: First, we identify key characteristics of large-scale disasters and assess their challenges to emergency logistics. Second, we analyze and summarize the current literature on how to deal with these challenges. Finally, we discuss existing gaps in the relevant research and suggest future research directions.

Identifying Semitic Roots: Machine Learning with Linguistic Constraints

Computational Linguistics ◽

10.1162/coli.2008.07-002-r1-06-30 ◽

2008 ◽

Vol 34 (3) ◽

pp. 429-448 ◽

Cited By ~ 5

Author(s):

Ezra Daya ◽

Dan Roth ◽

Shuly Wintner

Keyword(s):

Machine Learning ◽

Large Scale ◽

Morphological Analysis ◽

Full Scale ◽

Linguistic Knowledge ◽

Learning Approach ◽

Semitic Languages ◽

Practical Applications ◽

Machine Learning Approach ◽

Shed Light

Words in Semitic languages are formed by combining two morphemes: a root and a pattern. The root consists of consonants only, by default three, and the pattern is a combination of vowels and consonants, with non-consecutive “slots” into which the root consonants are inserted. Identifying the root of a given word is an important task, considered to be an essential part of the morphological analysis of Semitic languages, and information on roots is important for linguistics research as well as for practical applications. We present a machine learning approach, augmented by limited linguistic knowledge, to the problem of identifying the roots of Semitic words. Although programs exist which can extract the root of words in Arabic and Hebrew, they are all dependent on labor-intensive construction of large-scale lexicons which are components of full-scale morphological analyzers. The advantage of our method is an automation of this process, avoiding the bottleneck of having to laboriously list the root and pattern of each lexeme in the language. To the best of our knowledge, this is the first application of machine learning to this problem, and one of the few attempts to directly address non-concatenative morphology using machine learning. More generally, our results shed light on the problem of combining classifiers under (linguistically motivated) constraints.
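
The root-and-pattern structure described in this abstract can be illustrated with a small sketch. It is not the machine learning approach of the paper: it only shows how root consonants fill the non-consecutive slots of a pattern, and how a root can be read back when the pattern is already known. The function names, the digit-based slot notation, and the single-character Latin transliteration are all assumptions made for illustration.

```python
from typing import Optional, Tuple


def interdigitate(root: Tuple[str, ...], pattern: str) -> str:
    """Fill the numbered slots of a pattern with root consonants.

    Hypothetical notation: the digit '1' marks the slot for the first
    root consonant, '2' the second, and so on. Every other character
    belongs to the pattern and is copied unchanged.
    """
    return "".join(
        root[int(ch) - 1] if ch.isdigit() else ch
        for ch in pattern
    )


def extract_root(word: str, pattern: str) -> Optional[Tuple[str, ...]]:
    """Read the root back from a word, given a known pattern.

    Returns None when the word does not fit the pattern. Assumes each
    consonant is transliterated as exactly one character.
    """
    if len(word) != len(pattern):
        return None
    slots = {}
    for w, p in zip(word, pattern):
        if p.isdigit():
            slots[int(p)] = w      # slot position: record the consonant
        elif w != p:
            return None            # pattern material does not match
    return tuple(slots[i] for i in sorted(slots))


if __name__ == "__main__":
    root = ("k", "t", "b")
    print(interdigitate(root, "1a2a3"))      # katab
    print(interdigitate(root, "ma12u3"))     # maktub
    print(extract_root("maktub", "ma12u3"))  # ('k', 't', 'b')
```

In practice the pattern of a word is not known in advance and several patterns may fit the same surface form, which is the ambiguity that the learning approach described in the abstract is meant to resolve.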

Machine Learning: Algorithms, Real-World Applications and Research Directions

10.20944/preprints202103.0216.v1 ◽

2021 ◽

Author(s):

Iqbal H. Sarker

Keyword(s):

Machine Learning ◽

Real World ◽

Large Scale ◽

Smart Cities ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Machine Learning Techniques ◽

Research Directions ◽

Digital World ◽

Real World Applications

In the current age of the Fourth Industrial Revolution (4IR or Industry 4.0), the digital world has a wealth of data, such as Internet of Things (IoT) data, cybersecurity data, mobile data, business data, social media data, health data, etc. To intelligently analyze these data and develop the corresponding real-world applications, knowledge of artificial intelligence (AI), particularly machine learning (ML), is the key. Various types of machine learning algorithms exist in the area, such as supervised, unsupervised, semi-supervised, and reinforcement learning. In addition, deep learning, which is part of a broader family of machine learning methods, can intelligently analyze data on a large scale. In this paper, we present a comprehensive view of these machine learning algorithms that can be applied to enhance the intelligence and the capabilities of an application. Thus, this study's key contribution is explaining the principles of different machine learning techniques and their applicability in various real-world application areas, such as cybersecurity, smart cities, healthcare, business, agriculture, and many more. We also highlight the challenges and potential research directions based on our study. Overall, this paper aims to serve as a reference point not only for application developers but also for decision-makers and researchers in various real-world application areas, particularly from the technical point of view.

Connecting Histopathology Imaging and Proteomics in Kidney Cancer through Machine Learning

10.1101/756288 ◽

2019 ◽

Author(s):

Francisco Azuaje ◽

Sang-Yoon Kim ◽

Daniel Perez Hernandez ◽

Gunnar Dittmar

Keyword(s):

Machine Learning ◽

Large Scale ◽

Diagnostic Value ◽

Classification Model ◽

Clinical Approach ◽

Proteomics Data ◽

Cell Renal Cell Carcinoma ◽

Molecular Features ◽

Genes Encoding ◽

New Research

Proteomics data encode molecular features of diagnostic value and accurately reflect key underlying biological mechanisms in cancers. Histopathology imaging is a well-established clinical approach to cancer diagnosis. The predictive relationship between large-scale proteomics and H&E-stained histopathology images remains largely uncharacterized. Here we investigate such associations through the application of machine learning, including deep neural networks, to proteomics and histology imaging datasets generated by the Clinical Proteomic Tumor Analysis Consortium (CPTAC) from clear cell renal cell carcinoma patients. We report robust correlations between a set of diagnostic proteins and predictions generated by an imaging-based classification model. Proteins significantly correlated with the histology-based predictions are significantly implicated in immune responses, extracellular matrix reorganization and metabolism. Moreover, we showed that the genes encoding these proteins also reliably recapitulate the biological associations with imaging-derived predictions based on strong gene-protein expression correlations. Our findings offer novel insights into the integrative modeling of histology and omics data through machine learning, as well as the methodological basis for new research opportunities in this and other cancer types.

Goal setting, self-efficacy and performance: New research directions

PsycEXTRA Dataset ◽

10.1037/e518422013-422 ◽

2009 ◽

Author(s):

Remus Ilies ◽

Nikos Dimotakis ◽

Edwin A. Locke

Keyword(s):

Goal Setting ◽

Self Efficacy ◽

Research Directions ◽

And Performance ◽

New Research

Occupational health disparities among racial and ethnic minorities: Current findings and new research directions

PsycEXTRA Dataset ◽

10.1037/e577572014-101 ◽

2013 ◽

Author(s):

Donald E. Eggerth ◽

Michael A. Flynn ◽

Frederick Leong ◽

Rashaun Roberts

Keyword(s):

Health Disparities ◽

Occupational Health ◽

Ethnic Minorities ◽

Research Directions ◽

Racial And Ethnic Minorities ◽

New Research

INNOVATION CAPABILITY, ABSORPTIVE CAPACITY AND SMES PERFORMANCE IN PAKISTAN: THE MODERATING EFFECT OF BUSINESS STRATEGY

Journal of Technology and Operations Management ◽

10.32890/jtom2018.13.2.1 ◽

2018 ◽

Vol 13 (2) ◽

pp. 1-11

Author(s):

Muhammad Zulqarnain Arshad ◽

Darwina Arshad

Keyword(s):

Economic Growth ◽

Absorptive Capacity ◽

Business Strategy ◽

Innovation Capability ◽

Research Directions ◽

Conceptual Paper ◽

And Performance ◽

New Research ◽

The Relationship ◽

Crucial Part

Small and medium-sized enterprises (SMEs) play a crucial part in a country’s economic growth and are a key contributor to a country’s GDP. In Pakistan, SMEs account for about 90 percent of all businesses. The performance of SMEs depends upon many factors. The main aim of the research is to examine the relationship between Innovation Capability, Absorptive Capacity and Performance of SMEs in Pakistan. This conceptual paper also addresses the unclear role of Business Strategy, which acts as a moderator between Innovation Capability, Absorptive Capacity and SMEs Performance. In conclusion, this study proposes new research directions and hypothesis development to examine the relationship among the variables in the context of Pakistan’s SMEs.
