Data‐driven approach to application programming interface documentation mining: A review

Author(s):  
Di Wu ◽  
Xiao‐Yuan Jing ◽  
Hongyu Zhang ◽  
Xiaohui Kong ◽  
Yu Xie ◽  
...  
Author(s):  
Raul Sierra-Alcocer ◽  
Christopher Stephens ◽  
Juan Barrios ◽  
Constantino González‐Salazar ◽  
Juan Carlos Salazar Carrillo ◽  
...  

SPECIES (Stephens et al. 2019) is a tool to explore spatial correlations in biodiversity occurrence databases. The main idea behind the SPECIES project is that the geographical correlations between the distributions of taxa records have useful information. The problem, however, is that if we have thousands of species (Mexico's National System of Biodiversity Information has records of around 70,000 species) then we have millions of potential associations, and exploring them is far from easy. Our goal with SPECIES is to facilitate the discovery and application of meaningful relations hiding in our data. The main variables in SPECIES are the geographical distributions of species occurrence records. Other types of variables, like the climatic variables from WorldClim (Hijmans et al. 2005), are explanatory data that serve for modeling. The system offers two modes of analysis. In one, the user defines a target species, and a selection of species and abiotic variables; then the system computes the spatial correlations between the target species and each of the other species and abiotic variables. The request from the user can be as small as comparing one species to another, or as large as comparing one species to all the species in the database. A user may wonder, for example, which species are usual neighbors of the jaguar, this mode could help answer this question. The second mode of analysis gives a network perspective, in it, the user defines two groups of taxa (and/or environmental variables), the output in this case is a correlation network where the weight of a link between two nodes represents the spatial correlation between the variables that the nodes represent. For example, one group of taxa could be hummingbirds (Trochilidae family) and the second flowers of the Lamiaceae family. This output would help the user analyze which pairs of hummingbird and flower are highly correlated in the database. SPECIES data architecture is optimized to support fast hypotheses prototyping and testing with the analysis of thousands of biotic and abiotic variables. It has a visualization web interface that presents descriptive results to the user at different levels of detail. The methodology in SPECIES is relatively simple, it partitions the geographical space with a regular grid and treats a species occurrence distribution as a present/not present boolean variable over the cells. Given two species (or one species and one abiotic variable) it measures if the number of co-occurrences between the two is more (or less) than expected. If it is more than expected indicates a signal of a positive relation, whereas if it is less it would be evidence of disjoint distributions. SPECIES provides an open web application programming interface (API) to request the computation of correlations and statistical dependencies between variables in the database. Users can create applications that consume this 'statistical web service' or use it directly to further analyze the results in frameworks like R or Python. The project includes an interactive web application that does exactly that: requests analysis from the web service and lets the user experiment and visually explore the results. We believe this approach can be used on one side to augment the services provided from data repositories; and on the other side, facilitate the creation of specialized applications that are clients of these services. This scheme supports big-data-driven research for a wide range of backgrounds because end users do not need to have the technical know-how nor the infrastructure to handle large databases. Currently, SPECIES hosts: all records from Mexico's National Biodiversity Information System (CONABIO 2018) and a subset of Global Biodiversity Information Facility data that covers the contiguous USA (GBIF.org 2018b) and Colombia (GBIF.org 2018a). It also includes discretizations of environmental variables from WorldClim, from the Environmental Rasters for Ecological Modeling project (Title and Bemmels 2018), from CliMond (Kriticos et al. 2012), and topographic variables (USGS EROS Center 1997b, USGS EROS Center 1997a). The long term plan, however, is to incrementally include more data, specially all data from the Global Biodiversity Information Facility. The code of the project is open source, and the repositories are available online (Front-end, Web Services Application Programming Interface, Database Building scripts). This presentation is a demonstration of SPECIES' functionality and its overall design.


2019 ◽  
Vol 9 (2) ◽  
pp. 239 ◽  
Author(s):  
Bruce Ndibanje ◽  
Ki Kim ◽  
Young Kang ◽  
Hyun Kim ◽  
Tae Kim ◽  
...  

Data-driven public security networking and computer systems are always under threat from malicious codes known as malware; therefore, a large amount of research and development is taking place to find effective countermeasures. These countermeasures are mainly based on dynamic and statistical analysis. Because of the obfuscation techniques used by the malware authors, security researchers and the anti-virus industry are facing a colossal issue regarding the extraction of hidden payloads within packed executable extraction. Based on this understanding, we first propose a method to de-obfuscate and unpack the malware samples. Additional, cross-method-based big data analysis to dynamically and statistically extract features from malware has been proposed. The Application Programming Interface (API) call sequences that reflect the malware behavior of its code have been used to detect behavior such as network traffic, modifying a file, writing to stderr or stdout, modifying a registry value, creating a process. Furthermore, we include a similarity analysis and machine learning algorithms to profile and classify malware behaviors. The experimental results of the proposed method show that malware detection accuracy is very useful to discover potential threats and can help the decision-maker to deploy appropriate countermeasures.


2018 ◽  
Vol 9 (1) ◽  
pp. 24-31
Author(s):  
Rudianto Rudianto ◽  
Eko Budi Setiawan

Availability the Application Programming Interface (API) for third-party applications on Android devices provides an opportunity to monitor Android devices with each other. This is used to create an application that can facilitate parents in child supervision through Android devices owned. In this study, some features added to the classification of image content on Android devices related to negative content. In this case, researchers using Clarifai API. The result of this research is to produce a system which has feature, give a report of image file contained in target smartphone and can do deletion on the image file, receive browser history report and can directly visit in the application, receive a report of child location and can be directly contacted via this application. This application works well on the Android Lollipop (API Level 22). Index Terms— Application Programming Interface(API), Monitoring, Negative Content, Children, Parent.


Robotica ◽  
2021 ◽  
pp. 1-31
Author(s):  
Andrew Spielberg ◽  
Tao Du ◽  
Yuanming Hu ◽  
Daniela Rus ◽  
Wojciech Matusik

Abstract We present extensions to ChainQueen, an open source, fully differentiable material point method simulator for soft robotics. Previous work established ChainQueen as a powerful tool for inference, control, and co-design for soft robotics. We detail enhancements to ChainQueen, allowing for more efficient simulation and optimization and expressive co-optimization over material properties and geometric parameters. We package our simulator extensions in an easy-to-use, modular application programming interface (API) with predefined observation models, controllers, actuators, optimizers, and geometric processing tools, making it simple to prototype complex experiments in 50 lines or fewer. We demonstrate the power of our simulator extensions in over nine simulated experiments.


2021 ◽  
Vol 40 (2) ◽  
pp. 55-58
Author(s):  
S. Tucker Taft

The OpenMP specification defines a set of compiler directives, library routines, and environment variables that together represent the OpenMP Application Programming Interface, and is currently defined for C, C++, and Fortran. The forthcoming version of Ada, currently dubbed Ada 202X, includes lightweight parallelism features, in particular parallel blocks and parallel loops. All versions of Ada, since its inception in 1983, have included "tasking," which corresponds to what are traditionally considered "heavyweight" parallelism features, or simply "concurrency" features. Ada "tasks" typically map to what are called "kernel threads," in that the operating system manages them and schedules them. However, one of the goals of lightweight parallelism is to reduce overhead by doing more of the management outside the kernel of the operating system, using a light-weight-thread (LWT) scheduler. The OpenMP library routines support both levels of threading, but for Ada 202X, the main interest is in making use of OpenMP for its lightweight thread scheduling capabilities.


Sign in / Sign up

Export Citation Format

Share Document