scholarly journals Large-Scale Long-Tailed Recognition in an Open World

Author(s):  
Ziwei Liu ◽  
Zhongqi Miao ◽  
Xiaohang Zhan ◽  
Jiayun Wang ◽  
Boqing Gong ◽  
...  
Keyword(s):  
Author(s):  
William H. Guss ◽  
Brandon Houghton ◽  
Nicholay Topin ◽  
Phillip Wang ◽  
Cayden Codel ◽  
...  

The sample inefficiency of standard deep reinforcement learning methods precludes their application to many real-world problems. Methods which leverage human demonstrations require fewer samples but have been researched less. As demonstrated in the computer vision and natural language processing communities, large-scale datasets have the capacity to facilitate research by serving as an experimental and benchmarking platform for new methods. However, existing datasets compatible with reinforcement learning simulators do not have sufficient scale, structure, and quality to enable the further development and evaluation of methods focused on using human examples. Therefore, we introduce a comprehensive, large-scale, simulator-paired dataset of human demonstrations: MineRL. The dataset consists of over 60 million automatically annotated state-action pairs across a variety of related tasks in Minecraft, a dynamic, 3D, open-world environment. We present a novel data collection scheme which allows for the ongoing introduction of new tasks and the gathering of complete state information suitable for a variety of methods. We demonstrate the hierarchality, diversity, and scale of the MineRL dataset. Further, we show the difficulty of the Minecraft domain along with the potential of MineRL in developing techniques to solve key research challenges within it.


Author(s):  
Ismail Ilkan Ceylan ◽  
Adnan Darwiche ◽  
Guy Van den Broeck

Large-scale probabilistic knowledge bases are becoming increasingly important in academia and industry alike. They are constantly extended with new data, powered by modern information extraction tools that associate probabilities with database tuples. In this paper, we revisit the semantics underlying such systems. In particular, the closed-world assumption of probabilistic databases, that facts not in the database have probability zero, clearly conflicts with their everyday use. To address this discrepancy, we propose an open-world probabilistic database semantics, which relaxes the probabilities of open facts to default intervals. For this open-world setting, we lift the existing data complexity dichotomy of probabilistic databases, and propose an efficient evaluation algorithm for unions of conjunctive queries. We also show that query evaluation can become harder for non-monotone queries.


2018 ◽  
Vol 2018 (1) ◽  
pp. 88-108 ◽  
Author(s):  
Anupam Das ◽  
Nikita Borisov ◽  
Edward Chou

Abstract The ability to track users’ activities across different websites and visits is a key tool in advertising and surveillance. The HTML5 DeviceMotion interface creates a new opportunity for such tracking via fingerprinting of smartphone motion sensors. We study the feasibility of carrying out such fingerprinting under real-world constraints and on a large scale. In particular, we collect measurements from several hundred users under realistic scenarios and show that the state-of-the-art techniques provide very low accuracy in these settings. We then improve fingerprinting accuracy by changing the classifier as well as incorporating auxiliary information. We also show how to perform fingerprinting in an open-world scenario where one must distinguish between known and previously unseen users. We next consider the problem of developing fingerprinting countermeasures; we evaluate the usability of a previously proposed obfuscation technique and a newly developed quantization technique via a large-scale user study. We find that both techniques are able to drastically reduce fingerprinting accuracy without significantly impacting the utility of the sensors in web applications.


2015 ◽  
Vol 30 (2) ◽  
pp. 140-156 ◽  
Author(s):  
Philip T. Moore ◽  
Hai V. Pham

AbstractThe concept of personalization in its many forms has gained traction driven by the demands of computer-mediated interactions generally implemented in large-scale distributed systems and ad hoc wireless networks. Personalization requires the identification and selection of entities based on a defined profile (a context); an entity has been defined as a person, place, or physical or computational object. Context employs contextual information that combines to describe an entities current state. Historically, the range of contextual information utilized (in context-aware systems) has been limited to identity, location, and proximate data; there has, however, been advances in the range of data and information addressed. As such, context can be highly dynamic with inherent complexity. In addition, context-aware systems must accommodate constraint satisfaction and preference compliance.This article addresses personalization and context with consideration of the domains and systems to which context has been applied and the nature of the contextual data. The developments in computing and service provision are addressed with consideration of the relationship between the evolving computing landscape and context. There is a discussion around rule strategies and conditional relationships in decision support. Logic systems are addressed with an overview of the open world assumption versus the closed world assumption and the relationship with the Semantic Web. The event-driven rule-based approach, which forms the basis upon which intelligent context processing can be realized, is presented with an evaluation and proof-of-concept. The issues and challenges identified in the research are considered with potential solutions and research directions; alternative approaches to context processing are discussed. The article closes with conclusions and open research questions.


Author(s):  
Tal Friedman ◽  
Guy Van den Broeck

Increasing amounts of available data have led to a heightened need for representing large-scale probabilistic knowledge bases. One approach is to use a probabilistic database, a model with strong assumptions that allow for efficiently answering many interesting queries. Recent work on open-world probabilistic databases strengthens the semantics of these probabilistic databases by discarding the assumption that any information not present in the data must be false. While intuitive, these semantics are not sufficiently precise to give reasonable answers to queries. We propose overcoming these issues by using constraints to restrict this open world. We provide an algorithm for one class of queries, and establish a basic hardness result for another. Finally, we propose an efficient and tight approximation for a large class of queries. 


1999 ◽  
Vol 173 ◽  
pp. 243-248
Author(s):  
D. Kubáček ◽  
A. Galád ◽  
A. Pravda

AbstractUnusual short-period comet 29P/Schwassmann-Wachmann 1 inspired many observers to explain its unpredictable outbursts. In this paper large scale structures and features from the inner part of the coma in time periods around outbursts are studied. CCD images were taken at Whipple Observatory, Mt. Hopkins, in 1989 and at Astronomical Observatory, Modra, from 1995 to 1998. Photographic plates of the comet were taken at Harvard College Observatory, Oak Ridge, from 1974 to 1982. The latter were digitized at first to apply the same techniques of image processing for optimizing the visibility of features in the coma during outbursts. Outbursts and coma structures show various shapes.


1994 ◽  
Vol 144 ◽  
pp. 29-33
Author(s):  
P. Ambrož

AbstractThe large-scale coronal structures observed during the sporadically visible solar eclipses were compared with the numerically extrapolated field-line structures of coronal magnetic field. A characteristic relationship between the observed structures of coronal plasma and the magnetic field line configurations was determined. The long-term evolution of large scale coronal structures inferred from photospheric magnetic observations in the course of 11- and 22-year solar cycles is described.Some known parameters, such as the source surface radius, or coronal rotation rate are discussed and actually interpreted. A relation between the large-scale photospheric magnetic field evolution and the coronal structure rearrangement is demonstrated.


2000 ◽  
Vol 179 ◽  
pp. 205-208
Author(s):  
Pavel Ambrož ◽  
Alfred Schroll

AbstractPrecise measurements of heliographic position of solar filaments were used for determination of the proper motion of solar filaments on the time-scale of days. The filaments have a tendency to make a shaking or waving of the external structure and to make a general movement of whole filament body, coinciding with the transport of the magnetic flux in the photosphere. The velocity scatter of individual measured points is about one order higher than the accuracy of measurements.


Author(s):  
Simon Thomas

Trends in the technology development of very large scale integrated circuits (VLSI) have been in the direction of higher density of components with smaller dimensions. The scaling down of device dimensions has been not only laterally but also in depth. Such efforts in miniaturization bring with them new developments in materials and processing. Successful implementation of these efforts is, to a large extent, dependent on the proper understanding of the material properties, process technologies and reliability issues, through adequate analytical studies. The analytical instrumentation technology has, fortunately, kept pace with the basic requirements of devices with lateral dimensions in the micron/ submicron range and depths of the order of nonometers. Often, newer analytical techniques have emerged or the more conventional techniques have been adapted to meet the more stringent requirements. As such, a variety of analytical techniques are available today to aid an analyst in the efforts of VLSI process evaluation. Generally such analytical efforts are divided into the characterization of materials, evaluation of processing steps and the analysis of failures.


Sign in / Sign up

Export Citation Format

Share Document