Distributed Data Collection through Remote Probing in Windows Environments

Author(s):  
P. Domingues ◽  
P. Marques ◽  
L. Silva


Author(s):  
Cristina G. Wilson ◽  
Feifei Qian ◽  
Douglas J. Jerolmack ◽  
Sonia Roberts ◽  
Jonathan Ham ◽  
...  

Abstract: How do scientists generate and weight candidate queries for hypothesis testing, and how does learning from observations or experimental data impact query selection? Field sciences offer a compelling context to ask these questions because query selection and adaptation involve consideration of the spatiotemporal arrangement of data, and therefore closely parallel classic search and foraging behavior. Here we conduct a novel simulated data foraging study—and a complementary real-world case study—to determine how spatiotemporal data collection decisions are made in field sciences, and how search is adapted in response to in-situ data. Expert geoscientists evaluated a hypothesis by collecting environmental data using a mobile robot. At any point, participants were able to stop the robot and change their search strategy or make a conclusion about the hypothesis. We identified spatiotemporal reasoning heuristics, to which scientists strongly anchored, displaying limited adaptation to new data. We analyzed two key decision factors: variable-space coverage, and fitting error to the hypothesis. We found that, despite varied search strategies, the majority of scientists made a conclusion as the fitting error converged. Scientists who made premature conclusions, due to insufficient variable-space coverage or before the fitting error stabilized, were more prone to incorrect conclusions. We found that novice undergraduates used the same heuristics as expert geoscientists in a simplified version of the scenario. We believe the findings from this study could be used to improve field science training in data foraging, and aid in the development of technologies to support data collection decisions.


Author(s):  
Г.В. Петрухнова ◽  
И.Р. Болдырев

We present a set of hardware components for building a data collection system and formalize the processes that implement the monitoring functions of a technical object. The data collection system under consideration consists of functionally complete devices, each performing specific functions within the system. This system can, on the one hand, serve as one node of a distributed data collection system and, on the other hand, be used autonomously. We show the relevance of creating such a system. The design is based on the STM32H743VIT6 RISC microcontroller of the ARM Cortex-M7 family, operating at frequencies of up to 400 MHz. The main modules of the system include a 20-input voltage distributor; a power supply and configuration module; a digital control module; and a module for analyzing, storing, and transmitting data to a control computer. We describe the composition and purpose of these modules. Data collection in the system is handled by a chain of devices: sensor - matching circuit - ADC - microcontroller. Since the system includes not only ADCs but also DACs, an object control system can be implemented on its basis. The choice of sensors is determined by the characteristics of the monitored object. The electrical parameters of the communication circuits can also be measured manually, including checking the power supply of IDE and SATA devices. The presented data collection system is a tool that can be used to automate the monitoring of the condition of technical objects.
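The arithmetic behind the sensor - matching circuit - ADC - microcontroller chain can be sketched as follows. The resolution, reference voltage, and divider gain below are illustrative assumptions (the STM32H7 ADCs support several resolutions), not values taken from the article:

```python
# Sketch of the acquisition chain's arithmetic
# (sensor -> matching circuit -> ADC -> microcontroller).
ADC_BITS = 12        # assumed ADC resolution
V_REF = 3.3          # assumed ADC reference voltage, volts
DIVIDER_GAIN = 1 / 6 # assumed matching-circuit attenuation, e.g. to bring
                     # a 12 V IDE supply rail into the ADC's input range

def adc_code_to_input_voltage(code):
    """Recover the voltage at the sensor side from a raw ADC code."""
    v_adc = code * V_REF / ((1 << ADC_BITS) - 1)  # voltage at the ADC pin
    return v_adc / DIVIDER_GAIN                   # undo the matching circuit
```

With these assumed values, a full-scale code (4095) corresponds to 3.3 V at the ADC pin and 19.8 V at the measured circuit.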


Author(s):  
Giorgio Audrito ◽  
Roberto Casadei ◽  
Ferruccio Damiani ◽  
Danilo Pianini ◽  
Mirko Viroli

2019 ◽  
Vol 214 ◽  
pp. 04010
Author(s):  
Álvaro Fernández Casaní ◽  
Dario Barberis ◽  
Javier Sánchez ◽  
Carlos García Montoro ◽  
Santiago González de la Hoz ◽  
...  

The ATLAS EventIndex currently runs in production to build a complete catalogue of events for experiments with large amounts of data. The current approach is to index all final produced data files at the CERN Tier-0 and at hundreds of grid sites, with a distributed data collection architecture that uses Object Stores to temporarily hold the conveyed information, while references to it are sent through a Messaging System. The final backend for all the indexed data is a central Hadoop infrastructure at CERN; an Oracle relational database provides faster access to a subset of this information. In the future of ATLAS, the event, rather than the file, should be the atomic information unit for metadata, in order to accommodate future data processing and storage technologies. Files will no longer be static quantities; they may aggregate data dynamically and allow event-level granularity of processing in heavily parallel computing environments. This also simplifies the handling of loss and/or extension of data. In this sense the EventIndex may evolve towards a generalized whiteboard, with the ability to build collections and virtual datasets for end users. These proceedings describe the current distributed data collection architecture of the ATLAS EventIndex project, with details of the Producer, Consumer and Supervisor entities, and of the protocol and information temporarily stored in the Object Store. They also show the data flow rates and performance achieved since the new Object-Store-based temporary storage approach was put in production in July 2017. We review the challenges imposed by the expected increasing rates, which will reach 35 billion new real events per year in Run 3 and 100 billion new real events per year in Run 4. For simulated events the numbers are even higher, with 100 billion events per year in Run 3 and 300 billion events per year in Run 4. We also outline the challenges we face in accommodating future use cases in the EventIndex.
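The Producer/Consumer pattern described above (bulk payloads parked in an Object Store, small references passed through the Messaging System) can be illustrated with a minimal sketch. This is not the actual ATLAS EventIndex code; the dictionary, queue, and record layout are stand-ins chosen for illustration:

```python
import json
import queue
import uuid

object_store = {}            # stands in for the temporary Object Store
message_bus = queue.Queue()  # stands in for the Messaging System

def produce(file_events):
    """Producer: index one file's events, park the bulk payload in the
    Object Store, and send only a small reference over the message bus."""
    key = str(uuid.uuid4())
    object_store[key] = json.dumps(file_events)
    message_bus.put({"key": key, "n_events": len(file_events)})

def consume(backend):
    """Consumer: follow a reference, drain the payload from the temporary
    store, and load the event records into the permanent backend."""
    ref = message_bus.get()
    payload = json.loads(object_store.pop(ref["key"]))
    backend.extend(payload)
```

The design point this sketch captures is that the messaging layer only ever carries lightweight references, so the bulk indexed data never flows through the broker.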


Author(s):  
Mario José Diván ◽  
María Laura Sánchez-Reynoso

The Internet of Things (IoT) has emerged as an alternative for connecting different pieces of technology to foster distributed data collection. Measurement projects and real-time data processing are articulated to take advantage of this environment, fostering sustainable data-driven decision making. The Data Stream Processing Strategy (DSPS) is a stream processing engine focused on measurement projects, where each concept is previously agreed through a measurement framework. The Measurement Adapter (MA) is a component responsible for pairing each metric's definition from the measurement project with data sensors, so that data (i.e., measures) are transmitted together with metadata (i.e., tags indicating the meaning of the data). The Gathering Function (GF) receives data from each MA and routes it for processing, while implementing metadata-based load-shedding (LS) techniques to avoid a processing collapse when all MAs report jointly and frequently. Here, a metadata- and Z-score-based load-shedding technique implemented locally in the MA is proposed. Thus, load shedding takes place at the data source itself, avoiding data transmission and saving resources. Incremental estimates of averages, deviations, covariances, and correlations are computed and used to calculate the Z-scores and to selectively retain or discard data. Four discrete simulations were designed and performed to analyze the proposal. Results indicate that the local LS required only 24% of the original data transmissions, with a minimum data lifespan of 18.61 ms, while consuming 890.26 KB. As future work, other kinds of dependency analysis will be explored to provide local alternatives for LS.
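The core of a Z-score-based local load shedder is an incrementally maintained mean and deviation, against which each new measure is scored before deciding whether to transmit it. The sketch below uses Welford's online algorithm for the incremental statistics; the threshold, warm-up length, and retain-outliers policy are assumptions for illustration, not the paper's actual parameters:

```python
class ZScoreShedder:
    """Local load-shedding sketch for a Measurement Adapter: track mean and
    variance incrementally (Welford's algorithm) and transmit a measure only
    when its Z-score marks it as informative."""

    def __init__(self, threshold=1.0, warmup=5):
        self.n = 0          # number of measures seen
        self.mean = 0.0     # running mean
        self.m2 = 0.0       # running sum of squared deviations
        self.threshold = threshold
        self.warmup = warmup

    def offer(self, x):
        """Update statistics with measure x; return True to transmit it."""
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)
        if self.n <= self.warmup:
            return True  # always transmit while statistics are unreliable
        std = (self.m2 / (self.n - 1)) ** 0.5
        if std == 0:
            return False  # no variation so far: shed the redundant measure
        z = abs(x - self.mean) / std
        return z >= self.threshold  # keep only sufficiently novel measures
```

Discarding near-mean measures at the source is what saves transmissions: the GF still sees every measure that deviates from the learned behavior, while steady-state readings never leave the adapter.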

