scholarly journals Productive lifespan and resilience rank can be predicted from on-farm first parity sensor time series but not using a common equation across farms

2019 ◽  
Author(s):  
I. Adriaens ◽  
N.C. Friggens ◽  
W. Ouweltjes ◽  
H. Scott ◽  
B. Aernouts ◽  
...  

ABSTRACTA dairy cow’s lifetime resilience and her ability to re-calve gain importance on dairy farms as they affect all aspects of the sustainability of the dairy industry. Many modern farms today have milk meters and activity sensors that accurately measure yield and activity at a high frequency for monitoring purposes. We hypothesized that these same sensors can be used for precision phenotyping of complex traits such as lifetime resilience or productive lifespan. The objective of this study was to investigate if lifetime resilience and productive lifespan of dairy cows can be predicted using sensor-derived proxies of first parity sensor data. We used a data set from 27 Belgian and British dairy farms with an automated milking system containing at least 5 years of successive measurements. All of these farms had milk meter data available, and 13 of these farms were also equipped with activity sensors. This subset was used to investigate the added value of activity meters to improve the model’s prediction accuracy. To rank cows for lifetime resilience, a score was attributed to each cow based on her number of calvings, her 305-day milk yield, her age at first calving, her calving intervals and the days in milk at the moment of culling, taking her entire lifetime into account. Next, this lifetime resilience score was used to rank the cows within their herd resulting in a lifetime resilience ranking. Based on this ranking, the cows were classified in a low (last third), moderate (middle third) or high (first third) resilience category. In total 45 biologically-sound sensor features were defined from the time-series data, including measures of variability, lactation curve shape, milk yield perturbations, activity spikes indicating estrous events and activity dynamics representing health events. These features, calculated on first lactation data, were used to predict the lifetime resilience rank and thus, the classification within the herd (low/moderate/high). Using a specific linear regression model progressively including features stepwise selected at farm level (cut-off P-value of 0.2), classification performances were between 35.9% and 70.0% (46.7 ± 8.0, mean ± standard deviation) for milk yield features only and between 46.7% and 84.0% (55.5 ± 12.1, mean ± standard deviation) for lactation and activity features together. This is respectively 13.7 and 22.2% higher than what random classification would give. Moreover, using these individual farm models, only 3.5% and 2.3% of the cows were classified high while being low and vice versa, while respectively 91.8% and 94.1% of the wrongly classified animals were predicted in an adjacent category. A common equation across farms to predict this rank could not be found, which demonstrates the variability in culling and management strategies across farms and within farms over time. The lack of a common model structure across farms suggests the need to consider local (and evidence based) culling management rules when developing decision support tools for dairy farms. With this study we showed the potential of precision phenotyping of complex traits based on biologically meaningful features derived from readily available sensor data. We conclude that first lactation milk and activity sensor data have the potential to predict cows’ lifetime resilience rankings within farms but that consistency over farms is currently lacking.

AI ◽  
2021 ◽  
Vol 2 (1) ◽  
pp. 48-70
Author(s):  
Wei Ming Tan ◽  
T. Hui Teo

Prognostic techniques attempt to predict the Remaining Useful Life (RUL) of a subsystem or a component. Such techniques often use sensor data which are periodically measured and recorded into a time series data set. Such multivariate data sets form complex and non-linear inter-dependencies through recorded time steps and between sensors. Many current existing algorithms for prognostic purposes starts to explore Deep Neural Network (DNN) and its effectiveness in the field. Although Deep Learning (DL) techniques outperform the traditional prognostic algorithms, the networks are generally complex to deploy or train. This paper proposes a Multi-variable Time Series (MTS) focused approach to prognostics that implements a lightweight Convolutional Neural Network (CNN) with attention mechanism. The convolution filters work to extract the abstract temporal patterns from the multiple time series, while the attention mechanisms review the information across the time axis and select the relevant information. The results suggest that the proposed method not only produces a superior accuracy of RUL estimation but it also trains many folds faster than the reported works. The superiority of deploying the network is also demonstrated on a lightweight hardware platform by not just being much compact, but also more efficient for the resource restricted environment.


2020 ◽  
Vol 7 (1) ◽  
Author(s):  
Ari Wibisono ◽  
Petrus Mursanto ◽  
Jihan Adibah ◽  
Wendy D. W. T. Bayu ◽  
May Iffah Rizki ◽  
...  

Abstract Real-time information mining of a big dataset consisting of time series data is a very challenging task. For this purpose, we propose using the mean distance and the standard deviation to enhance the accuracy of the existing fast incremental model tree with the drift detection (FIMT-DD) algorithm. The standard FIMT-DD algorithm uses the Hoeffding bound as its splitting criterion. We propose the further use of the mean distance and standard deviation, which are used to split a tree more accurately than the standard method. We verify our proposed method using the large Traffic Demand Dataset, which consists of 4,000,000 instances; Tennet’s big wind power plant dataset, which consists of 435,268 instances; and a road weather dataset, which consists of 30,000,000 instances. The results show that our proposed FIMT-DD algorithm improves the accuracy compared to the standard method and Chernoff bound approach. The measured errors demonstrate that our approach results in a lower Mean Absolute Percentage Error (MAPE) in every stage of learning by approximately 2.49% compared with the Chernoff Bound method and 19.65% compared with the standard method.


Author(s):  
Meenakshi Narayan ◽  
Ann Majewicz Fey

Abstract Sensor data predictions could significantly improve the accuracy and effectiveness of modern control systems; however, existing machine learning and advanced statistical techniques to forecast time series data require significant computational resources which is not ideal for real-time applications. In this paper, we propose a novel forecasting technique called Compact Form Dynamic Linearization Model-Free Prediction (CFDL-MFP) which is derived from the existing model-free adaptive control framework. This approach enables near real-time forecasts of seconds-worth of time-series data due to its basis as an optimal control problem. The performance of the CFDL-MFP algorithm was evaluated using four real datasets including: force sensor readings from surgical needle, ECG measurements for heart rate, and atmospheric temperature and Nile water level recordings. On average, the forecast accuracy of CFDL-MFP was 28% better than the benchmark Autoregressive Integrated Moving Average (ARIMA) algorithm. The maximum computation time of CFDL-MFP was 49.1ms which was 170 times faster than ARIMA. Forecasts were best for deterministic data patterns, such as the ECG data, with a minimum average root mean squared error of (0.2±0.2).


Author(s):  
Özge Akkuş ◽  
Volkan Sevinç

This article aims to introduce the use of ordered logit model with time series data in milk productivity studies and determine the important factor levels affecting the milk yield of Holstein Friesians. The data consists of 2002 records collected for the years 2009-2015 from the reports of the Cattle Breeders’ Association of Turkey (CBAT) in Muðla province in Turkey. The direct and marginal effects of the variables: parity, lactation length and year of calving on milk yield are investigated and the probabilities regarding the milk yield production for a given specific parity, lactation length and calving year are calculated. The results show that milk yield slightly increases on the 4th parity of cows. As far as the years concerned, although there had mostly been a steady amount of milk production between 2009 and 2015 years, there was a significant decrease in 2011 and increase in 2014.


2022 ◽  
Vol 3 (1) ◽  
pp. 1-26
Author(s):  
Omid Hajihassani ◽  
Omid Ardakanian ◽  
Hamzeh Khazaei

The abundance of data collected by sensors in Internet of Things devices and the success of deep neural networks in uncovering hidden patterns in time series data have led to mounting privacy concerns. This is because private and sensitive information can be potentially learned from sensor data by applications that have access to this data. In this article, we aim to examine the tradeoff between utility and privacy loss by learning low-dimensional representations that are useful for data obfuscation. We propose deterministic and probabilistic transformations in the latent space of a variational autoencoder to synthesize time series data such that intrusive inferences are prevented while desired inferences can still be made with sufficient accuracy. In the deterministic case, we use a linear transformation to move the representation of input data in the latent space such that the reconstructed data is likely to have the same public attribute but a different private attribute than the original input data. In the probabilistic case, we apply the linear transformation to the latent representation of input data with some probability. We compare our technique with autoencoder-based anonymization techniques and additionally show that it can anonymize data in real time on resource-constrained edge devices.


Mathematics ◽  
2021 ◽  
Vol 9 (17) ◽  
pp. 2146
Author(s):  
Mikhail Zymbler ◽  
Elena Ivanova

Currently, big sensor data arise in a wide spectrum of Industry 4.0, Internet of Things, and Smart City applications. In such subject domains, sensors tend to have a high frequency and produce massive time series in a relatively short time interval. The data collected from the sensors are subject to mining in order to make strategic decisions. In the article, we consider the problem of choosing a Time Series Database Management System (TSDBMS) to provide efficient storing and mining of big sensor data. We overview InfluxDB, OpenTSDB, and TimescaleDB, which are among the most popular state-of-the-art TSDBMSs, and represent different categories of such systems, namely native, add-ons over NoSQL systems, and add-ons over relational DBMSs (RDBMSs), respectively. Our overview shows that, at present, TSDBMSs offer a modest built-in toolset to mine big sensor data. This leads to the use of third-party mining systems and unwanted overhead costs due to exporting data outside a TSDBMS, data conversion, and so on. We propose an approach to managing and mining sensor data inside RDBMSs that exploits the Matrix Profile concept. A Matrix Profile is a data structure that annotates a time series through the index of and the distance to the nearest neighbor of each subsequence of the time series and serves as a basis to discover motifs, anomalies, and other time-series data mining primitives. This approach is implemented as a PostgreSQL extension that allows an application programmer both to compute matrix profiles and mining primitives and to represent them as relational tables. Experimental case studies show that our approach surpasses the above-mentioned out-of-TSDBMS competitors in terms of performance since it assumes that sensor data are mined inside a TSDBMS at no significant overhead costs.


While analyzing iot projects it is very expensive to buy a lot of sensors , corresponding processor boards, power supplies etc. Moreover the entire process is to be replicated to cater to large topologies. The whole experiment is to be planned at a large scale before we can actually start to see analytics working. At a smaller scale this can be implemented as a simulation program in linux where the sensor data is created using a random number generator and scaled appropriately for each type of sensor to mimic representative data. This is them encrypted before sending it over the network to the edge nodes. At the server a socket stream now continuously awaits sensor data Here the required sensor data is retrieved and decrypted to give true time series data. This time series is now given to an analytics engine which calculates the trends and cyclicity and is used to train a neural network. The anomalies so found are properly deciphered. The multiplicity of the nodes can be characterized by having several client programs running in separate terminals. A simple client server architecture is thus able to simulate a large iot infrastructure and is able to perform analytics on a scaled model


2021 ◽  
Vol 13 (1) ◽  
Author(s):  
Mahbubul Alam ◽  
Laleh Jalali ◽  
Mahbubul Alam ◽  
Ahmed Farahat ◽  
Chetan Gupta

Abstract—Prognostics aims to predict the degradation of equipment by estimating their remaining useful life (RUL) and/or the failure probability within a specific time horizon. The high demand of equipment prognostics in the industry have propelled researchers to develop robust and efficient prognostics techniques. Among data driven techniques for prognostics, machine learning and deep learning (DL) based techniques, particularly Recurrent Neural Networks (RNNs) have gained significant attention due to their ability of effectively representing the degradation progress by employing dynamic temporal behaviors. RNNs are well known for handling sequential data, especially continuous time series sequential data where the data follows certain pattern. Such data is usually obtained from sensors attached to the equipment. However, in many scenarios sensor data is not readily available and often very tedious to acquire. Conversely, event data is more common and can easily be obtained from the error logs saved by the equipment and transmitted to a backend for further processing. Nevertheless, performing prognostics using event data is substantially more difficult than that of the sensor data due to the unique nature of event data. Though event data is sequential, it differs from other seminal sequential data such as time series and natural language in the following manner, i) unlike time series data, events may appear at any time, i.e., the appearance of events lacks periodicity; ii) unlike natural languages, event data do not follow any specific linguistic rule. Additionally, there may be a significant variability in the event types appearing within the same sequence.  Therefore, this paper proposes an RUL estimation framework to effectively handle the intricate and novel event data. The proposed framework takes discrete events generated by an equipment (e.g., type, time, etc.) as input, and generates for each new event an estimate of the remaining operating cycles in the life of a given component. To evaluate the efficacy of our proposed method, we conduct extensive experiments using benchmark datasets such as the CMAPSS data after converting the time-series data in these datasets to sequential event data. The event data conversion is carried out by careful exploration and application of appropriate transformation techniques to the time series. To the best of our knowledge this is the first time such event-based RUL estimation problem is introduced to the community. Furthermore, we propose several deep learning and machine learning based solution for the event-based RUL estimation problem. Our results suggest that the deep learning models, 1D-CNN, LSTM, and multi-head attention show similar RMSE, MAE and Score performance. Foreseeably, the XGBoost model achieve lower performance compared to the deep learning models since the XGBoost model fails to capture ordering information from the sequence of events. 


In this paper, we analyze, model, predict and cluster Global Active Power, i.e., a time series data obtained at one minute intervals from electricity sensors of a household. We analyze changes in seasonality and trends to model the data. We then compare various forecasting methods such as SARIMA and LSTM to forecast sensor data for the household and combine them to achieve a hybrid model that captures nonlinear variations better than either SARIMA or LSTM used in isolation. Finally, we cluster slices of time series data effectively using a novel clustering algorithm that is a combination of density-based and centroid-based approaches, to discover relevant subtle clusters from sensor data. Our experiments have yielded meaningful insights from the data at both a micro, day-to-day granularity, as well as a macro, weekly to monthly granularity.


2016 ◽  
Vol 10 (04) ◽  
pp. 461-501 ◽  
Author(s):  
Om Prasad Patri ◽  
Anand V. Panangadan ◽  
Vikrambhai S. Sorathia ◽  
Viktor K. Prasanna

Detecting and responding to real-world events is an integral part of any enterprise or organization, but Semantic Computing has been largely underutilized for complex event processing (CEP) applications. A primary reason for this gap is the difference in the level of abstraction between the high-level semantic models for events and the low-level raw data values received from sensor data streams. In this work, we investigate the need for Semantic Computing in various aspects of CEP, and intend to bridge this gap by utilizing recent advances in time series analytics and machine learning. We build upon the Process-oriented Event Model, which provides a formal approach to model real-world objects and events, and specifies the process of moving from sensors to events. We extend this model to facilitate Semantic Computing and time series data mining directly over the sensor data, which provides the advantage of automatically learning the required background knowledge without domain expertise. We illustrate the expressive power of our model in case studies from diverse applications, with particular emphasis on non-intrusive load monitoring in smart energy grids. We also demonstrate that this powerful semantic representation is still highly accurate and performs at par with existing approaches for event detection and classification.


Sign in / Sign up

Export Citation Format

Share Document