Data Set: 187 Weeks of Customer Forecasts and Orders for Microprocessors from Intel Corporation

Author(s):  
Matthew P. Manary ◽  
Sean P. Willems

Problem definition: This data set contains 187 consecutive weeks of Intel microprocessor demand information for all five distribution centers in one of its five sales geographies. For every stock keeping unit (SKU) at every location, the weekly forecasted demand and actual customer orders are provided, as well as the SKU’s average selling price category. These data are provided by week and by distribution center, producing 26,114 records in total. Academic/practical relevance: The 86 SKUs in the data set span five product generations, capturing years of product evolution across generations and price points. Methodology: As a data set paper, its purpose is to provide rich, interesting real-world data for researchers developing forecasting, inventory, pricing, and product assortment models. Results: The data set demonstrates the presence of significant forecast bias, heterogeneity of forecast errors between distribution centers, generational differences, product life cycles, and pricing dynamics. Managerial implications: This data set provides access to a rich pricing and sales setting from a major corporation that has not been made available before.
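
A few lines of pandas are enough to surface the forecast bias and per-center heterogeneity the abstract mentions. The sketch below assumes a hypothetical flat-file layout with columns named forecast, orders, and distribution_center; none of these names come from the published data set.

```python
import pandas as pd

# Sketch of a first look at a data set with this layout (column names and the
# file name are assumptions, not the published schema): one row per SKU x week
# x distribution center, with forecasted demand and actual customer orders.
df = pd.read_csv("intel_demand.csv")   # hypothetical file name

# Forecast bias: positive values mean the forecast exceeded actual orders.
df["bias"] = df["forecast"] - df["orders"]

# The abstract reports significant bias and heterogeneity across distribution
# centers; both show up in a simple per-center summary like this one.
summary = df.groupby("distribution_center")["bias"].agg(["mean", "std", "count"])
print(summary)
```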

2021 ◽  
pp. 1-13
Author(s):  
Hailin Liu ◽  
Fangqing Gu ◽  
Zixian Lin

Transfer learning methods exploit similarities between different data sets to improve performance on a target task by transferring knowledge from source tasks. “What to transfer” is a central research issue in transfer learning. Existing transfer learning methods generally acquire the shared parameters by integrating human knowledge. In many real applications, however, which parameters can be shared is unknown beforehand. A transfer learning model is essentially a special multi-objective optimization problem. Consequently, this paper proposes a novel auto-sharing parameter technique for transfer learning based on multi-objective optimization and solves the optimization problem with a multi-swarm particle swarm optimizer. Each task objective is simultaneously optimized by its own sub-swarm. The current best particle of the target task’s sub-swarm guides the search of the source tasks’ particles, and vice versa. The target and source tasks are thus solved jointly by sharing the information of the best particles, which acts as an inductive bias. Experiments on several synthetic data sets and on two real-world data sets (a school data set and a landmine data set) show that the proposed algorithm is effective.
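
The cross-swarm guidance can be sketched compactly. The following Python toy is an illustration of the idea, not the authors' implementation: two sub-swarms each optimize their own task objective, and every velocity update blends a particle's personal best, its own swarm's best, and the best particle of the other swarm, the last term acting as the inductive bias described above. Objectives, swarm size, and coefficients are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def target_loss(x):   # stand-in for the target task objective (assumption)
    return np.sum((x - 1.0) ** 2)

def source_loss(x):   # stand-in for a related source task objective (assumption)
    return np.sum((x - 1.2) ** 2)

DIM, N, STEPS = 5, 20, 200
W, C1, C2, C3 = 0.7, 1.4, 1.4, 0.6   # C3 weights the cross-swarm guidance term

def init_swarm():
    pos = rng.uniform(-5, 5, (N, DIM))
    return pos, np.zeros((N, DIM)), pos.copy()   # positions, velocities, personal bests

swarms = {"target": init_swarm(), "source": init_swarm()}
losses = {"target": target_loss, "source": source_loss}

def swarm_best(name):
    _, _, pbest = swarms[name]
    vals = np.array([losses[name](p) for p in pbest])
    return pbest[vals.argmin()]

for _ in range(STEPS):
    for name, other in (("target", "source"), ("source", "target")):
        pos, vel, pbest = swarms[name]
        gbest, xbest = swarm_best(name), swarm_best(other)  # own best + other swarm's best
        r1, r2, r3 = rng.random((3, N, DIM))
        vel = (W * vel
               + C1 * r1 * (pbest - pos)    # cognitive term
               + C2 * r2 * (gbest - pos)    # own-swarm social term
               + C3 * r3 * (xbest - pos))   # guidance from the other task's best particle
        pos = pos + vel
        better = np.array([losses[name](p) < losses[name](b) for p, b in zip(pos, pbest)])
        pbest[better] = pos[better]
        swarms[name] = (pos, vel, pbest)

print("target best loss:", target_loss(swarm_best("target")))
```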


Author(s):  
Shaoqiang Wang ◽  
Shudong Wang ◽  
Song Zhang ◽  
Yifan Wang

Abstract The aim of this work is to detect dynamic EEG signals automatically and thereby reduce the time cost of epilepsy diagnosis. In recognizing epileptic electroencephalogram (EEG) signals, traditional machine learning and statistical methods require manual feature engineering to show excellent results, and then typically only on a single data set; manually selected features may also carry bias and cannot guarantee validity and generalizability on real-world data. In practical applications, deep learning methods can largely free practitioners from feature engineering: as long as data quality and quantity grow, the model keeps learning and improving automatically. Deep learning can also extract many features that are difficult for humans to perceive, making the resulting models more robust. Based on the design ideas of the ResNeXt deep neural network, this paper designs Time-ResNeXt, a network structure suited to time-series EEG epilepsy detection. Time-ResNeXt reaches an accuracy of 91.50% in EEG epilepsy detection, achieves state-of-the-art performance on the benchmark Bern-Barcelona data set, and has great potential for improving clinical practice.
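
The core building block of a ResNeXt-style network is a bottleneck with grouped convolutions, which is straightforward to adapt to one-dimensional signals. The PyTorch sketch below shows one plausible such block; the channel counts, cardinality, window length, and two-class head are illustrative assumptions, not the published Time-ResNeXt configuration.

```python
import torch
import torch.nn as nn

class ResNeXtBlock1D(nn.Module):
    """ResNeXt-style bottleneck for 1-D signals: the grouped convolution
    realizes the 'cardinality' of aggregated parallel branches."""
    def __init__(self, channels=64, cardinality=8, bottleneck=32):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv1d(channels, bottleneck, kernel_size=1, bias=False),
            nn.BatchNorm1d(bottleneck), nn.ReLU(inplace=True),
            nn.Conv1d(bottleneck, bottleneck, kernel_size=3, padding=1,
                      groups=cardinality, bias=False),   # grouped conv = parallel paths
            nn.BatchNorm1d(bottleneck), nn.ReLU(inplace=True),
            nn.Conv1d(bottleneck, channels, kernel_size=1, bias=False),
            nn.BatchNorm1d(channels),
        )
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        return self.relu(x + self.body(x))   # residual connection

# Toy usage: a batch of 4 single-channel EEG windows, 512 samples each.
net = nn.Sequential(
    nn.Conv1d(1, 64, kernel_size=7, padding=3), ResNeXtBlock1D(),
    nn.AdaptiveAvgPool1d(1), nn.Flatten(), nn.Linear(64, 2),  # 2 classes
)
logits = net(torch.randn(4, 1, 512))
print(logits.shape)  # torch.Size([4, 2])
```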


2018 ◽  
Vol 35 (8) ◽  
pp. 1508-1518
Author(s):  
Rosembergue Pereira Souza ◽  
Luiz Fernando Rust da Costa Carmo ◽  
Luci Pirmez

Purpose The purpose of this paper is to present a procedure for finding unusual patterns in accredited tests using a rapid processing method for analyzing video records. The procedure uses the temporal differencing technique for object tracking and considers only frames not identified as statistically redundant. Design/methodology/approach An accreditation organization is responsible for accrediting facilities to undertake testing and calibration activities. Periodically, such organizations evaluate accredited testing facilities; these evaluations can use video records and photographs of the tests performed by a facility to judge its conformity to technical requirements. To validate the proposed procedure, a real-world data set with video records from accredited testing facilities in the field of vehicle safety in Brazil was used, and the processing time of the proposed procedure was compared with the time needed to process the video records in a traditional fashion. Findings With an appropriate threshold value, the proposed procedure successfully identified video records of fraudulent services, and its processing time was shorter than that of the traditional method. Originality/value Manually evaluating video records is time-consuming and tedious. This paper proposes a procedure to rapidly find unusual patterns in videos of accredited tests with a minimum of manual effort.
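
Temporal differencing with a redundancy threshold takes only a few lines with OpenCV. The sketch below illustrates the frame-skipping idea rather than the authors' pipeline; the file name and threshold are assumptions.

```python
import cv2

# Minimal temporal-differencing sketch: frames whose pixel-wise change from the
# previously retained frame falls below a threshold are treated as statistically
# redundant and skipped. Path and threshold are illustrative assumptions.
VIDEO_PATH = "accredited_test.mp4"   # hypothetical input file
CHANGE_THRESHOLD = 4.0               # mean absolute gray-level difference

cap = cv2.VideoCapture(VIDEO_PATH)
prev_gray, kept, total = None, 0, 0

while True:
    ok, frame = cap.read()
    if not ok:
        break
    total += 1
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    if prev_gray is None or cv2.absdiff(gray, prev_gray).mean() > CHANGE_THRESHOLD:
        kept += 1
        prev_gray = gray
        # ...run object tracking / unusual-pattern analysis on this frame only...

cap.release()
print(f"analyzed {kept} of {total} frames")
```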


2012 ◽  
Vol 29 (01) ◽  
pp. 1240002 ◽  
Author(s):  
Xiangpei Hu ◽  
Huimin Wang ◽  
Yunzeng Wang

Costs of many items drop systematically throughout their life cycles, owing to advances in technology and to competition. Motivated by the management of service parts for some high-tech products, this paper studies inventory decisions for such items. In a periodic-review setting with stochastic demand, we model the purchasing costs of successive periods as a stochastic, decreasing sequence. The unit selling price of the item is determined as a mark-up of the purchasing cost and hence changes over time as well. We consider two specific mark-up models: (1) purchasing cost plus a constant-dollar-amount mark-up, and (2) purchasing cost plus a constant-percentage mark-up. To maximize the total discounted expected profit, we derive conditions under which myopic policies are optimal for these systems.
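
Written out, the two mark-up rules take a simple form. Assuming, purely for illustration, that c_t denotes the stochastic and decreasing unit purchasing cost in period t and p_t the resulting selling price:

```latex
% Notation assumed for illustration, not taken from the paper.
p_t = c_t + m         \quad \text{(constant-dollar-amount mark-up, } m > 0\text{)}
p_t = (1 + r)\, c_t   \quad \text{(constant-percentage mark-up, } r > 0\text{)}
```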


Author(s):  
Daniel Steeneck ◽  
Fredrik Eng-Larsson ◽  
Francisco Jauffred

Problem definition: We address the problem of how to estimate lost sales for substitutable products when there is no reliable on-shelf availability (OSA) information. Academic/practical relevance: We develop a novel approach to estimating lost sales using only sales data, a market share estimate, and an estimate of overall availability. We use the method to illustrate the negative consequences of using potentially inaccurate inventory records as indicators of availability. Methodology: We suggest a partially hidden Markov model of OSA to generate probabilistic choice sets and incorporate these probabilistic choice sets into the estimation of a multinomial logit demand model using a nested expectation-maximization algorithm. We highlight the importance of considering inventory reliability problems first through simulation and then by applying the procedure to a data set from a major U.S. retailer. Results: The simulations show that the method converges in seconds and produces estimates with similar or lower bias than state-of-the-art benchmarks. For the product category under consideration at the retailer, our procedure finds lost sales of around 3.0% compared with 0.2% when relying on the inventory record as an indicator of availability. Managerial implications: The method efficiently computes estimates that can be used to improve inventory management and guide managers on how to use their scarce resources to improve stocking execution. The research also shows that ignoring inventory record inaccuracies when estimating lost sales can produce substantially inaccurate estimates, which leads to incorrect parameters in supply chain planning.
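
The role of the probabilistic choice sets can be illustrated with a toy multinomial logit. The sketch below is a simplification, not the paper's partially hidden Markov model or its EM estimator: availability of one product is treated as a probability, and expected sales, substitution, and lost demand follow from mixing MNL shares over the possible choice sets. All utilities and probabilities are assumptions.

```python
import numpy as np

# Two substitutable products plus a no-purchase option under a multinomial
# logit; on-shelf availability of product A is probabilistic, e.g. the output
# of a hidden Markov model of OSA. Utilities and p_avail_A are assumptions.
utilities = {"A": 1.0, "B": 0.6}   # assumed MNL utilities; outside option = 0
p_avail_A = 0.9                    # P(product A actually on shelf)

def mnl_shares(choice_set):
    expu = {j: np.exp(utilities[j]) for j in choice_set}
    denom = 1.0 + sum(expu.values())          # "+1" is the no-purchase option
    return {j: v / denom for j, v in expu.items()}

shares_full = mnl_shares({"A", "B"})          # A on the shelf
shares_out = mnl_shares({"B"})                # A missing: demand spills to B or is lost
expected_sales_A = p_avail_A * shares_full["A"]
unserved_A = (1 - p_avail_A) * shares_full["A"]                     # first-choice A demand hit by stockouts
spill_to_B = (1 - p_avail_A) * (shares_out["B"] - shares_full["B"]) # substitution to B
print(f"E[A sales]={expected_sales_A:.3f}  unserved A demand={unserved_A:.3f}  "
      f"extra B sales={spill_to_B:.3f}")
```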


Author(s):  
Sajad Badalkhani ◽  
Ramazan Havangi ◽  
Mohsen Farshad

There is an extensive literature on multi-robot simultaneous localization and mapping (MRSLAM). In most of this research, the environment is assumed to be static, yet the dynamic parts of the environment degrade the estimation quality of SLAM algorithms and lead to inherently fragile systems. To enhance the performance and robustness of SLAM in dynamic environments (SLAMIDE), this paper introduces a novel cooperative approach named parallel-map (p-map) SLAM. The objective of the proposed method is to cope with the dynamics of the environment by detecting its dynamic parts and excluding them from the SLAM estimation. In this approach, each robot builds a limited map in its own vicinity, while the global map is built through a hybrid centralized MRSLAM. The restricted size of the local maps bounds the computational complexity and the resources needed to handle a large-scale dynamic environment. Using a probabilistic index, the proposed method differentiates between stationary and moving landmarks based on their positions relative to other parts of the environment; the stationary landmarks are then used to refine a consistent map. The proposed method is evaluated at different levels of dynamism and, for each level, performance is measured in terms of accuracy, robustness, and the hardware resources required for implementation. The method is also evaluated on a publicly available real-world data set. Experimental validation along with simulations indicates that the proposed method performs consistent SLAM in dynamic environments, suggesting its feasibility for MRSLAM applications.
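
One simple way to realize a probabilistic stationarity index, shown here only as a sketch of the general idea rather than the paper's formulation, is to score each landmark by how stable its distances to the other landmarks stay across observation epochs:

```python
import numpy as np

# Sketch of a probabilistic stationary/moving classification for landmarks.
# A landmark whose geometry relative to the other landmarks stays fixed
# between epochs is likely stationary; the scale and threshold are assumptions.
def stationarity_scores(obs_t0, obs_t1, scale=0.5):
    """obs_*: (n_landmarks, 2) arrays of estimated landmark positions."""
    d0 = np.linalg.norm(obs_t0[:, None] - obs_t0[None, :], axis=-1)
    d1 = np.linalg.norm(obs_t1[:, None] - obs_t1[None, :], axis=-1)
    drift = np.median(np.abs(d1 - d0), axis=1)   # per-landmark relative drift
    return np.exp(-drift / scale)                # map drift to a (0, 1] score

landmarks_t0 = np.array([[0.0, 0.0], [2.0, 0.0], [0.0, 2.0], [5.0, 5.0]])
landmarks_t1 = landmarks_t0.copy()
landmarks_t1[3] += [1.5, -0.8]                   # landmark 3 moved between epochs

scores = stationarity_scores(landmarks_t0, landmarks_t1)
stationary = scores > 0.5                        # threshold is an assumption
print(scores.round(2), stationary)               # landmark 3 gets a low score
```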


2018 ◽  
Vol 15 (3) ◽  
pp. 18-37 ◽  
Author(s):  
Weifeng Pan ◽  
Jilei Dong ◽  
Kun Liu ◽  
Jing Wang

With services so numerous and varied in type, accurately discovering desired services has become a problem. Service clustering is an effective way to facilitate service discovery. However, existing approaches are usually designed for a single type of service document and fail to fully use the topic and topological information in service profiles and usage histories. To overcome these limitations, this article presents a novel service clustering approach. It adopts a bipartite network to describe the topological structure of service usage histories and uses the SimRank algorithm to measure the topological similarity of services; it applies latent Dirichlet allocation (LDA) to extract topics from service profiles and from them quantifies the topic similarity of services; it then combines the topological and topic similarities into a single service similarity and clusters the services with the Chameleon clustering algorithm. An empirical evaluation on a real-world data set highlights the benefits of combining topological and topic similarities.
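
The pipeline (bipartite usage network, SimRank, LDA topics, and a combined similarity) can be prototyped with NetworkX and scikit-learn. The sketch below assumes a recent scikit-learn (the metric= keyword of AgglomerativeClustering); since Chameleon is not available there, plain agglomerative clustering stands in for the final step, and the weight alpha is an assumption.

```python
import networkx as nx
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.metrics.pairwise import cosine_similarity
from sklearn.cluster import AgglomerativeClustering

services = ["s0", "s1", "s2"]
profiles = ["weather forecast temperature api",
            "climate temperature history api",
            "payment credit card checkout api"]
usage = [("u0", "s0"), ("u0", "s1"), ("u1", "s1"), ("u2", "s2")]  # user-service history

# Topological similarity: SimRank on the bipartite usage network.
G = nx.Graph(usage)
sim = nx.simrank_similarity(G)
topo = np.array([[sim[a][b] for b in services] for a in services])

# Topic similarity: LDA topics over service profiles, compared by cosine.
tf = CountVectorizer().fit_transform(profiles)
topics = LatentDirichletAllocation(n_components=2, random_state=0).fit_transform(tf)
topic_sim = cosine_similarity(topics)

# Combined similarity, then clustering. Chameleon is not in scikit-learn;
# agglomerative clustering on the combined distance stands in here.
alpha = 0.5
combined = alpha * topo + (1 - alpha) * topic_sim
labels = AgglomerativeClustering(n_clusters=2, metric="precomputed",
                                 linkage="average").fit_predict(1 - combined)
print(labels)   # s0 and s1 are expected to land in the same cluster
```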


2019 ◽  
Vol 2019 (1) ◽  
pp. 266-286 ◽  
Author(s):  
Anselme Tueno ◽  
Florian Kerschbaum ◽  
Stefan Katzenbeisser

Abstract Decision trees are widespread machine learning models used for data classification, with applications in areas such as healthcare, remote diagnostics, and spam filtering. In this paper, we address the problem of privately evaluating a decision tree on private data. In this scenario, the server holds a private decision tree model and the client wants to classify its private attribute vector using the server’s private model. The goal is to obtain the classification while preserving the privacy of both the decision tree and the client’s input: after the computation, only the classification result is revealed to the client, and nothing is revealed to the server. Many existing protocols require a constant number of rounds, but some of them perform as many comparisons as there are decision nodes in the entire tree, and others transform the whole plaintext decision tree into an oblivious program, resulting in higher communication costs. The main idea of our novel solution is to represent the tree as an array. We then execute only d comparisons, where d is the depth of the tree. Each comparison is performed with a small garbled circuit that outputs secret shares of the index of the next node. We obtain the inputs to each comparison by obliviously indexing the tree and the attribute vector, implementing oblivious array indexing with either garbled circuits, Oblivious Transfer, or Oblivious RAM (ORAM). Using ORAM, this yields the first protocol with sublinear cost in the size of the tree. We implemented and evaluated our solution with each of the array indexing procedures above. As a result, we not only provide the first protocol with sublinear cost for large trees, but also reduce, compared with the best related work, the communication cost for the large real-world data set “Spambase” from 18 MB to 1.2 MB and the computation time from 17 seconds to less than 1 second in a LAN setting.
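
The array representation and the d-comparison evaluation are easy to see in the clear. The sketch below omits every cryptographic ingredient (garbled circuits, OT/ORAM-based oblivious indexing, secret sharing) and uses an illustrative node layout:

```python
# Cleartext sketch of the array representation behind the protocol. Each inner
# node is a tuple (attribute index, threshold, left child, right child); leaves
# carry the class label. The layout below is an illustrative assumption.
tree = [
    (0, 5.0, 1, 2),     # node 0: compare x[0] with 5.0
    (1, 2.5, 3, 4),     # node 1: compare x[1] with 2.5
    ("leaf", "spam"),
    ("leaf", "ham"),
    ("leaf", "spam"),
]
DEPTH = 2               # d: a complete evaluation needs exactly d comparisons

def classify(x):
    idx = 0
    for _ in range(DEPTH):                 # d comparisons, one per tree level
        node = tree[idx]                   # the protocol does this lookup obliviously
        if node[0] == "leaf":
            break
        attr, threshold, left, right = node
        idx = left if x[attr] < threshold else right
    return tree[idx][1]

print(classify([3.0, 4.0]))   # x[0] < 5.0 -> node 1; x[1] >= 2.5 -> "spam"
```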


Axioms ◽  
2020 ◽  
Vol 9 (4) ◽  
pp. 130
Author(s):  
Tommi Huotari ◽  
Jyrki Savolainen ◽  
Mikael Collan

This study investigated the performance of a trading agent based on a convolutional neural network model in portfolio management. The results showed that with real-world data the agent could produce relevant trading results, although its behavior corresponded to that of a high-risk taker. The data used were wide in comparison with earlier reported research and were based on the full set of S&P 500 stock data for twenty-one years, supplemented with selected financial ratios. The results presented are new in terms of the size of the data set used and with regard to the model used. The results provide direction and offer insight into how deep learning methods may be used in constructing automatic trading systems.
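
A minimal version of such an agent, shown here only to make the setup concrete rather than to reproduce the authors' model, is a small 1-D CNN mapping a window of per-stock features to buy/hold/sell logits. Feature count and window length are assumptions.

```python
import torch
import torch.nn as nn

# Illustrative CNN trading agent: reads a window of recent returns plus a few
# financial ratios per trading day and emits action logits. The feature count,
# window length, and architecture are assumptions, not the published model.
N_FEATURES, WINDOW = 4, 30      # e.g. return + 3 financial ratios, 30 trading days

agent = nn.Sequential(
    nn.Conv1d(N_FEATURES, 16, kernel_size=5, padding=2), nn.ReLU(),
    nn.Conv1d(16, 32, kernel_size=5, padding=2), nn.ReLU(),
    nn.AdaptiveAvgPool1d(1), nn.Flatten(),
    nn.Linear(32, 3),           # logits for sell / hold / buy
)

batch = torch.randn(8, N_FEATURES, WINDOW)   # feature windows for 8 stocks
actions = agent(batch).argmax(dim=1)         # 0 = sell, 1 = hold, 2 = buy
print(actions)
```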

