Production Flow Analysis in a Semiconductor Fab Using Machine Learning Techniques

Processes · doi:10.3390/pr9030407 · 2021 · Vol 9 (3) · pp. 407

Author(s): Ivan Kristianto Singgih

Keyword(s): Machine Learning; Real Time; Parallel Machines; Production Control; Machine Learning Techniques; Gradient Boosting; Test Bed; Data Types; Adaptive Boosting; Learning Techniques

In a semiconductor fab, wafer lots are processed in complex sequences with re-entrant flows and parallel machines. To satisfy the demand for wafer lots, it is necessary to ensure smooth wafer lot flows by detecting potential disturbances in real time. This study aims to identify production factors that significantly affect the system’s throughput level and to find the best prediction model. The contributions of this study are as follows: (1) this is the first study that applies machine learning techniques to identify important real-time factors that influence throughput in a semiconductor fab; (2) this study develops a test bed in the AnyLogic software environment, based on the Intel minifab layout; and (3) this study proposes a data collection scheme for the production control mechanism. As a result, the four models with the best accuracies (adaptive boosting, gradient boosting, random forest, and decision tree) are selected, and a scheme to reduce the input data types considered in the models is also proposed. After the reduction, the accuracy of each selected model was more than 97.82%. It was found that data related to the machines’ total idle times, processing steps, and machine E have notable influences on the throughput prediction.

Learning from Imbalanced Educational Data Using Ensemble Machine Learning Algorithms

Webology · doi:10.14704/web/v18si01/web18053 · 2021 · Vol 18 (Special Issue 01) · pp. 183-195

Author(s): Thingbaijam Lenin, N. Chandrasekaran

Keyword(s): Machine Learning; Random Forest; Missing Values; Machine Learning Techniques; Gradient Boosting; Adaptive Boosting; Stochastic Gradient Boosting; Ensemble Machine Learning; Learning Techniques; Student’s Performance

Students’ academic performance is one of the most important parameters for evaluating the standard of any institute. It has become of paramount importance for any institute to identify students at risk of underperforming, failing, or even dropping out of a course. Machine learning techniques may be used to develop a model for predicting a student’s performance as early as the time of admission. The task is challenging, however, because the educational data available for modelling are usually imbalanced. We explore ensemble machine learning techniques, namely a bagging algorithm, random forest (rf), and boosting algorithms such as adaptive boosting (adaboost), stochastic gradient boosting (gbm), and extreme gradient boosting (xgbTree), in an attempt to develop a model for predicting the performance of students at a private university in Meghalaya using three categories of data: demographic, prior academic record, and personality. The collected data are found to be highly imbalanced and also contain missing values. We employ the k-nearest neighbor (knn) data imputation technique to handle the missing values. The models are developed on the imputed data with 10-fold cross-validation and are evaluated using precision, specificity, recall, and kappa metrics. As the data are imbalanced, we avoid using accuracy as the evaluation metric and instead use balanced accuracy and F-score. We compare the ensemble techniques with a single classifier, C4.5. The best result is provided by random forest and adaboost, with an F-score of 66.67%, balanced accuracy of 75%, and accuracy of 96.94%.
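
To illustrate why the abstract above sets plain accuracy aside, here is a minimal Python sketch that computes accuracy, balanced accuracy, and F-score from a binary confusion matrix. The counts are hypothetical: they were chosen only because they happen to reproduce the reported percentages, and the paper's actual confusion matrix is not given in this listing. The helper name `imbalance_metrics` is likewise an assumption.

```python
def imbalance_metrics(tp: int, fn: int, fp: int, tn: int) -> dict:
    """Accuracy, balanced accuracy, and F-score for a binary confusion matrix.

    Assumes every denominator below is nonzero.
    """
    recall = tp / (tp + fn)        # sensitivity on the minority (positive) class
    specificity = tn / (tn + fp)   # recall on the majority (negative) class
    precision = tp / (tp + fp)
    return {
        "accuracy": (tp + tn) / (tp + fn + fp + tn),
        "balanced_accuracy": (recall + specificity) / 2,
        "f_score": 2 * precision * recall / (precision + recall),
    }


# Hypothetical counts (NOT taken from the paper): 6 minority cases out of 98.
metrics = imbalance_metrics(tp=3, fn=3, fp=0, tn=92)
for name, value in metrics.items():
    print(f"{name}: {value:.2%}")
```

With these illustrative counts, half of the minority class is missed, yet accuracy still comes out near 97%, while balanced accuracy and F-score fall to 75% and about 67%. That gap is the reason for preferring the latter two on imbalanced data.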

Feasibility of Machine Learning Algorithms for Predicting the Deformation of Anodic Titanium Films by Modulating Anodization Processes

Materials · doi:10.3390/ma14051089 · 2021 · Vol 14 (5) · pp. 1089

Author(s): Sung-Hee Kim, Chanyoung Jeong

Keyword(s): Machine Learning; Learning Algorithms; Multiclass Classification; Machine Learning Algorithms; Machine Learning Techniques; Smart Manufacturing; Gradient Boosting; Experimental Conditions; Learning Techniques; TiO2 Nanostructures

This study aims to demonstrate the feasibility of applying eight machine learning algorithms to predict the classification of the surface characteristics of titanium oxide (TiO2) nanostructures produced with different anodization processes. We produced a total of 100 samples, and we assessed changes in the thicknesses of the TiO2 nanostructures by performing anodization. We successfully grew TiO2 films with different thicknesses by one-step anodization in ethylene glycol containing NH4F and H2O at applied voltage differences ranging from 10 V to 100 V for various anodization durations. We found that the thicknesses of TiO2 nanostructures depend on the anodization voltage at different anodization times. Therefore, we tested the feasibility of applying machine learning algorithms to predict the deformation of TiO2. As the characteristics of TiO2 changed with the experimental conditions, we classified its surface pore structure into two categories and four groups. For the classification based on granularity, we assessed layer creation, roughness, pore creation, and pore height. We applied eight machine learning techniques to both binary and multiclass classification. For binary classification, the random forest and gradient boosting algorithms had relatively high performance. However, all eight algorithms had scores higher than 0.93, which signifies high predictive performance in estimating the presence of pores. In contrast, decision tree and three ensemble methods had relatively higher performance for multiclass classification, with an accuracy rate greater than 0.79. The weakest algorithm was k-nearest neighbors for both binary and multiclass classification. We believe that these results show that machine learning techniques can be applied to predict surface quality improvement, leading to smart manufacturing technology to better control color appearance, super-hydrophobicity, super-hydrophilicity, or battery efficiency.

Near real-time twitter spam detection with machine learning techniques

International Journal of Computers and Applications · doi:10.1080/1206212x.2020.1751387 · 2020 · pp. 1-11 · Cited By ~ 1

Author(s): Nan Sun, Guanjun Lin, Junyang Qiu, Paul Rimba

Keyword(s): Machine Learning; Real Time; Machine Learning Techniques; Spam Detection; Learning Techniques

Predictors of outpatients’ no-show: big data analytics using apache spark

Journal of Big Data · doi:10.1186/s40537-020-00384-9 · 2020 · Vol 7 (1)

Author(s): Tahani Daghistani, Huda AlGhamdi, Riyad Alshammari, Raed H. AlHazme

Keyword(s): Machine Learning; Big Data; Negative Impact; Big Data Analytics; Quality Of Healthcare; Machine Learning Techniques; Gradient Boosting; Healthcare Organizations; Data Framework; Learning Techniques

Outpatients who fail to attend their appointments have a negative impact on healthcare outcomes. Healthcare organizations thus face new opportunities, one of which is to improve the quality of healthcare. The main challenge is predictive analysis using techniques capable of handling the large volume of data generated. We propose a big data framework for identifying outpatients’ no-shows via feature engineering and machine learning (MLlib) on the Spark platform. This study evaluates the performance of five machine learning techniques using data on 2,011,813 outpatient visits. Across several experiments and different validation methods, Gradient Boosting (GB) performed best, increasing accuracy and ROC to 79% and 81%, respectively. In addition, we show that exploring and evaluating the performance of the machine learning models using various evaluation methods is critical, as the accuracy of prediction can differ significantly. The aim of this paper is to explore factors that affect the no-show rate and can be used to formulate predictions using big data machine learning techniques.
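
The "ROC" figure in the abstract above is presumably the area under the ROC curve (AUC), although the abstract does not say so explicitly. Under that assumption, the following minimal Python sketch shows one way to read the number: AUC is the fraction of (no-show, show) pairs in which the no-show receives the higher predicted score. The labels and scores are invented for illustration, and `roc_auc` is a hypothetical helper, not part of Spark MLlib.

```python
def roc_auc(labels, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Invented example: 1 = no-show, 0 = attended; scores are model outputs.
labels = [1, 0, 1, 0, 0, 1]
scores = [0.9, 0.4, 0.35, 0.2, 0.6, 0.8]
auc = roc_auc(labels, scores)
print(f"AUC = {auc:.3f}")
```

The pairwise loop is quadratic and is meant only to make the definition concrete; at the scale of millions of visits a sort-based or distributed computation would be needed.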

Near real-time air quality forecasts using the NASA GEOS model

doi:10.5194/egusphere-egu21-13587 · 2021

Author(s): K. Emma Knowland, Christoph Keller, Krzysztof Wargan, Brad Weir, Pamela Wales, ...

Keyword(s): Machine Learning; Air Quality; Real Time; Weather Forecasting; Atmospheric Composition; Machine Learning Techniques; Learning Techniques; Wide Range; Reactive Trace Gases; Assimilation System

NASA's Global Modeling and Assimilation Office (GMAO) produces high-resolution global forecasts for weather, aerosols, and air quality. The NASA Goddard Earth Observing System (GEOS) model has been expanded to provide global near-real-time 5-day forecasts of atmospheric composition at an unprecedented horizontal resolution of 0.25 degrees (~25 km). This composition forecast system (GEOS-CF) combines the operational GEOS weather forecasting model with the state-of-the-science GEOS-Chem chemistry module (version 12) to provide detailed analysis of a wide range of air pollutants such as ozone, carbon monoxide, nitrogen oxides, and fine particulate matter (PM2.5). Satellite observations are assimilated into the system for improved representation of weather and smoke. The assimilation system is being expanded to include chemically reactive trace gases. We discuss current capabilities of the GEOS Constituent Data Assimilation System (CoDAS) to improve atmospheric composition modeling and possible future directions, notably incorporating new observations (TROPOMI, geostationary satellites) and machine learning techniques. We show how machine learning techniques can be used to correct for sub-grid-scale variability, which further improves model estimates at a given observation site.

Artificial Intelligence and Data Analytics for Virtual Flow Metering

doi:10.2118/204662-ms · 2021

Author(s): Anton Gryzlov, Liliya Mironova, Sergey Safonov, Muhammad Arsalan

Keyword(s): Machine Learning; Multiphase Flow; Real Time; Production Control; Measurement Data; Sufficient Accuracy; Operating Conditions; Machine Learning Techniques; Flow Metering; Oil Gas

Modern reservoir management has recently encountered new opportunities in production control and optimization strategies. These strategies in turn rely on the availability of monitoring equipment, which is used to obtain production rates in real time with sufficient accuracy. In particular, a multiphase flow meter is a device for measuring the individual rates of oil, gas, and water from a well in real time without separating the fluid phases. Several technologies are currently available on the market, but multiphase flow meters are generally incapable of handling all ranges of operating conditions with satisfactory accuracy, in addition to being expensive to maintain. Virtual Flow Metering (VFM) is a mathematical technique for the indirect estimation of oil, gas, and water flow rates produced from a well. This method uses more readily available data from conventional sensors, such as downhole pressure and temperature gauges, and calculates the multiphase rates by combining physical multiphase models, various measurement data, and an optimization algorithm. In this work, a brief overview of virtual metering methods is presented, followed by the application of several advanced machine learning techniques to a specific case of multiphase production monitoring in a highly dynamic wellbore. The predictive capabilities of different types of machine learning instruments are explored using model-simulated production data. The effect of measurement noise on the quality of the estimates is also considered. The presented results demonstrate that data-driven methods are capable of predicting multiphase flow rates with sufficient accuracy and can be considered a back-up solution for a conventional multiphase meter.

Machine Learning Techniques for Air Quality Forecasting and Study on Real-Time Air Quality Monitoring

2017 International Conference on Computing, Communication, Control and Automation (ICCUBEA) · doi:10.1109/iccubea.2017.8463746 · 2017 · Cited By ~ 1

Author(s): Varsha Hable-Khandekar, Pravin Srinath

Keyword(s): Machine Learning; Air Quality; Real Time; Quality Monitoring; Machine Learning Techniques; Air Quality Monitoring; Learning Techniques; Air Quality Forecasting

A robust intrusion detection system using machine learning techniques for MANET

International Journal of Knowledge-based and Intelligent Engineering Systems · doi:10.3233/kes-200047 · 2020 · Vol 24 (3) · pp. 253-260

Author(s): N. Ravi, G. Ramachandran

Keyword(s): Machine Learning; Intrusion Detection; Mobile Computing; Mobile Networks; Intrusion Detection System; Detection System; Mobile Network; Machine Learning Techniques; Test Bed; Learning Techniques

Recent advances in technologies such as Cloud and the Internet of Things have led to increased usage of mobile computing. Present-day mobile computing is highly sophisticated and continues to advance rapidly. However, present-day mobile networks suffer from both internal and external intrusions. The existing security systems that protect mobile networks are incapable of detecting recent attacks. Further, the existing security systems depend entirely on traditional signature-based and rule-based approaches. Recent attacks have the property of not fluctuating in behaviour during the attack. Hence, a robust Intrusion Detection System (IDS) is desirable. To address this issue, this paper proposes a robust IDS using Machine Learning Techniques (MLT). The key to using MLT is to utilize the power of ensembles. The ensemble of classifiers used in this paper includes Random Forest (RF), KNN, Naïve Bayes (NB), etc. The proposed IDS is experimentally tested and validated using a secure test bed. The experimental results also confirm that the proposed IDS is robust enough to withstand and detect any form of intrusion, and it is also noted that the proposed IDS outperforms the state-of-the-art IDS with more than 95% accuracy.

Prediction of probable backorder scenarios in the supply chain using Distributed Random Forest and Gradient Boosting Machine learning techniques

Journal of Big Data · doi:10.1186/s40537-020-00345-2 · 2020 · Vol 7 (1) · Cited By ~ 1

Author(s): Samiul Islam, Saman Hassanzadeh Amin

Keyword(s): Machine Learning; Supply Chain; Random Forest; Machine Learning Techniques; Gradient Boosting; Learning Techniques; Gradient Boosting Machine

Estimating Warehouse Rental Price using Machine Learning Techniques

International Journal of Computers Communications & Control · doi:10.15837/ijccc.2018.2.3034 · 2018 · Vol 13 (2) · pp. 235-250 · Cited By ~ 3

Author(s): Yixuan Ma, Zhenji Zhang, Alexander Ihler, Baoxiang Pan

Keyword(s): Machine Learning; Random Forest; Real Estate; Rapid Development; Supply And Demand; Machine Learning Techniques; Gradient Boosting; Logistics Industry; Real Estate Price; Learning Techniques

Boosted by the growing logistics industry and digital transformation, the shared warehouse market is undergoing rapid development. Both the supply and demand sides of the warehouse rental business face market perturbations brought about by unprecedented peer competition and information transparency. A key question faced by the participants is how to price warehouses in the open market. To understand the pricing mechanism, we built a real-world warehouse dataset using data collected from classified advertisement websites. Based on the dataset, we applied machine learning techniques to relate warehouse price to its relevant features, such as warehouse size, location, and nearby real estate price. Four candidate models are used here: Linear Regression, Regression Tree, Random Forest Regression, and Gradient Boosting Regression Trees. The case study in the Beijing area shows that warehouse rent is closely related to location and land price. Models considering multiple factors estimate warehouse rent with better skill than single-factor estimation. Additionally, tree models perform better than the linear model, with the best model (Random Forest) achieving a correlation coefficient of 0.57 on the test set. Deeper investigation of feature importance shows that distance from the city center plays the most important role in determining warehouse price in Beijing, followed by nearby real estate price and warehouse size.
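
The correlation coefficient quoted above is presumably the Pearson correlation between predicted and observed rents on the test set; the abstract does not name the variant. A minimal Python sketch under that assumption follows. The rent values are invented for illustration, and `pearson_r` is a hypothetical helper rather than anything taken from the paper.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of cross-deviations and sums of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Invented test-set rents (arbitrary units): observed vs. model-predicted.
observed = [30.0, 45.0, 50.0, 70.0]
predicted = [35.0, 40.0, 55.0, 60.0]
r = pearson_r(observed, predicted)
print(f"r = {r:.2f}")
```

A value of 1 would mean the predictions track the observed rents perfectly up to a linear rescaling, so the reported 0.57 indicates a moderate linear association.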
