Using eXtreme Gradient BOOSTing to Predict Changes in Tropical Cyclone Intensity over the Western North Pacific

Atmosphere ◽  
2019 ◽  
Vol 10 (6) ◽  
pp. 341 ◽  
Author(s):  
Qingwen Jin ◽  
Xiangtao Fan ◽  
Jian Liu ◽  
Zhuxin Xue ◽  
Hongdeng Jian

Coastal cities in China are frequently hit by tropical cyclones (TCs), which result in tremendous loss of life and property. Even though the capability of numerical weather prediction models to forecast and track TCs has considerably improved in recent years, forecasting the intensity of a TC is still very difficult; thus, it is necessary to improve the accuracy of TC intensity prediction. To this end, we established a series of predictors using the Best Track TC dataset to predict the intensity of TCs in the Western North Pacific with an eXtreme Gradient BOOSTing (XGBoost) model. The climatology and persistence factors, environmental factors, brainstorm features, intensity categories, and TC months are considered as inputs for the models, while the output is the TC intensity. The performance of the XGBoost model was tested for very strong TCs such as Hato (2017), Rammasun (2014), Mujigae (2015), and Hagupit (2014). The results obtained show that the combination of inputs chosen was optimal for predicting TC intensification with lead times of 6, 12, 18, and 24 h. Furthermore, the mean absolute error (MAE) of the XGBoost model was much smaller than that of a backpropagation neural network (BPNN) used to predict TC intensity. The MAEs of the forecasts with 6, 12, 18, and 24 h lead times for the test samples were 1.61, 2.44, 3.10, and 3.70 m/s, respectively, for the XGBoost model. The results indicate that the XGBoost model developed in this study can improve TC intensity forecast accuracy and can be considered a better alternative to conventional operational forecast models for TC intensity prediction.
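The boosting principle behind XGBoost-style models — fitting each new tree to the residuals of the current ensemble — can be sketched in pure Python with decision stumps. This is a toy illustration on made-up data; none of the paper's actual predictors, trees, or hyperparameters are reproduced here.

```python
def fit_stump(X, residuals):
    """Find the single (feature, threshold) split minimizing squared error."""
    best = None
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X})[:-1]:
            left = [r for row, r in zip(X, residuals) if row[j] <= t]
            right = [r for row, r in zip(X, residuals) if row[j] > t]
            lm, rm = sum(left) / len(left), sum(right) / len(right)
            sse = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
            if best is None or sse < best[0]:
                best = (sse, j, t, lm, rm)
    _, j, t, lm, rm = best
    return lambda row: lm if row[j] <= t else rm

def predict(row, base, stumps, lr=0.5):
    """Additive prediction: base value plus shrunken stump outputs."""
    return base + lr * sum(s(row) for s in stumps)

def boost(X, y, n_rounds=20, lr=0.5):
    """Fit an ensemble of stumps, each trained on the current residuals."""
    base = sum(y) / len(y)
    stumps = []
    for _ in range(n_rounds):
        pred = [predict(row, base, stumps, lr) for row in X]
        residuals = [yi - pi for yi, pi in zip(y, pred)]
        stumps.append(fit_stump(X, residuals))
    return base, stumps
```

Each round shrinks the remaining residual, which is why boosted ensembles can fit strongly nonlinear predictor–intensity relationships that a single tree misses.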

2007 ◽  
Vol 22 (3) ◽  
pp. 671-675 ◽  
Author(s):  
Charles R. Sampson ◽  
John A. Knaff ◽  
Edward M. Fukada

Abstract The Systematic Approach Forecast Aid (SAFA) has been in use at the Joint Typhoon Warning Center (JTWC) since the 2000 western North Pacific season. SAFA is a system designed to identify erroneous 72-h track forecasts through predefined error mechanisms associated with numerical weather prediction models. Central to the process is a selective consensus, in which model guidance suspected of having a 72-h error greater than 300 n mi (1 n mi = 1.852 km) is eliminated before the average of the remaining model tracks is calculated. The resultant selective consensus should then provide improved forecasts over the nonselective consensus. In the 5 yr since its introduction into JTWC operations, forecasters have been unable to produce a selective consensus that consistently improves on the nonselective consensus. Also, the rate at which forecasters exercised the selective consensus option dropped from approximately 45% of all forecasts in 2000 to 3% in 2004.
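The selective-consensus step described above — dropping suspect members before averaging the remaining tracks, with a fallback when too few members survive — might be sketched as follows. Function and variable names are illustrative stand-ins, not JTWC's operational code.

```python
NM_TO_KM = 1.852  # one nautical mile in kilometres

def selective_consensus(tracks, suspect):
    """Average 72-h forecast positions after removing suspect members.

    tracks:  {model_name: (lat, lon)} 72-h forecast positions
    suspect: set of model names believed to exceed the error threshold
             (e.g., a suspected 72-h error greater than 300 n mi)
    Falls back to the nonselective consensus if fewer than two members remain.
    """
    kept = {m: p for m, p in tracks.items() if m not in suspect}
    if len(kept) < 2:  # a consensus needs at least two members
        kept = tracks
    lats = [p[0] for p in kept.values()]
    lons = [p[1] for p in kept.values()]
    return sum(lats) / len(lats), sum(lons) / len(lons)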


2006 ◽  
Vol 21 (4) ◽  
pp. 656-662 ◽  
Author(s):  
Charles R. Sampson ◽  
James S. Goerss ◽  
Harry C. Weber

Abstract The Weber barotropic model (WBAR) was originally developed using predefined 850–200-hPa analyses and forecasts from the NCEP Global Forecast System. The WBAR tropical cyclone (TC) track forecast performance was found to be competitive with that of more complex numerical weather prediction (NWP) models in the North Atlantic. As a result, WBAR was revised to incorporate the Navy Operational Global Atmospheric Prediction System (NOGAPS) analyses and forecasts for use at the Joint Typhoon Warning Center (JTWC). The model was also modified to analyze its own storm-dependent deep-layer mean fields from standard NOGAPS pressure levels. Since its operational installation at the JTWC in May 2003, WBAR TC track forecast performance has been competitive with that of other more complex NWP models in the western North Pacific. Its TC track forecast performance, combined with its high availability rate (93%–95%), has warranted its inclusion in the JTWC operational consensus. The impact of WBAR on consensus TC track forecast performance has been positive, and WBAR has added to consensus forecast availability (i.e., having at least two models available to provide a consensus forecast).
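A deep-layer mean of the kind WBAR analyzes is, at its core, a pressure-weighted vertical average of the wind over the steering layer. A minimal sketch using trapezoidal weighting over standard levels follows; the actual storm-dependent weighting WBAR uses is not reproduced here.

```python
def deep_layer_mean(winds):
    """Pressure-weighted mean wind over a deep layer.

    winds: list of (pressure_hPa, u, v) at standard levels, ordered from
    the top of the layer (low pressure) to the bottom (high pressure).
    Each layer between adjacent levels is weighted by its pressure
    thickness, with winds averaged trapezoidally across the layer.
    """
    num_u = num_v = denom = 0.0
    for (p1, u1, v1), (p2, u2, v2) in zip(winds, winds[1:]):
        dp = p2 - p1                      # layer thickness in hPa
        num_u += 0.5 * (u1 + u2) * dp     # trapezoidal layer-mean u
        num_v += 0.5 * (v1 + v2) * dp
        denom += dp
    return num_u / denom, num_v / denom
```

Deeper layers thus contribute in proportion to the mass they contain, which is the usual rationale for pressure weighting in steering-flow calculations.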


2011 ◽  
Vol 26 (1) ◽  
pp. 77-93 ◽  
Author(s):  
Hsiao-Chung Tsai ◽  
Kuo-Chen Lu ◽  
Russell L. Elsberry ◽  
Mong-Ming Lu ◽  
Chung-Hsiung Sui

Abstract An automated technique has been developed for the detection and tracking of tropical cyclone–like vortices (TCLVs) in numerical weather prediction models, and especially in ensemble-based models. A TCLV is detected in the model grid when selected dynamic and thermodynamic fields meet specified criteria. A backward-and-forward extension from the mature stage of the track is utilized to complete the track. In addition, a fuzzy logic approach is utilized to calculate the TCLV fuzzy combined-likelihood value (TFCV) to represent the TCLV characteristics in the ensemble forecast outputs. The TCLV tracks and TFCV maps are intended primarily as an evaluation tool for operational forecasters. It is demonstrated that this algorithm efficiently extracts western North Pacific TCLV information from the vast amount of ensemble data from the NCEP Global Ensemble Forecast System (GEFS). The predictability of typhoon formation and activity during June–December 2008 is also evaluated. The TCLV track numbers and TFCV averages around the formation locations are more skillful during the 0–96-h period than for the 102–384-h forecasts. Compared to weak tropical cyclones (TCs; maximum intensity ≤ 50 kt), the storms that eventually become stronger TCs do have larger TFCVs. Depending on the specified domain size and the number of ensemble tracks used to define a forecast event, some skill is indicated in predicting named TC activity. Although this evaluation with the 2008 typhoon season indicates some potential, an evaluation with a larger sample is necessary to statistically verify the reliability of the GEFS forecasts.
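The fuzzy-logic combination behind a TFCV-style value can be illustrated by mapping each detection criterion onto a [0, 1] membership and combining the memberships. The thresholds, variables, and equal-weight average below are illustrative stand-ins, not the paper's actual criteria.

```python
def ramp(x, lo, hi):
    """Linear fuzzy membership: 0 at or below lo, 1 at or above hi."""
    if x <= lo:
        return 0.0
    if x >= hi:
        return 1.0
    return (x - lo) / (hi - lo)

def combined_likelihood(vorticity, warm_core_anomaly, wind_speed):
    """Combine fuzzy memberships of three TCLV-like criteria into one value.

    Instead of a hard pass/fail on each criterion, marginal cases
    contribute partially, so near-threshold vortices are not discarded.
    """
    members = [
        ramp(vorticity, 1e-5, 1e-4),        # low-level relative vorticity (s^-1)
        ramp(warm_core_anomaly, 0.0, 2.0),  # upper-level warm-core anomaly (K)
        ramp(wind_speed, 8.0, 17.0),        # near-surface wind speed (m/s)
    ]
    return sum(members) / len(members)
```

Averaging such values across ensemble members then yields a map-ready likelihood field of the kind the abstract describes for forecaster evaluation.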


2018 ◽  
Vol 32 (2) ◽  
pp. 309-334
Author(s):  
J. G. McLay ◽  
E. A. Hendricks ◽  
J. Moskaitis

Abstract A variant of downscaling is devised to explore the properties of tropical cyclones (TCs) that originate in the open ocean of the western North Pacific Ocean (WestPac) region under extreme climates. This variant applies a seeding strategy in large-scale environments simulated by phase 5 of the Coupled Model Intercomparison Project (CMIP5) climate-model integrations together with embedded integrations of the Coupled Ocean–Atmosphere Mesoscale Prediction System for Tropical Cyclones (COAMPS-TC), an operational, high-resolution, nonhydrostatic, convection-permitting numerical weather prediction (NWP) model. Test periods for the present day and late twenty-first century are sampled from two different integrations for the representative concentration pathway (RCP) 8.5 forcing scenario. Then seeded simulations for the present-day period are contrasted with similar seeded simulations for the future period. Reinforcing other downscaling studies, the seeding results suggest that the future environments are notably more conducive to high-intensity TC activity in the WestPac. Specifically, the future simulations yield considerably more TCs that exceed 96-kt (1 kt ≈ 0.5144 m s⁻¹) intensity, and these TCs exhibit notably greater average life cycle maximum intensity and tend to spend more time above the 96-kt intensity threshold. Also, the future simulations yield more TCs that make landfall at >64-kt intensity, and the average landfall intensity of these storms is appreciably greater. These findings are supported by statistical bootstrap analysis as well as by a supplemental sensitivity analysis. Accounting for COAMPS-TC intensity forecast bias using a quantile-matching approach, the seeded simulations suggest that the potential maximum western North Pacific TC intensities in the future extreme climate may be approximately 190 kt.
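The quantile-matching bias correction mentioned above can be sketched empirically: map a model value to the observed value at the same rank in sorted reference samples. This is a simplified stand-in for the paper's implementation, which is not reproduced here.

```python
from bisect import bisect_left

def quantile_match(value, model_sorted, obs_sorted):
    """Map a model value to the observed value at the same empirical quantile.

    model_sorted and obs_sorted are sorted reference samples of equal
    length (e.g., model and observed intensity distributions over a
    calibration period). A biased model value is replaced by the observed
    value occupying the same rank, removing systematic over- or
    under-forecasting while preserving the value's position in the
    distribution.
    """
    n = len(model_sorted)
    rank = min(bisect_left(model_sorted, value), n - 1)
    return obs_sorted[rank]
```

Because the correction is rank-based, it adjusts the tails of the intensity distribution (where forecast bias matters most for maximum-intensity estimates) as well as the center.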


Energies ◽  
2021 ◽  
Vol 14 (22) ◽  
pp. 7587
Author(s):  
Conor Lynch ◽  
Christian O’Leary ◽  
Preetham Govind Kolar Sundareshan ◽  
Yavuz Akin

In response to the inherent challenges of generating cost-effective electricity consumption schedules for dynamic systems, this paper espouses the use of Gradient Boosting Machine (GBM)-based models for electricity price forecasting. These models are applied to data streams from the Irish electricity market and achieve favorable results relative to the current state of the art. Presently, electricity prices are published 10 h in advance of the trade day of interest. Using the forecasting methodology outlined in this paper, an estimate of these prices can be made available one day in advance of the official price publication, thus extending the time available to plan electricity utilization from the grid as cost-effectively as possible. Extreme Gradient Boosting Machine (XGBM) models achieved a Mean Absolute Error (MAE) of 9.93 for data from 30 September 2018 to 12 December 2019, an 11.4% improvement on the previous state of the art. LGBM models achieved an MAE of 9.58 on more recent data: the full year of 2020.
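The evaluation setup — day-ahead forecasts scored by MAE, with improvement reported relative to a baseline — can be sketched with a naive persistence baseline (same hour, previous day) and a relative-improvement measure. This is illustrative only and is not the paper's GBM pipeline.

```python
def mae(pred, actual):
    """Mean absolute error between forecast and realized prices."""
    return sum(abs(p - a) for p, a in zip(pred, actual)) / len(actual)

def day_ahead_persistence(prices, horizon=24):
    """Naive baseline: forecast each hour with the same hour one day earlier.

    Returns (forecast, actual) pairs aligned so that forecast[i] is the
    price `horizon` hours before actual[i].
    """
    return prices[:-horizon], prices[horizon:]

def improvement(model_mae, baseline_mae):
    """Relative MAE improvement of a model over a baseline (0.2 = 20%)."""
    return (baseline_mae - model_mae) / baseline_mae
```

Any candidate model (GBM or otherwise) then has to beat the persistence MAE to demonstrate genuine forecasting skill rather than mere daily seasonality.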


2019 ◽  
Vol 98 (10) ◽  
pp. 1088-1095 ◽  
Author(s):  
J. Krois ◽  
C. Graetz ◽  
B. Holtfreter ◽  
P. Brinkmann ◽  
T. Kocher ◽  
...  

Prediction models learn patterns from available data (training) and are then validated on new data (testing). Prediction modeling is increasingly common in dental research. We aimed to evaluate how different model development and validation steps affect the predictive performance of tooth loss prediction models of patients with periodontitis. Two independent cohorts (627 patients, 11,651 teeth) were followed over a mean ± SD of 18.2 ± 5.6 y (Kiel cohort) and 6.6 ± 2.9 y (Greifswald cohort). Tooth loss and 10 patient- and tooth-level predictors were recorded. The impact of different model development and validation steps was evaluated: 1) model complexity (logistic regression, recursive partitioning, random forest, extreme gradient boosting), 2) sample size (full data set or 10%, 25%, or 75% of cases dropped at random), 3) prediction periods (maximum 10, 15, or 20 y or uncensored), and 4) validation schemes (internal or external by centers/time). Tooth loss was generally a rare event (880 teeth were lost). All models showed limited sensitivity but high specificity. Patients’ age and tooth loss at baseline as well as probing pocket depths showed high variable importance. More complex models (random forest, extreme gradient boosting) had no consistent advantages over simpler ones (logistic regression, recursive partitioning). Internal validation (in sample) overestimated the predictive power (area under the curve up to 0.90), while external validation (out of sample) found lower areas under the curve (range 0.62 to 0.82). Reducing the sample size decreased the predictive power, particularly for more complex models. Censoring the prediction period had only limited impact. When the model was trained in one period and tested in another, model outcomes were similar to the base case, indicating temporal validation as a valid option. No model showed higher accuracy than the no-information rate.
In conclusion, none of the developed models would be useful in a clinical setting, despite high accuracy. During modeling, rigorous development and external validation should be applied and reported accordingly.
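The gap between in-sample and out-of-sample AUC highlighted above can be illustrated with the rank-based (Mann–Whitney) estimator of the area under the ROC curve, which is how such figures are typically computed; a minimal sketch:

```python
def auc(scores, labels):
    """Area under the ROC curve via the rank-sum (Mann-Whitney) identity.

    AUC equals the probability that a randomly chosen positive case
    (label 1) receives a higher score than a randomly chosen negative
    case (label 0), with ties counted as half.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

Computing this once on the training cohort and again on a held-out cohort makes the overfitting gap the abstract reports (up to 0.90 internally versus 0.62 to 0.82 externally) directly measurable.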


1988 ◽  
Vol 128 ◽  
pp. 285-286
Author(s):  
R. D. Rosen ◽  
D. A. Salstein ◽  
T. Nehrkorn ◽  
J. O. Dickey ◽  
T. M. Eubanks ◽  
...  

A new approach to forecasting changes in length-of-day (δl.o.d.) with lead times from one to ten days is examined. The approach is based on the high correlation that has been shown to exist between high-frequency changes in l.o.d. and those in the atmosphere's angular momentum (M). Because forecasts of tropospheric values of M can be calculated from the zonal wind fields produced by operational numerical weather prediction models, it seems worth investigating whether these forecasts are sufficiently skillful to infer the evolution of δl.o.d. Here, we examine the quality of M forecasts made by the Medium Range Forecast (MRF) model of the U.S. National Meteorological Center (NMC). By comparing these forecasts against those based on a simple model of persistence, we find that skillful forecasts of M are being achieved on average by the MRF, although there has been much month-to-month variability in forecast quality. Overall, our results indicate that for prediction lead times of 1–10 days, dynamically based forecasts of δl.o.d. represent a viable alternative to the empirical approaches currently in use.
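Skill relative to persistence, as used in the comparison above, can be expressed as a simple error-based skill score; the MAE-based formulation below is an illustrative choice, not necessarily the authors' exact metric.

```python
def mae(pred, obs):
    """Mean absolute error of a forecast series against observations."""
    return sum(abs(p - o) for p, o in zip(pred, obs)) / len(obs)

def skill_vs_persistence(forecast, obs, analysis_at_t0):
    """Skill score relative to a persistence baseline.

    The baseline simply carries the last analysed value forward over the
    whole forecast range. Returns 1 for a perfect forecast, 0 for a
    forecast no better than persistence, and negative values for worse.
    """
    persistence = [analysis_at_t0] * len(obs)
    return 1.0 - mae(forecast, obs) / mae(persistence, obs)
```

Beating persistence is the minimal bar here: only if the model forecasts of M clear it does inferring δl.o.d. from the model offer any advantage over empirical extrapolation.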


2020 ◽  
Vol 71 (16) ◽  
pp. 2079-2088 ◽  
Author(s):  
Kun Wang ◽  
Peiyuan Zuo ◽  
Yuwei Liu ◽  
Meng Zhang ◽  
Xiaofang Zhao ◽  
...  

Abstract Background This study aimed to develop mortality-prediction models for patients with coronavirus disease-2019 (COVID-19). Methods The training cohort included consecutive COVID-19 patients at the First People’s Hospital of Jiangxia District in Wuhan, China, from 7 January 2020 to 11 February 2020. We selected baseline data through the stepwise Akaike information criterion and an ensemble XGBoost (extreme gradient boosting) model to build mortality-prediction models. We then validated these models on randomly collected COVID-19 patients in Union Hospital, Wuhan, from 1 January 2020 to 20 February 2020. Results A total of 296 COVID-19 patients were enrolled in the training cohort; 19 died during hospitalization and 277 were discharged from the hospital. The clinical model, developed using age, history of hypertension, and coronary heart disease, showed an area under the curve (AUC) of 0.88 (95% confidence interval [CI], .80–.95); threshold, −2.6551; sensitivity, 92.31%; specificity, 77.44%; and negative predictive value (NPV), 99.34%. The laboratory model, developed using age, high-sensitivity C-reactive protein, peripheral capillary oxygen saturation, neutrophil and lymphocyte counts, d-dimer, aspartate aminotransferase, and glomerular filtration rate, had significantly stronger discriminatory power than the clinical model (P = .0157), with an AUC of 0.98 (95% CI, .92–.99); threshold, −2.998; sensitivity, 100.00%; specificity, 92.82%; and NPV, 100.00%. In the subsequent validation cohort (N = 44), the AUC (95% CI) was 0.83 (.68–.93) and 0.88 (.75–.96) for the clinical model and laboratory model, respectively. Conclusions We developed 2 predictive models for in-hospital mortality of patients with COVID-19 in Wuhan that were validated in patients from another center.
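The reported sensitivity, specificity, and NPV at a fixed score threshold follow directly from the confusion-matrix counts. A minimal sketch: the threshold −2.6551 comes from the abstract, but the scores and outcomes below are hypothetical and the function names are illustrative.

```python
def confusion_metrics(scores, outcomes, threshold=-2.6551):
    """Sensitivity, specificity, and NPV for scores against 0/1 outcomes.

    A patient is classified as high-risk when their model score exceeds
    the threshold; outcomes use 1 for death and 0 for discharge.
    """
    tp = sum(1 for s, y in zip(scores, outcomes) if y == 1 and s > threshold)
    fn = sum(1 for s, y in zip(scores, outcomes) if y == 1 and s <= threshold)
    tn = sum(1 for s, y in zip(scores, outcomes) if y == 0 and s <= threshold)
    fp = sum(1 for s, y in zip(scores, outcomes) if y == 0 and s > threshold)
    sensitivity = tp / (tp + fn)   # fraction of deaths flagged high-risk
    specificity = tn / (tn + fp)   # fraction of survivors flagged low-risk
    npv = tn / (tn + fn)           # reliability of a low-risk classification
    return sensitivity, specificity, npv
```

With deaths rare (19 of 296 in the training cohort), a very high NPV is the clinically meaningful guarantee: a low-risk classification is almost always correct.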

