Robustness of Parametric and Nonparametric Fitting Procedures of Tree-Stem Taper with Alternative Definitions for Validation Data

Sheng-I Yang; Harold E Burkhart

doi:10.1093/jofore/fvaa036

Journal of Forestry ◽

10.1093/jofore/fvaa036 ◽

2020 ◽

Vol 118 (6) ◽

pp. 576-583

Author(s):

Sheng-I Yang ◽

Harold E Burkhart

Keyword(s):

Loblolly Pine ◽

Model Development ◽

Data Partitioning ◽

Parametric Models ◽

Validation Data ◽

Data Set ◽

Stem Taper ◽

Computationally Intensive ◽

Tree Stem ◽

Selection Of

Abstract: This study aims to evaluate the robustness of parametric and nonparametric procedures using alternative definitions of validation data for loblolly pine. Specifically, four data division strategies were implemented: random selection of one-third of the trees in the data set, selection of the smallest one-third of the trees by diameter at breast height (DBH), selection of the middle third of the trees by DBH, and selection of the largest third of the trees by DBH. Results indicate that tree taper was predicted reasonably well by both procedures when the smallest, medium-sized, or randomly selected trees were withheld for validation. However, when the largest trees were withheld for validation, diameters predicted by the nonparametric random forest algorithm were considerably less accurate than those predicted by the parametric models, especially for diameters near the tree top. When extrapolation is anticipated, a carefully designed data-partitioning strategy should provide some protection against poor results for given prediction objectives.

Study Implications: Parametric tree-stem taper models have been widely applied in forestry. Recently, nonparametric methods with computationally intensive algorithms were proposed for estimating tree taper, but reliability of the methods has not been explicitly examined. In practice, models are commonly applied to predict unknown populations, which may vary from the observations used in model development. This study provides insights for natural resource and forest managers to select appropriate validation procedures when developing models for predicting tree-stem taper and examining robustness of parametric and nonparametric fitting of tree-stem taper under varying levels of interpolation/extrapolation from fitting to validation of data.
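
The four data division strategies described in this abstract can be sketched roughly as follows. This is an illustration only, not the authors' code: the function name `partition_by_dbh`, the `(tree_id, dbh)` tuple layout, the integer rounding of "one-third", and the way the middle third is centred are all assumptions.

```python
import random


def partition_by_dbh(trees, strategy, seed=0):
    """Split trees into (fitting, validation) lists, withholding one-third.

    trees    -- list of (tree_id, dbh) tuples (hypothetical layout)
    strategy -- 'random', 'small', 'medium', or 'large'
    """
    n_val = len(trees) // 3  # assumed rounding rule for "one-third"
    if strategy == "random":
        rng = random.Random(seed)
        val_ids = {tree_id for tree_id, _ in rng.sample(trees, n_val)}
    else:
        ranked = sorted(trees, key=lambda t: t[1])  # ascending DBH
        start = {
            "small": 0,
            "medium": (len(ranked) - n_val) // 2,
            "large": len(ranked) - n_val,
        }[strategy]
        val_ids = {tree_id for tree_id, _ in ranked[start:start + n_val]}
    fitting = [t for t in trees if t[0] not in val_ids]
    validation = [t for t in trees if t[0] in val_ids]
    return fitting, validation
```

Withholding the largest trees (`strategy="large"`) forces the fitted model to extrapolate beyond the DBH range it was trained on, which is the case where the abstract reports the random forest performing worst.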

The Stunting Tool for Early Prevention: development and external validation of a novel tool to predict risk of stunting in children at 3 years of age

BMJ Global Health ◽

10.1136/bmjgh-2019-001801 ◽

2019 ◽

Vol 4 (6) ◽

pp. e001801

Author(s):

Sarah Hanieh ◽

Sabine Braat ◽

Julie A Simpson ◽

Tran Thi Thu Ha ◽

Thach D Tran ◽

...

Keyword(s):

Characteristic Curve ◽

External Validation ◽

Model Development ◽

Validation Data ◽

Data Set ◽

Growth Faltering ◽

Development Data ◽

Gestational Age At Birth ◽

Development And Validation ◽

The Impact

Introduction: Globally, an estimated 151 million children under 5 years of age still suffer from the adverse effects of stunting. We sought to develop and externally validate an early life predictive model that could be applied in infancy to accurately predict risk of stunting in preschool children. Methods: We conducted two separate prospective cohort studies in Vietnam that intensively monitored children from early pregnancy until 3 years of age. They included 1168 and 475 live-born infants for model development and validation, respectively. Logistic regression on child stunting at 3 years of age was performed for model development, and the predicted probabilities for stunting were used to evaluate the performance of this model in the validation data set. Results: Stunting prevalence was 16.9% (172 of 1015) in the development data set and 16.4% (70 of 426) in the validation data set. Key predictors included in the final model were paternal and maternal height, maternal weekly weight gain during pregnancy, infant sex, gestational age at birth, and infant weight and length at 6 months of age. The area under the receiver operating characteristic curve in the validation data set was 0.85 (95% Confidence Interval, 0.80–0.90). Conclusion: This tool applied to infants at 6 months of age provided valid prediction of risk of stunting at 3 years of age using a readily available set of parental and infant measures. Further research is required to examine the impact of preventive measures introduced at 6 months of age on those identified as being at risk of growth faltering at 3 years of age.
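
The area under the receiver operating characteristic curve reported here has a simple rank interpretation: it is the probability that a randomly chosen stunted child receives a higher predicted probability than a randomly chosen non-stunted child. The sketch below computes it that way from validation-set predictions. The function name `roc_auc` and the 0/1 outcome coding are assumptions, and the abstract does not say how the study computed its AUC or confidence interval.

```python
def roc_auc(y_true, scores):
    """Area under the ROC curve via pairwise comparison (ties count 0.5).

    y_true -- iterable of 0/1 outcomes (1 = event, e.g. stunted)
    scores -- predicted probabilities from the fitted model
    """
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))
```

The pairwise loop is quadratic in sample size, which is acceptable for a few hundred validation records; a rank-based formula would be preferable for large data sets.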

Expression signature based on TP53 target genes doesn't predict response to TP53-MDM2 inhibitor in wild type TP53 tumors

eLife ◽

10.7554/elife.10279 ◽

2015 ◽

Vol 4 ◽

Cited By ~ 11

Author(s):

Dmitriy Sonkin

Keyword(s):

Cell Lines ◽

Target Genes ◽

Gene Signature ◽

Wild Type ◽

Validation Data ◽

Data Set ◽

Expression Signature ◽

Selection Of Patients ◽

Mdm2 Inhibitor ◽

Selection Of

A number of TP53-MDM2 inhibitors are currently under investigation as therapeutic agents in a variety of clinical trials in patients with TP53 wild type tumors. Not all wild type TP53 tumors are sensitive to such inhibitors. In an attempt to improve selection of patients with TP53 wild type tumors, an mRNA expression signature based on 13 TP53 transcriptional target genes was recently developed (Jeay et al. 2015). Careful reanalysis of TP53 status in the study validation data set of cancer cell lines considered to be TP53 wild type detected TP53 inactivating alterations in 23% of cell lines. The subsequent reanalysis of the remaining TP53 wild type cell lines clearly demonstrated that unfortunately the 13-gene signature cannot predict response to TP53-MDM2 inhibitor in TP53 wild type tumors.

Rhizoctonia Web Blight Development on Container-Grown Azalea in Relation to Time and Environmental Factors

Plant Disease ◽

10.1094/pdis-94-7-0891 ◽

2010 ◽

Vol 94 (7) ◽

pp. 891-897 ◽

Cited By ~ 8

Author(s):

Warren E. Copes ◽

Harald Scherm

Keyword(s):

Disease Progression ◽

Model Development ◽

Disease Onset ◽

Classification And Regression Tree ◽

Weather Variables ◽

Validation Data ◽

Data Set ◽

Cart Analysis ◽

Disease Progress ◽

Web Blight

Rhizoctonia web blight, caused by binucleate Rhizoctonia spp., is an annual problem in the southern United States on container-grown azaleas (Rhododendron spp.) that receive daily irrigation. Disease progress was assessed weekly from mid-May to mid-September on nursery-grown plants at three locations in Mississippi and Alabama in 2006, 2007, and 2008. Disease onset, defined as the appearance of blighted leaves at the exterior canopy of at least one plant, occurred on average on 20 July, and calendar date was a more precise predictor of disease onset than several combined time–weather variables. Disease progress curves exhibited weekly fluctuations around a typically exponential increase in the mean number of symptomatic leaves per plant until early to mid-September, after which web blight severity leveled off or declined due to disease-induced leaf dehiscence and the appearance of new, asymptomatic leaves. Based on the relative increase in the log-transformed number of infected leaves per plant, weekly assessment periods were classified as having slow (≤0%), intermediate (>0 to <10%), or rapid (≥10% increase) disease progress. Three-day moving averages (MA) of various weather variables were calculated, and lagged values (by 5 days) of the MA were used in an attempt to predict disease progress as slow, intermediate, or rapid. Of the periods assessed as having slow disease progress in the 2006–2007 data set (model development data), 90.6% (29 of 32) met at least one of the following heuristically derived criteria for the lagged MA: min. temperature < 20.0°C, max. temperature > 35.0°C, avg. vapor pressure deficit < 2.50 hPa, or day of the year > 240 (28 August). One or more of these same criteria were met in 5 of 16 (31.2%) assessment periods with rapid disease progress, indicating that periods with slow versus rapid disease progression could be distinguished reasonably well based on weather. Results were similar for the 2008 validation data. 
However, weather variables were not useful in separating periods with either slow or rapid disease progress from those having intermediate progress. Instead, weather variables were most useful when used in a negative-prognosis approach to predict disease progression as being “not rapid” (which includes slow and intermediate periods) or “not slow” (including intermediate and rapid periods). The data set was further analyzed using Classification and Regression Tree (CART) analysis to relate weekly disease progress periods to weather variables. The resulting CART model agreed with the heuristic approach in that temperature variables were more prominent than moisture variables in classifying disease progress periods. With both approaches, satisfactory accuracy was accomplished only with negative-prognoses that classified disease progress periods as not rapid or not slow based on temperature and moisture limits.
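
The lagged moving-average screening described in this abstract could be sketched as below. The thresholds are the ones quoted in the abstract; the trailing alignment of the 3-day window, the helper names (`moving_average`, `lagged`, `meets_slow_criterion`), and the use of `None` for missing values are assumptions made for illustration.

```python
def moving_average(values, window=3):
    """Trailing moving average; positions without a full window are None."""
    out = []
    for i in range(len(values)):
        if i + 1 < window:
            out.append(None)
        else:
            out.append(sum(values[i - window + 1:i + 1]) / window)
    return out


def lagged(series, lag=5):
    """Shift a daily series forward by `lag` days, padding with None."""
    return ([None] * lag + list(series))[:len(series)]


def meets_slow_criterion(tmin_ma, tmax_ma, vpd_ma, day_of_year):
    """True if any heuristic criterion from the abstract is met.

    Inputs are lagged moving averages: minimum and maximum temperature
    in degrees C and vapor pressure deficit in hPa.
    """
    return (
        tmin_ma < 20.0
        or tmax_ma > 35.0
        or vpd_ma < 2.50
        or day_of_year > 240
    )
```

In keeping with the negative-prognosis approach in the abstract, a `True` result is best read as "not rapid" rather than as a positive prediction of slow progress.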

Application of Surrogate Models and Probabilistic Design Methodology to Assess Creep Growth Limit of an Uncooled Turbine Blade

Volume 7A: Structures and Dynamics ◽

10.1115/gt2018-75854 ◽

2018 ◽

Author(s):

Armin Hadadian ◽

Sairam Prabhakar ◽

Bjorn Sjodin ◽

Keith Taylor

Keyword(s):

Turbine Blade ◽

Model Development ◽

Engine Performance ◽

Operating Conditions ◽

Life Assessment ◽

Creep Life ◽

Validation Data ◽

Data Set ◽

Creep Life Assessment ◽

The Impact

Predictive lifing with probabilistic treatment of key variables represents a promising approach to realizing the digital gas turbine of the future. In this paper, we present a predictive model for creep life assessment of an uncooled turbine blade. The model development methodology draws on well-established machine learning principles to develop and validate a surrogate model for creep life from engine performance parameters. Verified creep life results, obtained from 3D non-linear thermo-mechanical finite element simulation for varying engine operating conditions, are used as the basis for model development. The selection of model response surface order is studied over a range of models by evaluating normalized residual error on training and uncorrelated validation data sets. A model that is fully quadratic in the data set features is shown to have excellent predictive capability, yielding nominal creep life predictions to within ± 3% on the validation data set. This work then considers probabilistic techniques to evaluate the impact of uncertainty associated with each key factor on the predicted nominal creep life in order to achieve a mandated life target with a defined probability of failure.

Diagnosis of infection after cardiovascular surgery (DICS): a study protocol for developing and validating a prediction model in prospective observational study

BMJ Open ◽

10.1136/bmjopen-2020-048310 ◽

2021 ◽

Vol 11 (9) ◽

pp. e048310

Author(s):

Hai-Tao Zhang ◽

Xi-Kun Han ◽

Chuang-Shi Wang ◽

He Zhang ◽

Ze-Shi Li ◽

...

Keyword(s):

Predictive Value ◽

Cardiovascular Surgery ◽

Model Development ◽

Area Under The Curve ◽

Electronic Medical Record System ◽

Diagnostic Model ◽

Record System ◽

Validation Data ◽

Data Set ◽

Diagnosis Of Infection

Introduction: Postoperative infection (PI) is one of the main severe complications after cardiovascular surgery. Therefore, antibiotics are routinely used during the first 48 hours after cardiovascular surgery. However, there is no effective method for early diagnosis of infection after cardiovascular surgery, in particular for determining whether postoperative patients need to continue antibiotics beyond the first 48 hours. In this study, we aim to develop and validate a diagnostic model to help identify whether a patient has been infected after surgery and guide the appropriate use of antibiotics. Methods and analysis: In this prospective study, we will develop and validate a diagnostic model to determine whether the patient has a bacterial infection within 48 hours after cardiovascular surgery. Baseline data will be collected through the electronic medical record system. A total of 2700 participants will be recruited (n=2000 for development, n=700 for validation). The primary outcome of the study is new PI during the first 48 hours after cardiovascular surgery. Logistic regression penalised with elastic net regularisation will be used for model development, and bootstrap and k-fold cross-validation aggregation will be performed for internal validation. The derived model will also be externally validated in patients who are consecutively included in another time period (N=700). We will evaluate the calibration and discrimination performance of the model by the Hosmer-Lemeshow goodness-of-fit test and the area under the curve, respectively. 
We will report sensitivity, specificity, positive predictive value and negative predictive value in the validation data set, with a target of 80% sensitivity. Ethics and dissemination: Ethical approval was obtained from the Medical Ethics Committee of Affiliated Nanjing Drum Tower Hospital, Nanjing University Medical College (2020-249-01). Trial registration number: Chinese Clinical Trial Register (www.chictr.org.cn, ChiCTR2000038762); Pre-results.
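
The four validation metrics named in this protocol follow directly from a 2x2 confusion table. The sketch below is a generic illustration, not the protocol's analysis code; the function name `diagnostic_metrics`, the 0/1 coding, and returning NaN for an empty denominator are assumptions.

```python
def diagnostic_metrics(y_true, y_pred):
    """Sensitivity, specificity, PPV and NPV from 0/1 labels and predictions."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)

    def ratio(num, den):
        # NaN signals an undefined metric (no cases in the denominator).
        return num / den if den else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),  # infected patients correctly flagged
        "specificity": ratio(tn, tn + fp),  # uninfected patients correctly cleared
        "ppv": ratio(tp, tp + fp),          # flagged patients truly infected
        "npv": ratio(tn, tn + fn),          # cleared patients truly uninfected
    }
```

Because the model outputs a probability, the classification threshold would have to be chosen so that sensitivity reaches the stated 80% target; the other three metrics then follow from that threshold and from the infection prevalence in the validation set.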

Development and temporal external validation of a simple risk score tool for prediction of outcomes after severe head injury based on admission characteristics from level-1 trauma centre of India using retrospectively collected data

BMJ Open ◽

10.1136/bmjopen-2020-040778 ◽

2021 ◽

Vol 11 (1) ◽

pp. e040778

Author(s):

Vineet Kumar Kamal ◽

Ravindra Mohan Pandey ◽

Deepak Agrawal

Keyword(s):

Hospital Mortality ◽

External Validation ◽

Trauma Centre ◽

Unfavourable Outcome ◽

Motor Score ◽

Validation Data ◽

Data Set ◽

Development Data ◽

Level 1 ◽

Pupillary Reactivity

Objective: To develop and validate a simple risk score chart to estimate the probability of poor outcomes in patients with severe head injury (HI). Design: Retrospective. Setting: Level-1, government-funded trauma centre, India. Participants: Patients with severe HI admitted to the neurosurgery intensive care unit during 19 May 2010–31 December 2011 (n=946) for model development, and data from the same centre with the same inclusion criteria from 1 January 2012 to 31 July 2012 (n=284) for external validation of the model. Outcome(s): In-hospital mortality and unfavourable outcome at 6 months. Results: A total of 39.5% and 70.7% had in-hospital mortality and unfavourable outcome, respectively, in the development data set. Multivariable logistic regression analysis of routinely collected admission characteristics identified the independent predictors, defined as those whose 95% confidence interval (CI) of the odds ratio (OR) did not contain one. For in-hospital mortality, these were age (51–60, >60 years), motor score (1, 2, 4), pupillary reactivity (none), presence of hypotension, basal cistern effaced, and traumatic subarachnoid haemorrhage/intraventricular haematoma. For unfavourable outcome, they were age (41–50, 51–60, >60 years), motor score (1–4), pupillary reactivity (none, one), unequal limb movement, and presence of hypotension. The discriminative ability (area under the receiver operating characteristic curve (95% CI)) of the score chart for in-hospital mortality and 6 months outcome was excellent in the development data set (0.890 (0.867 to 0.912) and 0.894 (0.869 to 0.918), respectively), internal validation data set using bootstrap resampling method (0.889 (0.867 to 0.909) and 0.893 (0.867 to 0.915), respectively) and external validation data set (0.871 (0.825 to 0.916) and 0.887 (0.842 to 0.932), respectively). 
Calibration showed good agreement between observed outcome rates and predicted risks in the development and external validation data sets (p>0.05). Conclusion: For clinical decision making, these score charts can be used to predict outcomes in new patients with severe HI in India and similar settings.

A novel ferroptosis related gene signature is associated with prognosis in patients with ovarian serous cystadenocarcinoma

Scientific Reports ◽

10.1038/s41598-021-90126-5 ◽

2021 ◽

Vol 11 (1) ◽

Author(s):

Zhixiang Yu ◽

Haiyan He ◽

Yanan Chen ◽

Qiuhe Ji ◽

Min Sun

Keyword(s):

Cox Regression ◽

Expression Profiles ◽

Critical Role ◽

Gene Signature ◽

Cancer Genome ◽

Training Data ◽

Hub Genes ◽

Validation Data ◽

Data Set ◽

Cox Regression Analysis

Abstract: Ovarian cancer (OV) is a common type of carcinoma in females. Many studies have reported that ferroptosis is associated with the prognosis of OV patients. However, the mechanism by which this occurs is not well understood. We utilized Genotype-Tissue Expression (GTEx) and The Cancer Genome Atlas (TCGA) to identify ferroptosis-related genes in OV. In the present study, we applied Cox regression analysis to select hub genes and used the least absolute shrinkage and selection operator to construct a prognosis prediction model with mRNA expression profiles and clinical data from TCGA. A series of analyses for this signature was performed in TCGA. We then verified the identified signature using International Cancer Genome Consortium (ICGC) data. After a series of analyses, we identified six hub genes (DNAJB6, RB1, VIMP/SELENOS, STEAP3, BACH1, and ALOX12) that were then used to construct a model using a training data set. The model was then tested using a validation data set and was found to have high sensitivity and specificity. The identified ferroptosis-related hub genes might play a critical role in the mechanism of OV development. The gene signature we identified may be useful for future clinical applications.

Potential Energy Implications of Connected and Automated Vehicles: Exploring Key Leverage Points through Scenario Screening and Analysis

Transportation Research Record Journal of the Transportation Research Board ◽

10.1177/0361198119838840 ◽

2019 ◽

Vol 2673 (5) ◽

pp. 84-94 ◽

Cited By ~ 1

Author(s):

Brian Bush ◽

Laura Vimmerstedt ◽

Jeff Gonder

Keyword(s):

Systems Engineering ◽

Large Scale ◽

Operating Cost ◽

Dynamics Model ◽

Transportation Service ◽

Service Outcomes ◽

Behavioral Parameters ◽

Computationally Intensive ◽

Different Levels ◽

Selection Of

Connected and automated vehicle (CAV) technologies could transform the transportation system over the coming decades, but face vehicle and systems engineering challenges, as well as technological, economic, demographic, and regulatory issues. The authors have developed a system dynamics model for generating, analyzing, and screening self-consistent CAV adoption scenarios. Results can support selection of scenarios for subsequent computationally intensive study using higher-resolution models. The potential for and barriers to large-scale adoption of CAVs have been analyzed using preliminary quantitative data and qualitative understandings of system relationships among stakeholders across the breadth of these issues. Although they are based on preliminary data, the results map possibilities for achieving different levels of CAV adoption and system-wide fuel use and demonstrate the interplay of behavioral parameters such as how consumers value their time versus financial parameters such as operating cost. By identifying the range of possibilities, estimating the associated energy and transportation service outcomes, and facilitating screening of scenarios for more detailed analysis, this work could inform transportation planners, researchers, and regulators.

Evaluation of the CONSUME and FOFEM fuel consumption models in pine and mixed hardwood forests of the eastern United States

Canadian Journal of Forest Research ◽

10.1139/cjfr-2013-0499 ◽

2014 ◽

Vol 44 (7) ◽

pp. 784-795 ◽

Cited By ~ 10

Author(s):

Susan J. Prichard ◽

Eva C. Karau ◽

Roger D. Ottmar ◽

Maureen C. Kennedy ◽

James B. Cronan ◽

...

Keyword(s):

United States ◽

Fuel Consumption ◽

Prescribed Burning ◽

Forest Type ◽

Fire Effects ◽

Eastern United States ◽

Validation Data ◽

Data Set ◽

Predicted Values ◽

Fine Fuels

Reliable predictions of fuel consumption are critical in the eastern United States (US), where prescribed burning is frequently applied to forests and air quality is of increasing concern. CONSUME and the First Order Fire Effects Model (FOFEM), predictive models developed to estimate fuel consumption and emissions from wildland fires, have not been systematically evaluated for application in the eastern US using the same validation data set. In this study, we compiled a fuel consumption data set from 54 operational prescribed fires (43 pine and 11 mixed hardwood sites) to assess each model’s uncertainties and application limits. Regions of indifference between measured and predicted values by fuel category and forest type represent the potential error that modelers could incur in estimating fuel consumption by category. Overall, FOFEM predictions have narrower regions of indifference than CONSUME and suggest better correspondence between measured and predicted consumption. However, both models offer reliable predictions of live fuel (shrubs and herbaceous vegetation) and 1 h fine fuels. Results suggest that CONSUME and FOFEM can be improved in their predictive capability for woody fuel, litter, and duff consumption for eastern US forests. Because of their high biomass and potential smoke management problems, refining estimates of litter and duff consumption is of particular importance.

127 Predicting Dry Matter Intake of Gestating and Lactating Beef Cows

Journal of Animal Science ◽

10.1093/jas/skz397.132 ◽

2020 ◽

Vol 98 (Supplement_2) ◽

pp. 58-58

Author(s):

Megan A Gross ◽

Claire Andresen ◽

Amanda Holder ◽

Alexi Moehlenpah ◽

Carla Goad ◽

...

Keyword(s):

Beef Cattle ◽

Milk Yield ◽

Feed Intake ◽

Beef Cows ◽

Multiple Regression Equation ◽

Validation Data ◽

Data Set ◽

Lactating Cows ◽

Downward Bias ◽

Protein Supply

Abstract: In 1996, the NASEM beef cattle committee developed and published an equation to estimate cow feed intake using results from studies conducted or published between 1979 and 1993 (Nutrient Requirements of Beef Cattle). The same equation was recommended for use in the most recent version of this publication (2016). The equation is sensitive to cow weight, diet digestibility and milk yield. Our objective was to validate the accuracy of this equation using more recent published and unpublished data. Criteria for inclusion in the validation data set included projects conducted or published within the last ten years, direct measurement of forage intake, adequate protein supply, and pen feeding (no tie stall or metabolism crate data). The validation data set included 29 treatment means for gestating cows and 26 treatment means for lactating cows. Means for the gestating cow data set were 11.4 ± 1.9 kg DMI, 599 ± 77 kg BW, and 1.24 ± 0.14 Mcal NEm per kg of feed; means for the lactating cow data set were 14.5 ± 2.0 kg DMI, 532 ± 116.3 kg BW, and 1.26 ± 0.24 Mcal NEm per kg of feed. No-intercept models were used to determine equation accuracy in predicting validation data set DMI. The slope for linear bias in the NASEM gestation equation did not differ from 1 (P = 0.07) with a 3.5% positive bias. However, when the NASEM equation was used to predict DMI in lactating cows, the slope for linear bias significantly differed from 1 (P < 0.001) with a downward bias of 13.7%. Therefore, a new multiple regression equation was developed from the validation data set: DMI = -4.336 + 0.086427 × BW^0.75 + 0.3 × milk yield + 6.005785 × NEm (R-squared = 0.84). The NASEM equation for gestating beef cows was reasonably accurate while the lactation equation underestimated feed intake.
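
As a worked illustration, the new regression equation from this abstract could be evaluated as in the sketch below. The parentheses in the published equation are unbalanced, so the additive reading used here is an assumption, as are the function name `predict_dmi` and the units (BW in kg, NEm in Mcal per kg of feed, DMI in kg). The abstract does not state the unit for milk yield.

```python
def predict_dmi(bw_kg, milk_yield, nem_mcal_per_kg):
    """Dry matter intake from the abstract's regression equation.

    Assumed reading of the garbled source equation:
    DMI = -4.336 + 0.086427 * BW^0.75 + 0.3 * milk yield + 6.005785 * NEm
    """
    metabolic_bw = bw_kg ** 0.75  # metabolic body weight term
    return (
        -4.336
        + 0.086427 * metabolic_bw
        + 0.3 * milk_yield
        + 6.005785 * nem_mcal_per_kg
    )
```

Under this reading, each additional unit of milk yield raises predicted intake by 0.3 kg, and intake scales with metabolic body weight (BW^0.75) rather than with body weight itself.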
