scholarly journals Reanalysis of global proteomic and phosphoproteomic data identified a large number of glycopeptides

2017 ◽  
Author(s):  
Yingwei Hu ◽  
Punit Shah ◽  
David J. Clark ◽  
Minghui Ao ◽  
Hui Zhang

ABSTRACTProtein glycosylation plays fundamental roles in many cellular processes, and previous reports have shown dysregulation to be associated with several human diseases, including diabetes, cancer, and neurodegenerative disorders. Despite the vital role of glycosylation for proper protein function, the analysis of glycoproteins has been lagged behind to other protein modifications. In this study, we describe the re-analysis of global proteomic data from breast cancer xenograft tissues using recently developed software package GPQuest 2.0, revealing a large number of previously unidentifiedN-linked glycopeptides. More importantly, we found that using immobilized metal affinity chromatography (IMAC) technology for the enrichment of phosphopeptides had co-enriched a substantial number of sialoglycopeptides, allowing for a large-scale analysis of sialoglycopeptides in conjunction with the analysis of phosphopeptides. Collectively, combined MS/MS analyses of global proteomic and phosphoproteomic datasets resulted in the identification of 6,724 N-linked glycopeptides from 617 glycoproteins derived from two breast cancer xenograft tissues. Next, we utilized GPQuest for the re-analysis of global and phosphoproteomic data generated from 108 human breast cancer tissues that were previously analyzed by Clinical Proteomic Analysis Consortium (CPTAC). Reanalysis of the CPTAC dataset resulted in the identification of 2,683 glycopeptides from the global proteomic data set and 4,554 glycopeptides from phosphoproteomic data set, respectively. Together, 11,292 N-linked glycopeptides corresponding to 1,731 N-linked glycosites from 883 human glycoproteins were identified from the two data sets. This analysis revealed an extensive number of glycopeptides hidden in the global and enriched in IMAC-based phosphopeptide-enriched proteomic data, information which would have remained unknown from the original study otherwise. The reanalysis described herein can be readily applied to identify glycopeptides from already existing data sets, providing insight into many important facets of protein glycosylation in different biological, physiological, and pathological processes.

Author(s):  
Lior Shamir

Abstract Several recent observations using large data sets of galaxies showed non-random distribution of the spin directions of spiral galaxies, even when the galaxies are too far from each other to have gravitational interaction. Here, a data set of $\sim8.7\cdot10^3$ spiral galaxies imaged by Hubble Space Telescope (HST) is used to test and profile a possible asymmetry between galaxy spin directions. The asymmetry between galaxies with opposite spin directions is compared to the asymmetry of galaxies from the Sloan Digital Sky Survey. The two data sets contain different galaxies at different redshift ranges, and each data set was annotated using a different annotation method. The results show that both data sets show a similar asymmetry in the COSMOS field, which is covered by both telescopes. Fitting the asymmetry of the galaxies to cosine dependence shows a dipole axis with probabilities of $\sim2.8\sigma$ and $\sim7.38\sigma$ in HST and SDSS, respectively. The most likely dipole axis identified in the HST galaxies is at $(\alpha=78^{\rm o},\delta=47^{\rm o})$ and is well within the $1\sigma$ error range compared to the location of the most likely dipole axis in the SDSS galaxies with $z>0.15$ , identified at $(\alpha=71^{\rm o},\delta=61^{\rm o})$ .


2015 ◽  
Vol 8 (1) ◽  
pp. 421-434 ◽  
Author(s):  
M. P. Jensen ◽  
T. Toto ◽  
D. Troyan ◽  
P. E. Ciesielski ◽  
D. Holdridge ◽  
...  

Abstract. The Midlatitude Continental Convective Clouds Experiment (MC3E) took place during the spring of 2011 centered in north-central Oklahoma, USA. The main goal of this field campaign was to capture the dynamical and microphysical characteristics of precipitating convective systems in the US Central Plains. A major component of the campaign was a six-site radiosonde array designed to capture the large-scale variability of the atmospheric state with the intent of deriving model forcing data sets. Over the course of the 46-day MC3E campaign, a total of 1362 radiosondes were launched from the enhanced sonde network. This manuscript provides details on the instrumentation used as part of the sounding array, the data processing activities including quality checks and humidity bias corrections and an analysis of the impacts of bias correction and algorithm assumptions on the determination of convective levels and indices. It is found that corrections for known radiosonde humidity biases and assumptions regarding the characteristics of the surface convective parcel result in significant differences in the derived values of convective levels and indices in many soundings. In addition, the impact of including the humidity corrections and quality controls on the thermodynamic profiles that are used in the derivation of a large-scale model forcing data set are investigated. The results show a significant impact on the derived large-scale vertical velocity field illustrating the importance of addressing these humidity biases.


2020 ◽  
Vol 223 (2) ◽  
pp. 1378-1397
Author(s):  
Rosemary A Renaut ◽  
Jarom D Hogue ◽  
Saeed Vatankhah ◽  
Shuang Liu

SUMMARY We discuss the focusing inversion of potential field data for the recovery of sparse subsurface structures from surface measurement data on a uniform grid. For the uniform grid, the model sensitivity matrices have a block Toeplitz Toeplitz block structure for each block of columns related to a fixed depth layer of the subsurface. Then, all forward operations with the sensitivity matrix, or its transpose, are performed using the 2-D fast Fourier transform. Simulations are provided to show that the implementation of the focusing inversion algorithm using the fast Fourier transform is efficient, and that the algorithm can be realized on standard desktop computers with sufficient memory for storage of volumes up to size n ≈ 106. The linear systems of equations arising in the focusing inversion algorithm are solved using either Golub–Kahan bidiagonalization or randomized singular value decomposition algorithms. These two algorithms are contrasted for their efficiency when used to solve large-scale problems with respect to the sizes of the projected subspaces adopted for the solutions of the linear systems. The results confirm earlier studies that the randomized algorithms are to be preferred for the inversion of gravity data, and for data sets of size m it is sufficient to use projected spaces of size approximately m/8. For the inversion of magnetic data sets, we show that it is more efficient to use the Golub–Kahan bidiagonalization, and that it is again sufficient to use projected spaces of size approximately m/8. Simulations support the presented conclusions and are verified for the inversion of a magnetic data set obtained over the Wuskwatim Lake region in Manitoba, Canada.


2009 ◽  
Vol 2 (1) ◽  
pp. 87-98 ◽  
Author(s):  
C. Lerot ◽  
M. Van Roozendael ◽  
J. van Geffen ◽  
J. van Gent ◽  
C. Fayt ◽  
...  

Abstract. Total O3 columns have been retrieved from six years of SCIAMACHY nadir UV radiance measurements using SDOAS, an adaptation of the GDOAS algorithm previously developed at BIRA-IASB for the GOME instrument. GDOAS and SDOAS have been implemented by the German Aerospace Center (DLR) in the version 4 of the GOME Data Processor (GDP) and in version 3 of the SCIAMACHY Ground Processor (SGP), respectively. The processors are being run at the DLR processing centre on behalf of the European Space Agency (ESA). We first focus on the description of the SDOAS algorithm with particular attention to the impact of uncertainties on the reference O3 absorption cross-sections. Second, the resulting SCIAMACHY total ozone data set is globally evaluated through large-scale comparisons with results from GOME and OMI as well as with ground-based correlative measurements. The various total ozone data sets are found to agree within 2% on average. However, a negative trend of 0.2–0.4%/year has been identified in the SCIAMACHY O3 columns; this probably originates from instrumental degradation effects that have not yet been fully characterized.


2017 ◽  
Vol 14 (4) ◽  
pp. 172988141770907 ◽  
Author(s):  
Hanbo Wu ◽  
Xin Ma ◽  
Zhimeng Zhang ◽  
Haibo Wang ◽  
Yibin Li

Human daily activity recognition has been a hot spot in the field of computer vision for many decades. Despite best efforts, activity recognition in naturally uncontrolled settings remains a challenging problem. Recently, by being able to perceive depth and visual cues simultaneously, RGB-D cameras greatly boost the performance of activity recognition. However, due to some practical difficulties, the publicly available RGB-D data sets are not sufficiently large for benchmarking when considering the diversity of their activities, subjects, and background. This severely affects the applicability of complicated learning-based recognition approaches. To address the issue, this article provides a large-scale RGB-D activity data set by merging five public RGB-D data sets that differ from each other on many aspects such as length of actions, nationality of subjects, or camera angles. This data set comprises 4528 samples depicting 7 action categories (up to 46 subcategories) performed by 74 subjects. To verify the challengeness of the data set, three feature representation methods are evaluated, which are depth motion maps, spatiotemporal depth cuboid similarity feature, and curvature space scale. Results show that the merged large-scale data set is more realistic and challenging and therefore more suitable for benchmarking.


2017 ◽  
Vol 44 (2) ◽  
pp. 203-229 ◽  
Author(s):  
Javier D Fernández ◽  
Miguel A Martínez-Prieto ◽  
Pablo de la Fuente Redondo ◽  
Claudio Gutiérrez

The publication of semantic web data, commonly represented in Resource Description Framework (RDF), has experienced outstanding growth over the last few years. Data from all fields of knowledge are shared publicly and interconnected in active initiatives such as Linked Open Data. However, despite the increasing availability of applications managing large-scale RDF information such as RDF stores and reasoning tools, little attention has been given to the structural features emerging in real-world RDF data. Our work addresses this issue by proposing specific metrics to characterise RDF data. We specifically focus on revealing the redundancy of each data set, as well as common structural patterns. We evaluate the proposed metrics on several data sets, which cover a wide range of designs and models. Our findings provide a basis for more efficient RDF data structures, indexes and compressors.


ESMO Open ◽  
2020 ◽  
Vol 5 (5) ◽  
pp. e000743 ◽  
Author(s):  
Shani Paluch-Shimon ◽  
Nathan I Cherny ◽  
Elisabeth G E de Vries ◽  
Urania Dafni ◽  
Martine J Piccart ◽  
...  

Click here to listen to the PodcastBackgroundThe European Society for Medical Oncology-Magnitude of Clinical Benefit Scale (ESMO-MCBS) is a validated value scale for solid tumour anticancer treatments. Form 1 of the ESMO-MCBS, used to grade therapies with curative intent including adjuvant therapies, has only been evaluated for a limited number of studies. This is the first large-scale field testing in early breast cancer to assess the applicability of the scale to this data set and the reasonableness of derived scores and to identify any shortcomings to be addressed in future modifications of the scale.MethodRepresentative key studies and meta-analyses of the major modalities of adjuvant systemic therapy of breast cancer were identified for each of the major clinical scenarios (HER2-positive, HER2-negative, endocrine-responsive) and were graded with form 1 of the ESMO-MCBS. These generated scores were reviewed by a panel of experts for reasonableness. Shortcomings and issues related to the application of the scale and interpretation of results were identified and critically evaluated.ResultsSixty-five studies were eligible for evaluation: 59 individual studies and 6 meta-analyses. These studies incorporated 101 therapeutic comparisons, 61 of which were scorable. Review of the generated scores indicated that, with few exceptions, they generally reflected contemporary standards of practice. Six shortcomings were identified related to grading based on disease-free survival (DFS), lack of information regarding acute and long-term toxicity and an inability to grade single-arm de-escalation scales.ConclusionsForm 1 of the ESMO-MCBS is a robust tool for the evaluation of the magnitude of benefit studies in early breast cancer. The scale can be further improved by addressing issues related to grading based on DFS, annotating grades with information regarding acute and long-term toxicity and developing an approach to grade single-arm de-escalation studies.


1983 ◽  
Vol 66 ◽  
pp. 411-425
Author(s):  
Frank Hill ◽  
Juri Toomre ◽  
Laurence J. November

AbstractTwo-dimensional power spectra of solar five-minute oscillations display prominent ridge structures in (k, ω) space, where k is the horizontal wavenumber and ω is the temporal frequency. The positions of these ridges in k and ω can be used to probe temperature and velocity structures in the subphotosphere. We have been carrying out a continuing program of observations of five-minute oscillations with the diode array instrument on the vacuum tower telescope at Sacramento Peak Observatory (SPO). We have sought to establish whether power spectra taken on separate days show shifts in ridge locations; these may arise from different velocity and temperature patterns having been brought into our sampling region by solar rotation. Power spectra have been obtained for six days of observations of Doppler velocities using the Mg I λ5173 and Fe I λ5434 spectral lines. Each data set covers 8 to 11 hr in time and samples a region 256″ × 1024″ in spatial extent, with a spatial resolution of 2″ and temporal sampling of 65 s. We have detected shifts in ridge locations between certain data sets which are statistically significant. The character of these displacements when analyzed in terms of eastward and westward propagating waves implies that changes have occurred in both temperature and horizontal velocity fields underlying our observing window. We estimate the magnitude of the velocity changes to be on the order of 100 m s -1; we may be detecting the effects of large-scale convection akin to giant cells.


2012 ◽  
Vol 3 ◽  
pp. 747-758 ◽  
Author(s):  
Blake W Erickson ◽  
Séverine Coquoz ◽  
Jonathan D Adams ◽  
Daniel J Burns ◽  
Georg E Fantner

Modern high-speed atomic force microscopes generate significant quantities of data in a short amount of time. Each image in the sequence has to be processed quickly and accurately in order to obtain a true representation of the sample and its changes over time. This paper presents an automated, adaptive algorithm for the required processing of AFM images. The algorithm adaptively corrects for both common one-dimensional distortions as well as the most common two-dimensional distortions. This method uses an iterative thresholded processing algorithm for rapid and accurate separation of background and surface topography. This separation prevents artificial bias from topographic features and ensures the best possible coherence between the different images in a sequence. This method is equally applicable to all channels of AFM data, and can process images in seconds.


2019 ◽  
Vol 34 (9) ◽  
pp. 1369-1383 ◽  
Author(s):  
Dirk Diederen ◽  
Ye Liu

Abstract With the ongoing development of distributed hydrological models, flood risk analysis calls for synthetic, gridded precipitation data sets. The availability of large, coherent, gridded re-analysis data sets in combination with the increase in computational power, accommodates the development of new methodology to generate such synthetic data. We tracked moving precipitation fields and classified them using self-organising maps. For each class, we fitted a multivariate mixture model and generated a large set of synthetic, coherent descriptors, which we used to reconstruct moving synthetic precipitation fields. We introduced randomness in the original data set by replacing the observed precipitation fields in the original data set with the synthetic precipitation fields. The output is a continuous, gridded, hourly precipitation data set of a much longer duration, containing physically plausible and spatio-temporally coherent precipitation events. The proposed methodology implicitly provides an important improvement in the spatial coherence of precipitation extremes. We investigate the issue of unrealistic, sudden changes on the grid and demonstrate how a dynamic spatio-temporal generator can provide spatial smoothness in the probability distribution parameters and hence in the return level estimates.


Sign in / Sign up

Export Citation Format

Share Document