Foreground modelling via Gaussian process regression: an application to HERA data

Abhik Ghosh; Florent Mertens; Gianni Bernardi; Mário G Santos; Nicholas S Kern; Christopher L Carilli; Trienko L Grobler; Léon V E Koopmans; Daniel C Jacobs; Adrian Liu; Aaron R Parsons; Miguel F Morales; James E Aguirre; Joshua S Dillon; Bryna J Hazelton; Oleg M Smirnov; Bharat K Gehlot; Siyanda Matika; Paul Alexander; Zaki S Ali; Adam P Beardsley; Roshan K Benefo; Tashalee S Billings; Judd D Bowman; Richard F Bradley; Carina Cheng; Paul M Chichura; David R DeBoer; Eloy de Lera Acedo; Aaron Ewall-Wice; Gcobisa Fadana; Nicolas Fagnoni; Austin F Fortino; Randall Fritz; Steve R Furlanetto; Samavarti Gallardo; Brian Glendenning; Deepthi Gorthi; Bradley Greig; Jasper Grobbelaar; Jack Hickish; Alec Josaitis; Austin Julius; Amy S Igarashi; MacCalvin Kariseb; Saul A Kohn; Matthew Kolopanis; Telalo Lekalake; Anita Loots; David MacMahon; Lourence Malan; Cresshim Malgas; Matthys Maree; Zachary E Martinot; Nathan Mathison; Eunice Matsetela; Andrei Mesinger; Abraham R Neben; Bojan Nikolic; Chuneeta D Nunhokee; Nipanjana Patra; Samantha Pieterse; Nima Razavi-Ghods; Jon Ringuette; James Robnett; Kathryn Rosie; Raddwine Sell; Craig Smith; Angelo Syce; Max Tegmark; Nithyanandan Thyagarajan; Peter K G Williams; Haoxuan Zheng

doi:10.1093/mnras/staa1331

Foreground modelling via Gaussian process regression: an application to HERA data

Monthly Notices of the Royal Astronomical Society ◽

10.1093/mnras/staa1331 ◽

2020 ◽

Vol 495 (3) ◽

pp. 2813-2826

Author(s):

Abhik Ghosh ◽

Florent Mertens ◽

Gianni Bernardi ◽

Mário G Santos ◽

Nicholas S Kern ◽

...

Keyword(s):

Power Spectrum ◽

Gaussian Process ◽

Spectral Properties ◽

Real Life ◽

Gaussian Process Regression ◽

Line Of Sight ◽

Periodic Signal ◽

Variance Model ◽

Systematic Effects ◽

Residual Power

ABSTRACT The key challenge in the observation of the redshifted 21-cm signal from cosmic reionization is its separation from the much brighter foreground emission. Such separation relies on the different spectral properties of the two components, although, in real life, the foreground intrinsic spectrum is often corrupted by the instrumental response, inducing systematic effects that can further jeopardize the measurement of the 21-cm signal. In this paper, we use Gaussian Process Regression to model both foreground emission and instrumental systematics in ∼2 h of data from the Hydrogen Epoch of Reionization Array. We find that a simple covariance model with three components matches the data well, giving a residual power spectrum with white noise properties. The components are an ‘intrinsic’ and an instrumentally corrupted component with coherence scales of 20 and 2.4 MHz, respectively (dominating the line-of-sight power spectrum over scales k∥ ≤ 0.2 h cMpc−1), and a baseline-dependent periodic signal with a period of ∼1 MHz (dominating over k∥ ∼ 0.4–0.8 h cMpc−1), which should be distinguishable from the 21-cm Epoch of Reionization signal, whose typical coherence scale is ∼0.8 MHz.
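
The three-component covariance model described in this abstract can be illustrated with a minimal sketch. The abstract gives only the coherence scales (20 and 2.4 MHz) and the period (∼1 MHz); the kernel families (squared-exponential and exp-sine-squared), the variances, the frequency band, the toy data, and every function name below are assumptions chosen for illustration, not the authors' implementation.

```python
import numpy as np

def rbf(freqs, variance, length):
    """Squared-exponential covariance; `length` is the coherence scale in MHz."""
    dnu = freqs[:, None] - freqs[None, :]
    return variance * np.exp(-0.5 * (dnu / length) ** 2)

def periodic(freqs, variance, period, smoothness=1.0):
    """Exp-sine-squared covariance with the given period in MHz."""
    dnu = freqs[:, None] - freqs[None, :]
    return variance * np.exp(-2.0 * np.sin(np.pi * dnu / period) ** 2 / smoothness ** 2)

def foreground_covariance(freqs):
    """Sum of three components; the scales come from the abstract, the variances are placeholders."""
    intrinsic = rbf(freqs, variance=1e4, length=20.0)     # 'intrinsic' foregrounds
    corrupted = rbf(freqs, variance=1e2, length=2.4)      # instrumentally corrupted part
    ripple = periodic(freqs, variance=1.0, period=1.0)    # ~1 MHz periodic signal
    return intrinsic + corrupted + ripple

def gpr_residual(data, freqs, noise_variance):
    """Subtract the GP posterior mean of the foreground model from the data."""
    k_fg = foreground_covariance(freqs)
    k_total = k_fg + noise_variance * np.eye(freqs.size)
    fg_mean = k_fg @ np.linalg.solve(k_total, data)
    return data - fg_mean

# Toy spectrum (hypothetical): smooth power law + 1 MHz ripple + white noise.
freqs = np.linspace(150.0, 160.0, 101)  # MHz, assumed band
rng = np.random.default_rng(0)
noise_sigma = 0.1
data = (100.0 * (freqs / 150.0) ** -2.5
        + 0.5 * np.sin(2.0 * np.pi * freqs / 1.0)
        + noise_sigma * rng.standard_normal(freqs.size))

residual = gpr_residual(data, freqs, noise_variance=noise_sigma ** 2)
print("data std:", np.std(data), "residual std:", np.std(residual))
```

In this toy setup the hyperparameters are fixed by hand; in practice they would be fitted to the data, for example by maximizing the marginal likelihood.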

Understanding the impact of Light cone effect on the EoR/CD 21-cm power spectrum

Proceedings of the International Astronomical Union ◽

10.1017/s1743921318000637 ◽

2017 ◽

Vol 12 (S333) ◽

pp. 12-17

Author(s):

Kanan K. Datta ◽

Rajesh Mondal ◽

Raghunath Ghara ◽

Somnath Bharadwaj ◽

T. Roy Choudhury

Keyword(s):

Power Spectrum ◽

Order Statistics ◽

Light Cone ◽

Second Order ◽

Line Of Sight ◽

Periodic Signal ◽

Angular Power Spectrum ◽

Second Order Statistics ◽

The Impact

Abstract The redshifted HI 21-cm signal from the cosmic dawn (CD) and the epoch of reionization (EoR) evolves considerably along the line of sight (LoS). We study the impact of this evolution (the so-called light cone, or LC, effect) on the HI 21-cm power spectrum. We find that the LC effect has a significant impact on the 3D power spectrum and that the change could be up to a factor of a few. The LC effect is particularly strong during the cosmic dawn near the ‘peaks’ and ‘dips’ in the power spectrum when plotted against redshift. We also show that the 3D power spectrum, which could fully describe an ergodic and periodic signal, loses some information regarding the second-order statistics of the signal, because the EoR/CD 21-cm signal is non-ergodic and non-periodic along the line of sight. We show that the multi-frequency angular power spectrum (MAPS) ${\mathcal {C}}_{\ell }(\nu _1, \nu _2)$ captures all the information regarding the second-order statistics of the signal even in the presence of the LC effect.

Gaussian process foreground subtraction and power spectrum estimation for 21 cm cosmology

Monthly Notices of the Royal Astronomical Society ◽

10.1093/mnras/staa3736 ◽

2020 ◽

Vol 501 (1) ◽

pp. 1463-1480

Author(s):

Nicholas S Kern ◽

Adrian Liu

Keyword(s):

Power Spectrum ◽

Gaussian Process ◽

Signal To Noise Ratio ◽

Low Frequency ◽

Gaussian Process Regression ◽

Spectrum Estimation ◽

Signal Loss ◽

Intensity Mapping ◽

Quadratic Estimator ◽

Window Functions

ABSTRACT One of the primary challenges in enabling the scientific potential of 21 cm intensity mapping at the epoch of reionization (EoR) is the separation of astrophysical foreground contamination. Recent works have claimed that Gaussian process regression (GPR) can robustly perform this separation, particularly at low Fourier k wavenumbers where the EoR signal reaches its peak signal-to-noise ratio. We revisit this topic by casting GPR foreground subtraction (GPR-FS) into the quadratic estimator formalism, thereby putting its statistical properties on stronger theoretical footing. We find that GPR-FS can distort the window functions at these low k modes, which, without proper decorrelation, makes it difficult to probe the EoR power spectrum. Incidentally, we also show that GPR-FS is in fact closely related to the widely studied inverse covariance weighting of the optimal quadratic estimator. As a case study, we look at recent power spectrum upper limits from the Low-Frequency Array (LOFAR) that utilized GPR-FS. We pay close attention to their normalization scheme, showing that it is particularly sensitive to signal loss when the EoR covariance is misestimated. This has possible ramifications for recent astrophysical interpretations of the LOFAR limits, because many of the EoR models ruled out do not fall within the bounds of the covariance models explored by LOFAR. Because the quadratic estimator is more robust to this bias, we conclude that it is a more natural framework for implementing GPR-FS and computing the 21 cm power spectrum.
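
The relation between GPR-FS and inverse covariance weighting mentioned in this abstract can be made concrete with a short worked identity. The notation is an assumption introduced here, not taken from the paper: $\mathbf{d}$ is the data vector along frequency, $\mathbf{K}_{\rm fg}$, $\mathbf{K}_{21}$ and $\mathbf{K}_{\rm n}$ are the model covariances of the foregrounds, the 21 cm signal and the noise, and $\mathbf{r}$ is the residual after subtracting the GP posterior mean of the foregrounds.

```latex
\begin{align}
\mathbf{C} &= \mathbf{K}_{\rm fg} + \mathbf{K}_{21} + \mathbf{K}_{\rm n}, \\
\mathbf{r} &= \mathbf{d} - \mathbf{K}_{\rm fg}\,\mathbf{C}^{-1}\mathbf{d}
            = \left(\mathbf{C} - \mathbf{K}_{\rm fg}\right)\mathbf{C}^{-1}\mathbf{d}
            = \left(\mathbf{K}_{21} + \mathbf{K}_{\rm n}\right)\mathbf{C}^{-1}\mathbf{d}.
\end{align}
```

Under these assumptions the residual is the inverse-covariance-weighted data $\mathbf{C}^{-1}\mathbf{d}$ multiplied by the non-foreground covariance, which also suggests why a misestimated $\mathbf{K}_{21}$ could affect the recovered signal.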

Robust Foregrounds Removal for 21-cm Experiments

Proceedings of the International Astronomical Union ◽

10.1017/s1743921318000546 ◽

2017 ◽

Vol 12 (S333) ◽

pp. 284-287

Author(s):

F. Mertens ◽

A. Ghosh ◽

L. V. E. Koopmans

Keyword(s):

Maximum Likelihood ◽

Power Spectrum ◽

Gaussian Process ◽

Structure Formation ◽

Early Universe ◽

Direct Detection ◽

Gaussian Process Regression ◽

New Method ◽

Imaging Method ◽

Order Of Magnitude

Abstract Direct detection of the Epoch of Reionization via the redshifted 21-cm line will have unprecedented implications for the study of structure formation in the early Universe. To fulfill this promise, current and future 21-cm experiments will need to detect the weak 21-cm signal beneath foregrounds several orders of magnitude brighter. This requires accurate modeling of the galactic and extragalactic emission and of its contaminants due to instrument chromaticity, the ionosphere and imperfect calibration. To address this complex modeling problem, we propose a new method based on Gaussian Process Regression (GPR) which is able to cleanly separate the cosmological signal from most of the foreground contaminants. We also propose a new imaging method based on a maximum likelihood framework which solves the interferometric equation directly on the sphere. Using this method, chromatic effects causing the so-called “wedge” are effectively eliminated (i.e. deconvolved) in the cylindrical (k⊥, k∥) power spectrum.

A High-Resolution Foreground Model for the MWA EoR1 Field: Model and Implications for EoR Power Spectrum Analysis

Publications of the Astronomical Society of Australia ◽

10.1017/pasa.2017.26 ◽

2017 ◽

Vol 34 ◽

Cited By ~ 15

Author(s):

P. Procopio ◽

R. B. Wayth ◽

J. Line ◽

C. M. Trott ◽

H. T. Intema ◽

...

Keyword(s):

Power Spectrum ◽

Large Scale ◽

Field Model ◽

Single Point ◽

Neutral Hydrogen ◽

Point Sources ◽

Systematic Effects ◽

Calibration Accuracy ◽

Compact Sources ◽

Residual Power

Abstract The current generation of experiments aiming to detect the neutral hydrogen signal from the Epoch of Reionisation (EoR) is likely to be limited by systematic effects associated with removing foreground sources from target fields. In this paper, we develop a model for the compact foreground sources in one of the target fields of the MWA’s EoR key science experiment: the ‘EoR1’ field. The model is based on both the MWA’s GLEAM survey and GMRT 150 MHz data from the TGSS survey, the latter providing higher angular resolution and better astrometric accuracy for compact sources than is available from the MWA alone. The model contains 5 049 sources, some of which have complicated morphology in MWA data, Fornax A being the most complex. The higher resolution data show that 13% of sources that appear point-like to the MWA have complicated morphology such as double and quad structure, with a typical separation of 33 arcsec. We derive an analytic expression for the error introduced into the EoR two-dimensional power spectrum due to peeling close double sources as single point sources and show that for the measured source properties, the error in the power spectrum is confined to high k⊥ modes that do not affect the overall result for the large-scale cosmological signal of interest. The brightest 10 mis-modelled sources in the field contribute 90% of the power bias in the data, suggesting that it is most critical to improve the models of the brightest sources. With this hybrid model, we reprocess data from the EoR1 field and show a maximum of 8% improved calibration accuracy and a factor of two reduction in residual power in k-space from peeling these sources. Implications for future EoR experiments including the SKA are discussed in relation to the improvements obtained.

Gaussian Process Regression for foreground removal in HI intensity mapping experiments

Monthly Notices of the Royal Astronomical Society ◽

10.1093/mnras/stab2594 ◽

2021 ◽

Author(s):

Paula S Soares ◽

Catherine A Watkinson ◽

Steven Cunnington ◽

Alkistis Pourtsidou

Keyword(s):

Power Spectrum ◽

Gaussian Process ◽

Low Frequency ◽

Gaussian Process Regression ◽

Principal Component ◽

Frequency Range ◽

Intensity Mapping ◽

Removal Technique ◽

First Time ◽

Foreground Removal

Abstract We apply for the first time Gaussian Process Regression (GPR) as a foreground removal technique in the context of single-dish, low redshift H i intensity mapping, and present an open-source python toolkit for doing so. We use MeerKAT and SKA1-MID-like simulations of 21cm foregrounds (including polarisation leakage), H i cosmological signal and instrumental noise. We find that it is possible to use GPR as a foreground removal technique in this context, and that it is better suited in some cases to recover the H i power spectrum than Principal Component Analysis (PCA), especially on small scales. GPR is especially good at recovering the radial power spectrum, outperforming PCA when considering the full bandwidth of our data. Both methods are worse at recovering the transverse power spectrum, since they rely on frequency-only covariance information. When halving our data along frequency, we find that GPR performs better in the low frequency range, where foregrounds are brighter. It performs worse than PCA when frequency channels are missing, to emulate RFI flagging. We conclude that GPR is an excellent foreground removal option for the case of single-dish, low redshift H i intensity mapping in the absence of missing frequency channels. Our python toolkit gpr4im and the data used in this analysis are publicly available on GitHub.

Using Gaussian Process Regression to Integrate the Transition Structure Factor Curve for the Many-Body Correlation Energy

10.26226/morressier.5fa409874d4e91fe5c54b97a ◽

2020 ◽

Author(s):

Laura Weiler

Keyword(s):

Gaussian Process ◽

Structure Factor ◽

Correlation Energy ◽

Gaussian Process Regression ◽

Many Body ◽

Transition Structure ◽

The Many ◽

Structure Factor Curve ◽

Body Correlation

Exchange Spin Coupling from Gaussian Process Regression

10.26434/chemrxiv.12589541.v3 ◽

2020 ◽

Author(s):

Marc Philipp Bahlke ◽

Natnael Mogos ◽

Jonny Proppe ◽

Carmen Herrmann

Keyword(s):

Machine Learning ◽

Gaussian Process ◽

Gaussian Process Regression ◽

Molecular Magnets ◽

Molecular Structures ◽

Spin Coupling ◽

Structure Property ◽

Data Set ◽

Uncertainty Estimates

Heisenberg exchange spin coupling between metal centers is essential for describing and understanding the electronic structure of many molecular catalysts, metalloenzymes, and molecular magnets for potential application in information technology. We explore the machine-learnability of exchange spin coupling, which has not been studied yet. We employ Gaussian process regression since it can potentially deal with small training sets (as likely associated with the rather complex molecular structures required for exploring spin coupling) and since it provides uncertainty estimates (“error bars”) along with predicted values. We compare a range of descriptors and kernels for 257 small dicopper complexes and find that a simple descriptor based on chemical intuition, consisting only of copper-bridge angles and copper-copper distances, clearly outperforms several more sophisticated descriptors when it comes to extrapolating towards larger experimentally relevant complexes. Exchange spin coupling is about as easy to learn as the polarizability, while learning dipole moments is much harder. The strength of the sophisticated descriptors lies in their ability to linearize structure-property relationships, to the point that a simple linear ridge regression performs just as well as the kernel-based machine-learning model for our small dicopper data set. The superior extrapolation performance of the simple descriptor is unique to exchange spin coupling, reinforcing the crucial role of choosing a suitable descriptor, and highlighting the interesting question of the role of chemical intuition vs. systematic or automated selection of features for machine learning in chemistry and materials science.

SAMPL6 Challenge Results from pKa Predictions Based on a General Gaussian Process Model

10.26434/chemrxiv.6406505.v2 ◽

2018 ◽

Author(s):

Caitlin C. Bannan ◽

David Mobley ◽

A. Geoff Skillman

Keyword(s):

Gaussian Process ◽

Process Model ◽

Molecular Graph ◽

Gaussian Process Regression ◽

Ionization State ◽

Training Set ◽

Physiochemical Properties ◽

Quantile Plots ◽

Physical And Chemical ◽

Good Agreement

A variety of fields would benefit from accurate pKa predictions, especially drug design, due to the effect a change in ionization state can have on a molecule's physicochemical properties. Participants in the recent SAMPL6 blind challenge were asked to submit predictions for microscopic and macroscopic pKas of 24 drug-like small molecules. We recently built a general model for predicting pKas using a Gaussian process regression trained on physical and chemical features of each ionizable group. Our pipeline takes a molecular graph and uses the OpenEye Toolkits to calculate features describing the removal of a proton. These features are fed into a Scikit-learn Gaussian process to predict microscopic pKas, which are then used to analytically determine macroscopic pKas. Our Gaussian process is trained on a set of 2,700 macroscopic pKas from monoprotic and select diprotic molecules. Here, we share our results for microscopic and macroscopic predictions in the SAMPL6 challenge. Overall, we ranked in the middle of the pack compared to other participants, but our fairly good agreement with experiment is still promising considering the challenge molecules are chemically diverse and often polyprotic while our training set is predominantly monoprotic. Of particular importance to us when building this model was to include an uncertainty estimate based on the chemistry of the molecule that would reflect the likely accuracy of our prediction. Our model reports large uncertainties for the molecules that appear to have chemistry outside our domain of applicability, along with good agreement in quantile-quantile plots, indicating it can predict its own accuracy. The challenge highlighted a variety of means to improve our model, including adding more polyprotic molecules to our training set and more carefully considering what functional groups we do or do not identify as ionizable.
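
The abstract says macroscopic pKas are determined analytically from microscopic ones but does not give the formula. As a hedged illustration only, the sketch below uses the textbook relation for the simplest case: a single protonation state that can lose one proton from any of several sites, so that the macroscopic dissociation constant is the sum of the microscopic ones. The function name is hypothetical and this is not necessarily the authors' procedure; molecules with several starting microstates need a fuller treatment.

```python
import math

def macro_pka_from_micro(micro_pkas):
    """Macroscopic pKa for losing one proton from a single protonation state.

    Assumes K_macro = sum of the microscopic K_a values of the competing sites
    (textbook relation; an assumption, not taken from the abstract).
    """
    k_macro = sum(10.0 ** -pka for pka in micro_pkas)
    return -math.log10(k_macro)

# One site: macroscopic and microscopic pKa coincide.
print(macro_pka_from_micro([4.0]))
# Two equivalent sites: the macroscopic pKa drops by log10(2).
print(macro_pka_from_micro([4.0, 4.0]))
```

The second call shows the statistical factor: two equivalent sites make the first macroscopic deprotonation easier than either microscopic step taken alone.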

Gaussian Process Regression for Estimating Wind Speed From X-band Marine Radar Images

OCEANS 2018 MTS/IEEE Charleston ◽

10.1109/oceans.2018.8604842 ◽

2018 ◽

Author(s):

Xinwei Chen ◽

Weimin Huang

Keyword(s):

Wind Speed ◽

Gaussian Process ◽

Gaussian Process Regression ◽

Radar Images ◽

X Band ◽

Marine Radar

Direct quantum dynamics using variational Gaussian wavepackets and Gaussian process regression

The Journal of Chemical Physics ◽

10.1063/1.5086358 ◽

2019 ◽

Vol 150 (4) ◽

pp. 041101 ◽

Cited By ~ 13

Author(s):

Iakov Polyak ◽

Gareth W. Richings ◽

Scott Habershon ◽

Peter J. Knowles

Keyword(s):

Gaussian Process ◽

Quantum Dynamics ◽

Gaussian Process Regression
