Foreground modelling via Gaussian process regression: an application to HERA data

2020 ◽  
Vol 495 (3) ◽  
pp. 2813-2826
Author(s):  
Abhik Ghosh ◽  
Florent Mertens ◽  
Gianni Bernardi ◽  
Mário G Santos ◽  
Nicholas S Kern ◽  
...  

ABSTRACT The key challenge in the observation of the redshifted 21-cm signal from cosmic reionization is its separation from the much brighter foreground emission. Such separation relies on the different spectral properties of the two components, although, in real life, the foreground intrinsic spectrum is often corrupted by the instrumental response, inducing systematic effects that can further jeopardize the measurement of the 21-cm signal. In this paper, we use Gaussian Process Regression to model both foreground emission and instrumental systematics in ∼2 h of data from the Hydrogen Epoch of Reionization Array. We find that a simple covariance model with three components matches the data well, giving a residual power spectrum with white noise properties. The components are an ‘intrinsic’ and an instrumentally corrupted foreground component with coherence scales of 20 and 2.4 MHz, respectively (dominating the line-of-sight power spectrum over scales k∥ ≤ 0.2 h cMpc−1), and a baseline-dependent periodic signal with a period of ∼1 MHz (dominating over k∥ ∼ 0.4–0.8 h cMpc−1), which should be distinguishable from the 21-cm Epoch of Reionization signal, whose typical coherence scale is ∼0.8 MHz.
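The three-component covariance described above can be illustrated with a minimal sketch. It assumes squared-exponential kernels for the two smooth components and a standard exp-sine-squared kernel for the periodic one; the abstract does not state the functional forms, and the variances, noise level, and function names below are placeholders rather than values from the paper.

```python
import math

def rbf(dnu, scale):
    """Squared-exponential covariance with coherence scale `scale` in MHz (assumed form)."""
    return math.exp(-0.5 * (dnu / scale) ** 2)

def periodic(dnu, period, smooth=1.0):
    """Exp-sine-squared covariance with period `period` in MHz (assumed form)."""
    return math.exp(-2.0 * math.sin(math.pi * dnu / period) ** 2 / smooth ** 2)

def covariance(freqs, var_int=1.0, var_inst=0.1, var_per=0.01):
    """Sum of three components: 20 MHz, 2.4 MHz, and a 1 MHz periodic term.

    The scales come from the abstract; the variances are made-up placeholders.
    """
    n = len(freqs)
    return [[var_int * rbf(freqs[i] - freqs[j], 20.0)
             + var_inst * rbf(freqs[i] - freqs[j], 2.4)
             + var_per * periodic(freqs[i] - freqs[j], 1.0)
             for j in range(n)] for i in range(n)]

def cholesky_solve(a, b):
    """Solve a x = b for a symmetric positive-definite matrix a."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            low[i][j] = math.sqrt(s) if i == j else s / low[j][j]
    y = [0.0] * n
    for i in range(n):  # forward substitution
        y[i] = (b[i] - sum(low[i][k] * y[k] for k in range(i))) / low[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        x[i] = (y[i] - sum(low[k][i] * x[k] for k in range(i + 1, n))) / low[i][i]
    return x

def foreground_fit(freqs, data, noise_var=1e-3):
    """GP posterior mean of the modelled components along frequency."""
    n = len(freqs)
    k_fg = covariance(freqs)
    k_tot = [[k_fg[i][j] + (noise_var if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    alpha = cholesky_solve(k_tot, data)
    return [sum(k_fg[i][j] * alpha[j] for j in range(n)) for i in range(n)]
```

Subtracting `foreground_fit(freqs, data)` from `data` gives the residual whose power spectrum one would then inspect. A real analysis would fit the kernel hyperparameters to the data rather than fix them by hand.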

2017 ◽  
Vol 12 (S333) ◽  
pp. 12-17
Author(s):  
Kanan K. Datta ◽  
Rajesh Mondal ◽  
Raghunath Ghara ◽  
Somnath Bharadwaj ◽  
T. Roy Choudhury

Abstract The redshifted HI 21-cm signal from the cosmic dawn (CD) and epoch of reionization (EoR) evolves considerably along the line of sight (LoS). We study the impact of this evolution (the so-called light cone, or LC, effect) on the HI 21-cm power spectrum. We find that the LC effect has a significant impact on the 3D power spectrum, changing it by up to a factor of a few. The LC effect is particularly strong during the cosmic dawn near the ‘peaks’ and ‘dips’ in the power spectrum when plotted against redshift. We also show that the 3D power spectrum, which fully describes an ergodic and periodic signal, loses some information about the second-order statistics of the signal, because the EoR/CD 21-cm signal is non-ergodic and non-periodic along the line of sight. We show that the multi-frequency angular power spectrum (MAPS) $\mathcal{C}_{\ell}(\nu_1, \nu_2)$ captures all the information about the second-order statistics of the signal even in the presence of the LC effect.
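For reference, the MAPS is conventionally defined from the spherical-harmonic coefficients $a_{\ell m}(\nu)$ of the brightness temperature map at each frequency. The notation below is the standard one and is not quoted from the paper:

```latex
\mathcal{C}_{\ell}(\nu_1, \nu_2) = \left\langle a_{\ell m}(\nu_1)\, a^{*}_{\ell m}(\nu_2) \right\rangle
```

If the signal were ergodic along the line of sight, $\mathcal{C}_{\ell}$ would depend only on the separation $|\nu_1 - \nu_2|$. Keeping both frequencies as arguments is what lets it retain information that the evolution along the light cone introduces.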


2020 ◽  
Vol 501 (1) ◽  
pp. 1463-1480
Author(s):  
Nicholas S Kern ◽  
Adrian Liu

ABSTRACT One of the primary challenges in enabling the scientific potential of 21 cm intensity mapping at the epoch of reionization (EoR) is the separation of astrophysical foreground contamination. Recent works have claimed that Gaussian process regression (GPR) can robustly perform this separation, particularly at low Fourier k wavenumbers where the EoR signal reaches its peak signal-to-noise ratio. We revisit this topic by casting GPR foreground subtraction (GPR-FS) into the quadratic estimator formalism, thereby putting its statistical properties on stronger theoretical footing. We find that GPR-FS can distort the window functions at these low k modes, which, without proper decorrelation, make it difficult to probe the EoR power spectrum. Incidentally, we also show that GPR-FS is in fact closely related to the widely studied inverse covariance weighting of the optimal quadratic estimator. As a case study, we look at recent power spectrum upper limits from the Low-Frequency Array (LOFAR) that utilized GPR-FS. We pay close attention to their normalization scheme, showing that it is particularly sensitive to signal loss when the EoR covariance is misestimated. This has possible ramifications for recent astrophysical interpretations of the LOFAR limits, because many of the EoR models ruled out do not fall within the bounds of the covariance models explored by LOFAR. Being more robust to this bias, we conclude that the quadratic estimator is a more natural framework for implementing GPR-FS and computing the 21 cm power spectrum.
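The link to inverse covariance weighting can be seen with one line of algebra. As a sketch, assume the data vector $d$ along frequency has total covariance $K = K_{\rm fg} + K_{\rm r}$, where $K_{\rm fg}$ is the foreground model and $K_{\rm r}$ collects everything else (EoR signal and noise). These symbols are illustrative and are not taken from the paper. Subtracting the GPR posterior mean of the foregrounds gives the residual

```latex
r = d - K_{\rm fg} K^{-1} d = \left(K - K_{\rm fg}\right) K^{-1} d = K_{\rm r}\, K^{-1} d
```

so the residual is the inverse-covariance-weighted data $K^{-1} d$ multiplied by $K_{\rm r}$.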


2017 ◽  
Vol 12 (S333) ◽  
pp. 284-287
Author(s):  
F. Mertens ◽  
A. Ghosh ◽  
L. V. E. Koopmans

Abstract Direct detection of the Epoch of Reionization via the redshifted 21-cm line will have unprecedented implications for the study of structure formation in the early Universe. To fulfill this promise, current and future 21-cm experiments will need to detect the weak 21-cm signal over foregrounds several orders of magnitude brighter. This requires accurate modeling of the galactic and extragalactic emission and of its contaminants due to instrument chromaticity, the ionosphere, and imperfect calibration. To address this complex modeling problem, we propose a new method based on Gaussian Process Regression (GPR) which is able to cleanly separate the cosmological signal from most of the foreground contaminants. We also propose a new imaging method based on a maximum likelihood framework which solves the interferometric equation directly on the sphere. Using this method, chromatic effects causing the so-called “wedge” are effectively eliminated (i.e. deconvolved) in the cylindrical (k⊥, k∥) power spectrum.


Author(s):  
P. Procopio ◽  
R. B. Wayth ◽  
J. Line ◽  
C. M. Trott ◽  
H. T. Intema ◽  
...  

Abstract The current generation of experiments aiming to detect the neutral hydrogen signal from the Epoch of Reionisation (EoR) is likely to be limited by systematic effects associated with removing foreground sources from target fields. In this paper, we develop a model for the compact foreground sources in one of the target fields of the MWA’s EoR key science experiment: the ‘EoR1’ field. The model is based on both the MWA’s GLEAM survey and GMRT 150 MHz data from the TGSS survey, the latter providing higher angular resolution and better astrometric accuracy for compact sources than is available from the MWA alone. The model contains 5 049 sources, some of which have complicated morphology in MWA data, Fornax A being the most complex. The higher resolution data show that 13% of sources that appear point-like to the MWA have complicated morphology such as double and quad structure, with a typical separation of 33 arcsec. We derive an analytic expression for the error introduced into the EoR two-dimensional power spectrum due to peeling close double sources as single point sources and show that for the measured source properties, the error in the power spectrum is confined to high k⊥ modes that do not affect the overall result for the large-scale cosmological signal of interest. The brightest 10 mis-modelled sources in the field contribute 90% of the power bias in the data, suggesting that it is most critical to improve the models of the brightest sources. With this hybrid model, we reprocess data from the EoR1 field and show a maximum of 8% improved calibration accuracy and a factor of two reduction in residual power in k-space from peeling these sources. Implications for future EoR experiments including the SKA are discussed in relation to the improvements obtained.


Author(s):  
Paula S Soares ◽  
Catherine A Watkinson ◽  
Steven Cunnington ◽  
Alkistis Pourtsidou

Abstract We apply for the first time Gaussian Process Regression (GPR) as a foreground removal technique in the context of single-dish, low redshift H i intensity mapping, and present an open-source Python toolkit for doing so. We use MeerKAT and SKA1-MID-like simulations of 21 cm foregrounds (including polarisation leakage), H i cosmological signal and instrumental noise. We find that it is possible to use GPR as a foreground removal technique in this context, and that it is better suited in some cases to recover the H i power spectrum than Principal Component Analysis (PCA), especially on small scales. GPR is especially good at recovering the radial power spectrum, outperforming PCA when considering the full bandwidth of our data. Both methods are worse at recovering the transverse power spectrum, since they rely on frequency-only covariance information. When halving our data along frequency, we find that GPR performs better in the low frequency range, where foregrounds are brighter. It performs worse than PCA when frequency channels are removed to emulate RFI flagging. We conclude that GPR is an excellent foreground removal option for the case of single-dish, low redshift H i intensity mapping in the absence of missing frequency channels. Our Python toolkit gpr4im and the data used in this analysis are publicly available on GitHub.


2020 ◽  
Author(s):  
Marc Philipp Bahlke ◽  
Natnael Mogos ◽  
Jonny Proppe ◽  
Carmen Herrmann

Heisenberg exchange spin coupling between metal centers is essential for describing and understanding the electronic structure of many molecular catalysts, metalloenzymes, and molecular magnets for potential application in information technology. We explore the machine-learnability of exchange spin coupling, which has not been studied yet. We employ Gaussian process regression since it can potentially deal with small training sets (as likely associated with the rather complex molecular structures required for exploring spin coupling) and since it provides uncertainty estimates (“error bars”) along with predicted values. We compare a range of descriptors and kernels for 257 small dicopper complexes and find that a simple descriptor based on chemical intuition, consisting only of copper-bridge angles and copper-copper distances, clearly outperforms several more sophisticated descriptors when it comes to extrapolating towards larger experimentally relevant complexes. Exchange spin coupling is similarly easy to learn as the polarizability, while learning dipole moments is much harder. The strength of the sophisticated descriptors lies in their ability to linearize structure-property relationships, to the point that a simple linear ridge regression performs just as well as the kernel-based machine-learning model for our small dicopper data set. The superior extrapolation performance of the simple descriptor is unique to exchange spin coupling, reinforcing the crucial role of choosing a suitable descriptor, and highlighting the interesting question of the role of chemical intuition vs. systematic or automated selection of features for machine learning in chemistry and material science.


2018 ◽  
Author(s):  
Caitlin C. Bannan ◽  
David Mobley ◽  
A. Geoff Skillman

A variety of fields would benefit from accurate pKa predictions, especially drug design, due to the effect a change in ionization state can have on a molecule's physicochemical properties. Participants in the recent SAMPL6 blind challenge were asked to submit predictions for microscopic and macroscopic pKas of 24 drug-like small molecules. We recently built a general model for predicting pKas using a Gaussian process regression trained using physical and chemical features of each ionizable group. Our pipeline takes a molecular graph and uses the OpenEye Toolkits to calculate features describing the removal of a proton. These features are fed into a Scikit-learn Gaussian process to predict microscopic pKas, which are then used to analytically determine macroscopic pKas. Our Gaussian process is trained on a set of 2,700 macroscopic pKas from monoprotic and select diprotic molecules. Here, we share our results for microscopic and macroscopic predictions in the SAMPL6 challenge. Overall, we ranked in the middle of the pack compared to other participants, but our fairly good agreement with experiment is still promising considering the challenge molecules are chemically diverse and often polyprotic while our training set is predominantly monoprotic. Of particular importance to us when building this model was to include an uncertainty estimate based on the chemistry of the molecule that would reflect the likely accuracy of our prediction. Our model reports large uncertainties for the molecules that appear to have chemistry outside our domain of applicability, along with good agreement in quantile-quantile plots, indicating it can predict its own accuracy. The challenge highlighted a variety of means to improve our model, including adding more polyprotic molecules to our training set and more carefully considering what functional groups we do or do not identify as ionizable.
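The abstract does not give the expressions used to go from microscopic to macroscopic pKas. As a sketch only, the textbook relations for the two simplest cases can be written as below; the function names are hypothetical and this is not the authors' pipeline.

```python
import math

def macro_pka_deprotonation(micro_pkas):
    """Macroscopic pKa when one species can lose a proton from several sites.

    The microscopic equilibrium constants add: Ka = sum(ka_i).
    """
    return -math.log10(sum(10.0 ** -pka for pka in micro_pkas))

def macro_pka_convergent(micro_pkas):
    """Macroscopic pKa when several tautomers lose a proton to one common species.

    The reciprocals of the microscopic constants add: 1/Ka = sum(1/ka_i).
    """
    return math.log10(sum(10.0 ** pka for pka in micro_pkas))
```

For a symmetric diprotic acid with two equivalent sites, these reduce to the familiar statistical shifts of minus and plus log10(2) relative to the microscopic value.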


2019 ◽  
Vol 150 (4) ◽  
pp. 041101 ◽  
Author(s):  
Iakov Polyak ◽  
Gareth W. Richings ◽  
Scott Habershon ◽  
Peter J. Knowles
