The Square Kilometre Array: challenges of distributed operations and big data rates

Author(s):  
Antonio Chrysostomou ◽  
Rosie C. Bolton ◽  
Gary R. Davis


Author(s):  
A. M. M. Scaife

Unlike optical telescopes, radio interferometers do not image the sky directly but require specialized image formation algorithms. For the Square Kilometre Array (SKA), the computational requirements of this image formation are extremely demanding due to the huge data rates produced by the telescope. This processing will be performed by the SKA Science Data Processor facilities and a network of SKA Regional Centres, which must not only deal with SKA-scale data volumes but also with stringent science-driven image fidelity requirements. This article is part of a discussion meeting issue ‘Numerical algorithms for high-performance computational science’.
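To make the image-formation step concrete, here is a minimal Python sketch of how a radio interferometer's "dirty image" is produced by gridding complex visibilities and inverse Fourier transforming. This is an illustration only, not SKA Science Data Processor code; the array sizes, baseline coordinates, and single point-source model are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical visibility set: (u, v) baseline coordinates in wavelengths,
# with complex visibilities from a model sky containing a single point
# source at the phase centre, for which V(u, v) = 1 everywhere.
n_vis = 5000
u = rng.uniform(-500, 500, n_vis)
v = rng.uniform(-500, 500, n_vis)
vis = np.ones(n_vis, dtype=complex)

# Grid the visibilities onto a regular uv-plane (nearest-neighbour gridding;
# production imagers use convolutional gridding with anti-aliasing kernels).
n_pix = 256
cell = 1000.0 / n_pix  # uv cell size in wavelengths
grid = np.zeros((n_pix, n_pix), dtype=complex)
iu = np.clip((u / cell + n_pix // 2).astype(int), 0, n_pix - 1)
iv = np.clip((v / cell + n_pix // 2).astype(int), 0, n_pix - 1)
np.add.at(grid, (iv, iu), vis)

# The "dirty image" is the inverse Fourier transform of the gridded data.
dirty = np.fft.fftshift(np.fft.ifft2(np.fft.ifftshift(grid))).real

# For a point source at the phase centre, the peak of the dirty image
# sits at the image centre.
peak = np.unravel_index(np.argmax(dirty), dirty.shape)
print(tuple(int(i) for i in peak))
```

Real SKA-scale imaging replaces every step here with distributed, kernel-corrected variants, but the grid-then-FFT structure is the computational core the abstract refers to.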


2014 ◽  
Vol 10 (S306) ◽  
pp. 304-306
Author(s):  
S. G. Murray ◽  
C. Power ◽  
A. S. G. Robotham

Abstract: The coming decade will witness a deluge of data from next-generation galaxy surveys such as the Square Kilometre Array and Euclid. How can we optimally and robustly analyse these data to maximise the scientific return from these surveys? Here we discuss recent work in developing both the conceptual and software frameworks for carrying out such analyses, and their application to the dark matter halo mass function (HMF). We summarise what we have learned about the HMF from the last 10 years of precision CMB data using the open-source HMFcalc framework, before discussing how this framework is being extended to the full Halo Model.
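As a hedged illustration of the kind of quantity involved, the following toy Python sketch evaluates a Press-Schechter-style halo mass function with an assumed power-law mass-variance relation sigma(M). HMFcalc itself derives sigma(M) by integrating a CMB-calibrated matter power spectrum, so every numerical choice below is illustrative only.

```python
import numpy as np

# Toy Press-Schechter mass function:
#   dn/dlnM = f(sigma) * (rho_mean / M) * |dln(sigma)/dlnM|
# with f_PS(sigma) = sqrt(2/pi) * (delta_c/sigma) * exp(-delta_c^2 / (2 sigma^2)).

delta_c = 1.686            # spherical-collapse overdensity threshold
rho_mean = 2.78e11 * 0.3   # mean matter density, Msun/Mpc^3, for Omega_m ~ 0.3
M_star = 1e13              # assumed non-linear mass scale, Msun

def sigma(M):
    # Toy power law; real codes integrate the matter power spectrum.
    # Implies dln(sigma)/dlnM = -1/3.
    return (M / M_star) ** (-1.0 / 3.0)

def f_ps(s):
    return np.sqrt(2.0 / np.pi) * (delta_c / s) * np.exp(-delta_c**2 / (2.0 * s**2))

def dndlnM(M):
    return f_ps(sigma(M)) * (rho_mean / M) * (1.0 / 3.0)

M = np.logspace(10, 15, 6)   # 1e10 to 1e15 Msun
n = dndlnM(M)
# The mass function falls steeply towards cluster masses.
print(n[0] > n[-1])
```

The exponential cut-off at high mass is what makes the HMF such a sensitive probe of cosmological parameters, which is why CMB-calibrated inputs matter.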


Galaxies ◽  
2018 ◽  
Vol 6 (4) ◽  
pp. 120 ◽  
Author(s):  
Jamie Farnes ◽  
Ben Mort ◽  
Fred Dulwich ◽  
Stef Salvini ◽  
Wes Armour

The Square Kilometre Array (SKA) will be both the largest radio telescope ever constructed and the largest Big Data project in the known Universe. The first phase of the project will generate on the order of five zettabytes of data per year. A critical task for the SKA will be its ability to process data for science, which will need to be conducted by science pipelines. Together with polarization data from the LOFAR Multifrequency Snapshot Sky Survey (MSSS), we have been developing a realistic SKA-like science pipeline that can handle the large data volumes generated by LOFAR at 150 MHz. The pipeline uses task-based parallelism to image, detect sources and perform Faraday tomography across the entire LOFAR sky. The project thereby provides a unique opportunity to contribute to the technological development of the SKA telescope, while simultaneously enabling cutting-edge scientific results. In this paper, we provide an update on current efforts to develop a science pipeline that can enable tight constraints on the magnetised large-scale structure of the Universe.
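The task-based parallelism the abstract describes can be sketched in miniature: independent sky fields flow through imaging, source-finding, and Faraday-tomography stages, with fields processed concurrently. The stage functions below are hypothetical stand-ins (toy arithmetic, not LOFAR processing), and a real deployment would distribute tasks across cluster nodes rather than local threads.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical per-field pipeline stages; in the real pipeline these would
# be imaging, source finding, and RM synthesis over LOFAR visibilities.
def image_field(field_id):
    return {"field": field_id, "pixels": field_id * 100}

def find_sources(image):
    return {"field": image["field"], "n_sources": image["pixels"] // 10}

def faraday_tomography(catalogue):
    return (catalogue["field"], catalogue["n_sources"])

def process_field(field_id):
    # Each field flows through the three stages as one task chain;
    # independent fields run concurrently, which is the essence of
    # task-based parallelism.
    return faraday_tomography(find_sources(image_field(field_id)))

with ThreadPoolExecutor(max_workers=4) as pool:
    results = sorted(pool.map(process_field, range(1, 9)))

print(results[0])
```

Chaining the stages per field, rather than barrier-synchronising each stage across all fields, keeps workers busy and scales naturally as the number of fields grows, which is the property an SKA-scale pipeline needs.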


2020 ◽  
pp. 1096-1111
Author(s):  
Aqeel ur Rehman ◽  
Muhammad Fahad ◽  
Rafi Ullah ◽  
Faisal Abdullah

This article describes how data management is a major issue in the IoT because of communication among billions of electronic devices, which generate huge datasets. In the absence of any common standard, analysing such a large amount of data is a complex task. A characterisation of IoT-based data is needed to establish what is available and which solutions are applicable. Such a study also highlights the need for new techniques to cope with these challenges. The heterogeneity of connected nodes, and their differing data rates and formats, makes dealing with such a variety of data a major challenge. Because the IoT provides processing capacity in the form of smart nodes, it also presents a good platform to support big data analysis. In this article, the characteristics of big data and the requirements for big data analysis are highlighted. The challenges associated with data generation, as well as plausible platforms for analysing such huge datasets, are also outlined. The application of the IoT to support big data analysis in healthcare is also presented.
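The format-heterogeneity challenge can be illustrated with a small Python sketch that normalises two hypothetical device payloads, one JSON and one CSV-like, into a common record schema. The device names, field layouts, and unit conversion here are assumptions for demonstration, not any IoT standard.

```python
import json

# Two hypothetical device payloads in different formats for the same kind
# of measurement: a JSON message in Celsius and a CSV line in Fahrenheit.
json_msg = '{"device": "th-01", "temp_c": 21.5, "ts": 1700000000}'
csv_msg = "th-02,72.5,F,1700000060"

def normalise_json(raw):
    m = json.loads(raw)
    return {"device": m["device"], "temp_c": m["temp_c"], "ts": m["ts"]}

def normalise_csv(raw):
    device, value, unit, ts = raw.split(",")
    temp = float(value)
    if unit == "F":
        temp = (temp - 32.0) * 5.0 / 9.0  # convert Fahrenheit to Celsius
    return {"device": device, "temp_c": round(temp, 2), "ts": int(ts)}

# Normalising into one schema is what makes downstream big data analysis
# across heterogeneous nodes tractable.
records = [normalise_json(json_msg), normalise_csv(csv_msg)]
print(records[1]["temp_c"])
```

At IoT scale this normalisation step would itself run on the smart nodes the abstract mentions, so that only schema-consistent records reach the analysis platform.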


2012 ◽  
Vol 29 (3) ◽  
pp. 371-381 ◽  
Author(s):  
M. Whiting ◽  
B. Humphreys

Abstract: The Australian Square Kilometre Array Pathfinder (ASKAP) presents a number of challenges in the area of source finding and cataloguing. The data rates and image sizes are very large, and require automated processing in a high-performance computing environment. This requires the development of new tools that are able to operate in such an environment and can reliably handle large datasets. These tools must also be able to accommodate the different types of observations ASKAP will make: continuum imaging, spectral-line imaging, and transient imaging. The ASKAP project has developed a source finder known as Selavy, built upon the Duchamp source finder. Selavy incorporates a number of new features, which we describe here.

Since distributed processing of large images and cubes will be essential, we describe the algorithms used to distribute the data, find an appropriate threshold, search to that threshold, and form the final source catalogue. We describe the algorithm used to define a varying threshold that responds to the local, rather than global, noise conditions, and provide examples of its use. We also discuss the approach used to apply two-dimensional fits to detected sources, enabling more accurate parameterisation. These new features are compared for timing performance, where we show that their impact on the pipeline processing will be small, providing room for enhanced algorithms.

Finally, we discuss the development process for ASKAP source-finding software. By the time of ASKAP operations, the ASKAP science community, through the Survey Science Projects, will have contributed important elements of the source-finding pipeline, and the mechanisms by which this will be done are presented.
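The idea of a detection threshold that responds to local rather than global noise can be sketched as follows. This is a crude stand-in for the algorithm the abstract describes, using a median-absolute-deviation noise estimate in non-overlapping boxes on a synthetic image; the image size, box size, sigma cut, and source positions are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic image: Gaussian noise whose rms doubles in the right half,
# plus one bright "source" in each half.
img = rng.normal(0.0, 1.0, (100, 100))
img[:, 50:] *= 2.0
img[30, 20] += 20.0
img[70, 80] += 40.0

def local_threshold(image, box=25, nsigma=5.0):
    """Per-pixel threshold of nsigma times the local rms, estimated in
    non-overlapping boxes. A global threshold would either miss sources
    in the noisy half or flood the quiet half with false detections."""
    thresh = np.empty_like(image)
    for i in range(0, image.shape[0], box):
        for j in range(0, image.shape[1], box):
            tile = image[i:i + box, j:j + box]
            # Median absolute deviation: robust to the sources themselves.
            mad = np.median(np.abs(tile - np.median(tile)))
            thresh[i:i + box, j:j + box] = nsigma * 1.4826 * mad
    return thresh

detections = img > local_threshold(img)
print(int(detections.sum()))
```

Both injected sources exceed their own local 5-sigma cut even though the right half is twice as noisy, which is the behaviour a spatially varying threshold is designed to deliver.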


2017 ◽  
pp. 383-397
Author(s):  
Aqeel-ur Rehman ◽  
Rafi Ullah ◽  
Faisal Abdullah

In the IoT, data management is a major problem due to the connectivity of billions of devices, objects, and processes generating big data. Since these Things do not follow any specific (common) standard, analysis of their data becomes a significant challenge. There is a need to characterise IoT-based data in order to identify the available and applicable solutions. Such a study also points to the need for new techniques to cope with these challenges. The heterogeneity of connected nodes, and their differing data rates and formats, makes handling such a variety of data a huge challenge. Because the IoT provides large numbers of processing nodes in the form of smart nodes, it presents itself as a good platform for big data analysis. In this chapter, the characteristics of big data and the requirements for big data analysis are highlighted. Considering the IoT as a major source of data generation, as well as a plausible platform for analysing such huge volumes of data, the associated challenges are also underlined.

