rasdaman: Array Databases Boost Spatio-Temporal Analytics

Array databases: concepts, standards, implementations

Journal Of Big Data ◽

10.1186/s40537-020-00399-2 ◽

2021 ◽

Vol 8 (1) ◽

Author(s):

Peter Baumann ◽

Dimitar Misev ◽

Vlad Merticariu ◽

Bang Pham Huu

Keyword(s):

Service Quality ◽

Ad Hoc ◽

Query Language ◽

Distributed Processing ◽

Database Systems ◽

Database Technology ◽

Comprehensive Survey ◽

Spatio Temporal ◽

And Performance ◽

Array Databases

Abstract: Multi-dimensional arrays (also known as raster data or gridded data) play a key role in many, if not all, science and engineering domains, where they typically represent spatio-temporal sensor, image, simulation output, or statistics “datacubes”. As classic database technology does not support arrays adequately, such data today are maintained mostly in silo solutions, with architectures that tend to erode and fail to keep up with the increasing requirements on performance and service quality. Array Database systems attempt to close this gap by providing declarative query support for flexible ad-hoc analytics on large n-D arrays, similar to what SQL offers on set-oriented data, XQuery on hierarchical data, and SPARQL and Cypher on graph data. Today, Petascale Array Database installations exist, employing massive parallelism and distributed processing. Hence, questions arise about the technology and standards available, usability, and overall maturity. Several papers have compared models and formalisms, and benchmarks have been undertaken as well, typically comparing two systems against each other. While each of these represents valuable research, to the best of our knowledge there is no comprehensive survey combining model, query language, architecture, practical usability, and performance aspects. The size of this comparison also differentiates our study: 19 systems are compared and four are benchmarked, to an extent and depth clearly exceeding previous papers in the field; for example, subsetting tests were designed so that systems cannot be tuned specifically to these queries. It is hoped that this gives a representative overview to all who want to immerse themselves in the field, as well as clear guidance to those who need to choose the best-suited datacube tool for their application. This article presents results of the Research Data Alliance (RDA) Array Database Assessment Working Group (ADA:WG), a subgroup of the Big Data Interest Group. It has elicited the state of the art in Array Databases, technically supported by IEEE GRSS and CODATA Germany, to answer the question: how can data scientists and engineers benefit from Array Database technology? As it turns out, Array Databases can offer significant advantages in terms of flexibility, functionality, extensibility, as well as performance and scalability. In total, the database approach of offering analysis-ready “datacubes” heralds a new level of service quality. Investigation shows that there is a lively ecosystem of technology with increasing uptake, and proven array analytics standards are in place. Consequently, such approaches have to be considered a serious option for datacube services in science, engineering and beyond. As it turns out, though, tools vary greatly in functionality and performance.


United in Variety: The EarthServer Datacube Federation

10.5194/egusphere-egu2020-10849 ◽

2020 ◽

Author(s):

Peter Baumann

Keyword(s):

Data Center ◽

Data Centers ◽

Distributed Data ◽

Temporal Data ◽

Atmospheric Data ◽

Elevation Data ◽

Technology Services ◽

Spatio Temporal ◽

Data Location ◽

Array Databases

Datacubes form an accepted cornerstone for analysis (and visualization) ready spatio-temporal data offerings. Beyond the multi-dimensional data structure, the paradigm also suggests rich services, abstracting away from the intractable zillions of files and products: actionable datacubes as established by Array Databases enable users to ask "any query, any time" without programming. The principle of location-transparent federations establishes a single, coherent information space. The EarthServer federation is a large, growing data center network offering Petabytes of a critical variety, such as radar and optical satellite data, atmospheric data, elevation data, and thematic cubes like global sea ice. Around CODE-DE and DIASs an ecosystem of data has been established that is available to users as a single pool, in particular for efficient distributed data fusion irrespective of data location. In our talk we present technology, services, and governance of this unique intercontinental line-up of data centers. A live demo will show distributed datacube fusion.


Managing Sparse Spatio-Temporal Data in SAVIME: an Evaluation of the Ph-tree Index

10.5753/sbbd.2021.17895 ◽

2021 ◽

Author(s):

Stiw Herrera ◽

Larissa Miguez da Silva ◽

Paulo Ricardo Reis ◽

Anderson Silva ◽

Fabio Porto

Keyword(s):

Sparse Data ◽

Scientific Data ◽

Efficient Implementation ◽

Temporal Data ◽

Indexing Structure ◽

Tree Index ◽

Spatio Temporal ◽

Data Ingestion ◽

Memory Indexing ◽

Array Databases

Scientific data is mainly multidimensional in nature, presenting interesting opportunities for optimization when managed by array databases. However, in scenarios where data is sparse, an efficient implementation is still required. In this paper, we investigate the adoption of the Ph-tree as an in-memory indexing structure for sparse data. We compare the performance in data ingestion and in both range and point queries, using SAVIME as the multidimensional array DBMS. Our experiments, using a real weather dataset, highlight the challenges involved in providing fast data ingestion, as proposed by SAVIME, while at the same time efficiently answering multidimensional queries on sparse data.
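
The two query types named in this abstract, range queries and point (punctual) queries over sparse multidimensional data, can be illustrated with a minimal Python sketch. This is not the Ph-tree and not SAVIME's API: the `SparseArray` class, its method names, and the sample values are hypothetical, and a plain dictionary with a linear scan stands in for the index purely to show what each query returns.

```python
from typing import Dict, Iterator, Optional, Tuple

Coord = Tuple[int, ...]


class SparseArray:
    """Hypothetical stand-in: stores only the non-empty cells of an n-D array."""

    def __init__(self, ndim: int) -> None:
        self.ndim = ndim
        self._cells: Dict[Coord, float] = {}

    def insert(self, coord: Coord, value: float) -> None:
        """Ingest one cell; empty cells are never materialized."""
        if len(coord) != self.ndim:
            raise ValueError("coordinate has wrong dimensionality")
        self._cells[coord] = value

    def point_query(self, coord: Coord) -> Optional[float]:
        """Return the value at exactly this coordinate, or None if empty."""
        return self._cells.get(coord)

    def range_query(self, low: Coord, high: Coord) -> Iterator[Tuple[Coord, float]]:
        """Yield cells inside the inclusive box [low, high].

        A linear scan over stored cells; a real index exists to avoid this.
        """
        for coord, value in self._cells.items():
            if all(lo <= c <= hi for lo, c, hi in zip(low, coord, high)):
                yield coord, value


# Made-up (time, lat, lon) cells, for illustration only.
arr = SparseArray(ndim=3)
arr.insert((0, 10, 20), 21.5)
arr.insert((1, 10, 21), 22.0)
arr.insert((5, 40, 40), 18.3)

hits = sorted(arr.range_query((0, 0, 0), (1, 15, 25)))
```

The dictionary makes ingestion and point lookups cheap but forces range queries to touch every stored cell, which is the kind of trade-off between fast ingestion and efficient multidimensional querying that the abstract describes.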


E3 ubiquitin ligases

Essays in Biochemistry ◽

10.1042/bse0410015 ◽

2005 ◽

Vol 41 ◽

pp. 15-30 ◽

Cited By ~ 123

Author(s):

Helen C. Ardley ◽

Philip A. Robinson

Keyword(s):

Protein Complexes ◽

Loss Of Function ◽

Direct Role ◽

Cellular Processes ◽

Substrate Protein ◽

Protein Ubiquitination ◽

C Terminus ◽

Full Complement ◽

Spatio Temporal ◽

Eukaryotic Organisms

The selectivity of the ubiquitin–26S proteasome system (UPS) for a particular substrate protein relies on the interaction between a ubiquitin-conjugating enzyme (E2, of which a cell contains relatively few) and a ubiquitin–protein ligase (E3, of which there are possibly hundreds). Post-translational modifications of the protein substrate, such as phosphorylation or hydroxylation, are often required prior to its selection. In this way, the precise spatio-temporal targeting and degradation of a given substrate can be achieved. The E3s are a large, diverse group of proteins, characterized by one of several defining motifs. These include a HECT (homologous to E6-associated protein C-terminus), RING (really interesting new gene) or U-box (a modified RING motif without the full complement of Zn2+-binding ligands) domain. Whereas HECT E3s have a direct role in catalysis during ubiquitination, RING and U-box E3s facilitate protein ubiquitination. These latter two E3 types act as adaptor-like molecules. They bring an E2 and a substrate into sufficiently close proximity to promote the substrate's ubiquitination. Although many RING-type E3s, such as MDM2 (murine double minute clone 2 oncoprotein) and c-Cbl, can apparently act alone, others are found as components of much larger multi-protein complexes, such as the anaphase-promoting complex. Taken together, these multifaceted properties and interactions enable E3s to provide a powerful, and specific, mechanism for protein clearance within all cells of eukaryotic organisms. The importance of E3s is highlighted by the number of normal cellular processes they regulate, and the number of diseases associated with their loss of function or inappropriate targeting.


Elucidating cyclic AMP signaling in subcellular domains with optogenetic tools and fluorescent biosensors

Biochemical Society Transactions ◽

10.1042/bst20190246 ◽

2019 ◽

Vol 47 (6) ◽

pp. 1733-1747 ◽

Cited By ~ 3

Author(s):

Christina Klausen ◽

Fabian Kaiser ◽

Birthe Stüven ◽

Jan N. Hansen ◽

Dagmar Wachten

Keyword(s):

Cyclic Amp ◽

Adenosine Monophosphate ◽

Adenylyl Cyclases ◽

Temporal Precision ◽

Fluorescent Biosensors ◽

Subcellular Domains ◽

Spatio Temporal ◽

Hydrolysis Of ◽

Shed Light ◽

Cyclic Amp Signaling

The second messenger 3′,5′-cyclic nucleoside adenosine monophosphate (cAMP) plays a key role in signal transduction across prokaryotes and eukaryotes. Cyclic AMP signaling is compartmentalized into microdomains to fulfil specific functions. To define the function of cAMP within these microdomains, signaling needs to be analyzed with spatio-temporal precision. To this end, optogenetic approaches and genetically encoded fluorescent biosensors are particularly well suited. Synthesis and hydrolysis of cAMP can be directly manipulated by photoactivated adenylyl cyclases (PACs) and light-regulated phosphodiesterases (PDEs), respectively. In addition, many biosensors have been designed to spatially and temporally resolve cAMP dynamics in the cell. This review provides an overview of optogenetic tools and biosensors to shed light on the subcellular organization of cAMP signaling.
