Compressed domain-specific data processing and analysis

Abstract Background: GPS-based cycling data are increasingly available for traffic planning. However, the recorded data often contain more than bicycle trips alone. Examples include tracks recorded while using modes of transport other than the bicycle, and long periods at workplaces during which tracking continues. Collected bicycle GPS data therefore need to be processed adequately before they can be used for transportation planning. Results: The article presents a multi-level approach to bicycle-specific data processing. The data processing model comprises several steps (data filtering, smoothing, trip segmentation, transport mode recognition, driving mode detection) to obtain a correct data set that contains only bicycle trips. Validation shows a sound accuracy of the model in its current state (82–88%).
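The processing order named in the abstract (filtering, smoothing, trip segmentation, transport mode recognition) can be illustrated with a minimal Python sketch. This is not the authors' model: the `Fix` record, every threshold, and the rule of classifying a trip by its mean speed are assumptions made for illustration only, and driving mode detection is left out.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Fix:
    t: float      # seconds since the start of recording
    speed: float  # metres per second


# All thresholds below are hypothetical, not taken from the article.
MAX_PLAUSIBLE_SPEED = 40.0     # m/s; faster fixes are treated as GPS errors
GAP_SECONDS = 300.0            # a longer pause starts a new trip
BIKE_SPEED_RANGE = (2.0, 10.0) # m/s; assumed mean-speed band for cycling


def filter_fixes(fixes):
    """Step 1: drop fixes with an implausible speed."""
    return [f for f in fixes if 0.0 <= f.speed <= MAX_PLAUSIBLE_SPEED]


def smooth(fixes, window=3):
    """Step 2: centred moving average over the speed values."""
    half = window // 2
    smoothed = []
    for i, f in enumerate(fixes):
        neighbours = fixes[max(0, i - half): i + half + 1]
        smoothed.append(Fix(f.t, mean(n.speed for n in neighbours)))
    return smoothed


def segment(fixes, gap=GAP_SECONDS):
    """Step 3: split the track wherever the time gap exceeds `gap`."""
    trips, current = [], []
    for f in fixes:
        if current and f.t - current[-1].t > gap:
            trips.append(current)
            current = []
        current.append(f)
    if current:
        trips.append(current)
    return trips


def is_bicycle(trip):
    """Step 4: crude mode recognition from the mean speed of a trip."""
    low, high = BIKE_SPEED_RANGE
    return low <= mean(f.speed for f in trip) <= high


def bicycle_trips(fixes):
    """Run the steps in the order listed in the abstract."""
    ordered = sorted(fixes, key=lambda f: f.t)
    cleaned = smooth(filter_fixes(ordered))
    return [trip for trip in segment(cleaned) if is_bicycle(trip)]
```

Calling `bicycle_trips` on a list of `Fix` records returns only the segments whose mean speed falls inside the assumed cycling band. Because smoothing runs before segmentation here, values next to a long gap are averaged with fixes from the neighbouring trip; a real pipeline would probably need to handle that.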

A Meta-Mining Ontology Framework for Data Processing

International Journal of Embedded and Real-Time Communication Systems ◽

10.4018/ijertcs.2021040103 ◽

2021 ◽

Vol 12 (2) ◽

pp. 37-56

Author(s):

Man Tianxing ◽

Nataly Zhukova ◽

Alexander Vodyaho ◽

Tin Tun Aung

Keyword(s):

Data Mining ◽

Data Processing ◽

Data Streams ◽

Specific Data ◽

Ontology Merging ◽

Knowledge Models ◽

Task Requirements ◽

Ontology Extraction ◽

Merging Method ◽

And Task

Extracting knowledge from data streams received from observed objects through data mining is required in various domains. However, there is little guidance on which techniques can or should be used in which contexts. Meta mining technology can help build data processing workflows based on knowledge models, taking into account the specific features of the objects. This paper proposes a meta mining ontology framework that allows selecting algorithms for solving specific data mining tasks and building suitable processes. The proposed ontology is constructed using existing ontologies and is extended with an ontology of data characteristics and task requirements. Unlike existing ontologies, the proposed ontology describes the overall data mining process, can be used to build data processing workflows in various domains, and has low computational complexity compared to others. The authors developed an ontology merging method and a sub-ontology extraction method, which are implemented on top of the OWL API by extracting and integrating the relevant axioms.

DISSECT: DISentangle SharablE ConTent for Multimodal Integration and Crosswise-mapping

10.1101/2020.09.04.283234 ◽

2020 ◽

Author(s):

Geoffrey Schau ◽

Erik Burlingame ◽

Young Hwan Chang

Keyword(s):

Deep Learning ◽

Complete Information ◽

Specific Information ◽

Multimodal Integration ◽

Specific Data ◽

Domain Specific ◽

Cross Domain ◽

Input Feature ◽

Novel Approach ◽

Latent Representations

Abstract Deep learning systems have emerged as powerful mechanisms for learning domain translation models. However, in many cases, complete information in one domain is assumed to be necessary for sufficient cross-domain prediction. In this work, we motivate a formal justification for domain-specific information separation in a simple linear case and illustrate that a self-supervised approach enables translation between data domains while filtering out domain-specific data features. We introduce a novel approach to identify domain-specific information from sets of unpaired measurements in complementary data domains by considering a deep learning cross-domain autoencoder architecture designed to learn shared latent representations of data while enabling domain translation. We introduce an orthogonal gate block designed to enforce orthogonality of input feature sets by explicitly removing non-sharable information specific to each domain, and illustrate separability of domain-specific information on a toy dataset.
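The linear-algebra idea behind removing a domain-specific component can be shown with a small Python sketch. This is not the orthogonal gate block from the paper, whose design is not given in the abstract. It assumes the domain-specific direction `d` is already known, which the paper's setting does not, and the function names are hypothetical.

```python
def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def remove_component(x, d):
    """Return x minus its orthogonal projection onto d.

    `d` stands in for an assumed, known domain-specific direction
    and must be non-zero. The result is orthogonal to `d`.
    """
    scale = dot(x, d) / dot(d, d)
    return [xi - scale * di for xi, di in zip(x, d)]
```

For a feature vector `x` and a direction `d`, `remove_component(x, d)` keeps only the part of `x` that carries no information along `d`.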

Effect of bodyside-specific data processing on the results of fish morphometric studies

Fundamental and Applied Limnology / Archiv für Hydrobiologie ◽

10.1127/fal/2018/1159 ◽

2018 ◽

Author(s):

Péter Takács ◽

Árpád Ferincz ◽

Ádám Staszny ◽

Zoltán Vitál

Keyword(s):

Data Processing ◽

Specific Data

The Berlin Big Data Center (BBDC)

it - Information Technology ◽

10.1515/itit-2018-0016 ◽

2018 ◽

Vol 60 (5-6) ◽

pp. 321-326 ◽

Cited By ~ 1

Author(s):

Christoph Boden ◽

Tilmann Rabl ◽

Volker Markl

Keyword(s):

Big Data ◽

Data Analysis ◽

Data Processing ◽

Deep Understanding ◽

Automatic Parallelization ◽

Second Phase ◽

Distributed Data ◽

Domain Specific ◽

Distributed Data Processing ◽

Large Groups

Abstract The last decade has been characterized by the collection and availability of unprecedented amounts of data due to rapidly decreasing storage costs and the omnipresence of sensors and data-producing global online services. To process and analyze this data deluge, novel distributed data processing systems resting on the data flow paradigm, such as Apache Hadoop, Apache Spark, or Apache Flink, were built and have been scaled to tens of thousands of machines. However, writing efficient implementations of data analysis programs on these systems requires a deep understanding of systems programming, which prevents large groups of data scientists and analysts from using this technology efficiently. In this article, we present some of the main achievements of the research carried out by the Berlin Big Data Center (BBDC). We introduce the two domain-specific languages Emma and LARA, which are deeply embedded in Scala and enable declarative specification and automatic parallelization of data analysis programs; the PEEL Framework for transparent and reproducible benchmark experiments of distributed data processing systems; and approaches to foster the interpretability of machine learning models. Finally, we provide an overview of the challenges to be addressed in the second phase of the BBDC.

Domain-specific data mining for residents' transit pattern retrieval from incomplete information

Journal of Network and Computer Applications ◽

10.1016/j.jnca.2019.02.016 ◽

2019 ◽

Vol 134 ◽

pp. 62-71 ◽

Cited By ~ 2

Author(s):

Yongxin Liu ◽

Jianqiang Li ◽

Zhong Ming ◽

Houbing Song ◽

Xiaoxiong Weng ◽

...

Keyword(s):

Data Mining ◽

Incomplete Information ◽

Specific Data ◽

Domain Specific ◽

Pattern Retrieval

A Domain Specific Data Management Architecture for Protein Structure Data

2006 International Conference of the IEEE Engineering in Medicine and Biology Society ◽

10.1109/iembs.2006.259892 ◽

2006 ◽

Cited By ~ 7

Author(s):

Yanchao Wang ◽

Rajshekhar Sunderraman ◽

Hao Tian

Keyword(s):

Protein Structure ◽

Data Management ◽

Structure Data ◽

Specific Data ◽

Domain Specific ◽

Management Architecture ◽

Protein Structure Data

Building partnerships among social science researchers, institution‐based repositories and domain specific data archives

OCLC Systems & Services ◽

10.1108/10650750710720757 ◽

2007 ◽

Vol 23 (1) ◽

pp. 35-53 ◽

Cited By ~ 17

Author(s):

Ann G. Green ◽

Myron P. Gutmann

Keyword(s):

Social Science ◽

Specific Data ◽

Domain Specific ◽

Data Archives

Semantic Information Extraction on Domain Specific Data Sheets

Lecture Notes in Computer Science - The Semantic Web: Trends and Challenges ◽

10.1007/978-3-319-07443-6_60 ◽

2014 ◽

pp. 864-873 ◽

Cited By ~ 3

Author(s):

Kai Barkschat

Keyword(s):

Information Extraction ◽

Semantic Information ◽

Specific Data ◽

Domain Specific

Semi-supervised Wafer Map Pattern Recognition using Domain-Specific Data Augmentation and Contrastive Learning

10.1109/itc50571.2021.00019 ◽

2021 ◽

Author(s):

Hanbin Hu ◽

Chen He ◽

Peng Li

Keyword(s):

Pattern Recognition ◽

Data Augmentation ◽

Specific Data ◽

Domain Specific
