TagNN: A Code Tag Generation Technology for Resource Retrieval from Open-Source Big Data

2021, Vol 2021, pp. 1-11
Author(s): Lingbin Zeng, Xin Guo, Cheng Yang, Yao Lu, Xiao Li

With the vigorous development of open-source software, a huge number of open-source projects and code snippets have accumulated in open-source big data, which contains a wealth of code resources. However, effectively and efficiently retrieving relevant code snippets from such a large amount of open-source big data is extremely difficult, as there are usually large gaps between a user's natural-language description and the open-source code snippets. In this paper, we propose a novel code tag generation and code retrieval approach named TagNN, which combines empirical software engineering knowledge with a deep learning algorithm. The experimental results show that our method performs well on both code tag generation and code snippet retrieval.
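The abstract does not describe TagNN's internals, but the general idea of tag-based code retrieval can be illustrated with a minimal sketch. All names, tags, and snippets below are hypothetical, and a simple term-overlap score stands in for the deep learning model that TagNN uses to generate tags:

```python
# Minimal sketch of tag-based code retrieval: each snippet carries a
# set of generated tags, and a natural-language query is matched
# against snippets by tag overlap. Data and names are hypothetical.

def score(query_terms, tags):
    """Fraction of query terms covered by a snippet's tag set."""
    q = set(t.lower() for t in query_terms)
    return len(q & set(tags)) / len(q)

snippets = {
    "read_file.py": ["file", "read", "io"],
    "quick_sort.py": ["sort", "array", "recursion"],
}

def retrieve(query):
    """Return the snippet whose tags best cover the query."""
    terms = query.lower().split()
    return max(snippets, key=lambda s: score(terms, snippets[s]))

print(retrieve("sort an array"))  # → quick_sort.py
```

In the paper's setting, the quality of retrieval hinges on how well the generated tags bridge the gap between natural language and code, which is where the learned model replaces this toy overlap score.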

2021, Vol 2021, pp. 1-13
Author(s): Qian Huang, Xue Wen Li

Big data is a massive and diverse form of unstructured data, which needs proper analysis and management. It is another great technological revolution after the Internet, the Internet of Things, and cloud computing. This paper first studies the related concepts and basic theories as the starting point of the research. Second, it analyzes in depth the problems and challenges faced by Chinese government management under the impact of big data. It then explores the opportunities that big data brings to government management in terms of management efficiency, administrative capacity, and public services, arguing that governments should seize these opportunities to make changes. Brain-like computing attempts to simulate the structure and information-processing mechanisms of biological neural networks. The paper analyzes the development status of e-government at home and abroad, studies service-oriented architecture (SOA) and web services technology, examines e-government and SOA theory in depth, and discusses them in light of the development status of e-government in a particular region. Finally, a deep learning algorithm is used to construct a monitoring platform that monitors government behavior in real time and performs in-depth mining to analyze the government's behavioral intentions.


2021
Author(s): Fabian Kovacs, Max Thonagel, Marion Ludwig, Alexander Albrecht, Manuel Hegner, ...

BACKGROUND Big data in healthcare must be exploited to achieve a substantial increase in efficiency and competitiveness. The analysis of patient-related data in particular holds huge potential to improve decision-making processes. However, most analytical approaches used today are highly time- and resource-consuming. OBJECTIVE The presented software solution Conquery is an open-source software tool providing advanced but intuitive data analysis without the need for specialized statistical training. Conquery aims to simplify big data analysis for novice database users in the medical sector. METHODS Conquery is a document-oriented, distributed time-series database and analysis platform. Its main application is the analysis of per-person medical records by non-technical medical professionals. Complex analyses are composed in the Conquery frontend by dragging tree nodes into the query editor. Queries are evaluated by a bespoke distributed query engine for medical records in a column-oriented fashion. We present a custom compression scheme that facilitates low response times using both online-calculated and precomputed metadata and data statistics. RESULTS Conquery allows for easy navigation through the hierarchy and enables complex study cohort construction whilst reducing the demand on time and resources. The UI of Conquery and a query output are exemplified by the construction of a relevant clinical cohort. CONCLUSIONS Conquery is an efficient and intuitive open-source software tool for performant and secure data analysis and aims to support decision-making processes in the healthcare sector.
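The column-oriented evaluation style the abstract describes can be sketched in a few lines. This is a hypothetical illustration, not Conquery's actual engine: each attribute is stored as its own column, and a cohort query touches only the columns it needs. The diagnosis codes and thresholds below are made up:

```python
# Hypothetical column-oriented cohort filter: attributes live in
# separate columns, and a query scans only the columns it references.
# Data is synthetic; codes resemble ICD-10 but are illustrative only.

patient_id = [1, 2, 3, 4]
age        = [34, 71, 68, 25]
diagnosis  = ["I10", "E11", "I10", "J45"]

def cohort(min_age, code):
    """Patient IDs of those at least min_age with the given diagnosis."""
    return [pid for pid, a, d in zip(patient_id, age, diagnosis)
            if a >= min_age and d == code]

print(cohort(60, "I10"))  # → [3]
```

Storing each attribute contiguously is what makes such per-column scans cache-friendly and compressible, which is the motivation behind the compression scheme the abstract mentions.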


Author(s): Utku Köse

Using open source software in e-learning applications is one of the most popular ways of improving the effectiveness of e-learning-based processes without incurring additional costs, while also allowing the software to be modified according to needs. It is therefore important to understand what is needed when using an e-learning-oriented open source software system and how to work with its source code. At this point, a good option is to add features and functions that make the open source software more intelligent and practical, improving both teaching and learning experiences during e-learning processes. In this context, the objective of this chapter is to discuss some possible applications of artificial intelligence for including optimization processes within open source software systems used in e-learning activities. In detail, the chapter focuses on using swarm intelligence and machine learning techniques for this aim and presents some theoretical views on improving the effectiveness of such software for a better e-learning experience.
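As a concrete taste of the swarm intelligence techniques the chapter discusses, here is a minimal particle swarm optimization (PSO) sketch. The objective function, swarm size, and coefficients are all illustrative choices, not anything prescribed by the chapter:

```python
import random

# Minimal particle swarm optimization sketch: a swarm of candidate
# solutions moves through the search space, each particle pulled toward
# its own best position and the swarm-wide best. Here it minimizes
# f(x) = x**2 in one dimension; all parameters are illustrative.

def pso(f, n_particles=20, iters=100, lo=-10.0, hi=10.0):
    random.seed(0)
    pos = [random.uniform(lo, hi) for _ in range(n_particles)]
    vel = [0.0] * n_particles
    best = pos[:]              # each particle's best-seen position
    gbest = min(pos, key=f)    # swarm-wide best position
    for _ in range(iters):
        for i in range(n_particles):
            r1, r2 = random.random(), random.random()
            vel[i] = (0.7 * vel[i]                      # inertia
                      + 1.5 * r1 * (best[i] - pos[i])   # cognitive pull
                      + 1.5 * r2 * (gbest - pos[i]))    # social pull
            pos[i] += vel[i]
            if f(pos[i]) < f(best[i]):
                best[i] = pos[i]
            if f(pos[i]) < f(gbest):
                gbest = pos[i]
    return gbest

print(pso(lambda x: x * x))  # a value near 0
```

In the e-learning context the chapter envisions, the objective function would instead score something like a content-sequencing or resource-allocation plan.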


Author(s): Abhishek Dubey

The term 'Big Data' describes innovative methods and technologies to capture, store, distribute, manage, and analyze petabyte-sized or larger sets of data arriving at high speed and in varied structures. Big data can be structured, unstructured, or semi-structured, rendering conventional data management techniques inadequate. Data is generated from many different sources and can arrive in the system at different rates. In order to handle this volume of data in an economical and efficient way, parallelism is used. Big Data is data whose scale, diversity, and complexity require new architectures, techniques, algorithms, and analytics to manage it and extract value and hidden knowledge from it. Hadoop is the core platform for structuring Big Data, and solves the problem of making it useful for analytics. Hadoop is an open source software project that enables the distributed processing of large data sets across clusters of commodity servers. It is designed to scale up from a single server to thousands of machines, with a high degree of fault tolerance.
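Hadoop's parallelism rests on the MapReduce model: map emits (key, value) pairs, a shuffle groups them by key, and reduce aggregates each group. The sketch below runs the three phases in memory on a toy word count; it illustrates the model only, not the Hadoop API, which distributes these same phases across a cluster:

```python
from collections import defaultdict

# In-memory sketch of the MapReduce model Hadoop distributes across a
# cluster: map emits (key, value) pairs, shuffle groups them by key,
# and reduce aggregates each group. Documents here are synthetic.

def map_phase(documents):
    for doc in documents:
        for word in doc.split():
            yield word, 1

def shuffle(pairs):
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    return {key: sum(values) for key, values in groups.items()}

docs = ["big data big clusters", "data clusters"]
counts = reduce_phase(shuffle(map_phase(docs)))
print(counts["big"], counts["data"])  # → 2 2
```

Because map calls are independent and reduce works per key, both phases can be spread across many commodity machines, which is exactly the scaling property the abstract attributes to Hadoop.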


Author(s): Fawziya M. Rammo, Mohammed N. Al-Hamdani

Many language identification (LID) systems rely on language models that use machine learning (ML) approaches, and such systems typically require rather long recording periods to achieve satisfactory accuracy. This study aims to extract enough information from short recording intervals to successfully classify the spoken languages under test. The classification process is based on frames of 2-18 seconds, whereas most previous LID systems were based on much longer time frames (from 3 seconds to 2 minutes). This research defined and implemented many low-level features using MFCC (Mel-frequency cepstral coefficients). The speech files, in five languages (English, French, German, Italian, Spanish), come from voxforge.org, an open-source corpus of user-submitted audio clips in various languages. A CNN (Convolutional Neural Network) algorithm was applied for classification, and the results were excellent: binary language classification had an accuracy of 100%, and five-language classification had an accuracy of 99.8%.
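The shape of such a pipeline (slice the recording into frames, compute a per-frame feature, classify against per-language references) can be shown with a heavily simplified stand-in. The paper's system uses MFCC features and a CNN; this toy version substitutes frame energy and a nearest-centroid rule on synthetic signals, purely to show the pipeline's structure:

```python
import math

# Simplified stand-in for an LID pipeline: split a signal into
# fixed-length frames, compute a per-frame feature, and assign the
# language whose reference value is closest. The real system uses
# MFCC features and a CNN; this toy uses frame energy and synthetic
# sine-wave "recordings" only.

def frames(signal, frame_len):
    return [signal[i:i + frame_len]
            for i in range(0, len(signal) - frame_len + 1, frame_len)]

def energy(frame):
    return sum(x * x for x in frame) / len(frame)

def classify(signal, centroids, frame_len=100):
    feats = [energy(f) for f in frames(signal, frame_len)]
    mean = sum(feats) / len(feats)
    return min(centroids, key=lambda lang: abs(centroids[lang] - mean))

# Synthetic "recordings": sine waves of different amplitudes.
loud  = [2.0 * math.sin(0.1 * t) for t in range(1000)]
quiet = [0.5 * math.sin(0.1 * t) for t in range(1000)]
centroids = {"lang_a": energy(loud), "lang_b": energy(quiet)}

print(classify(loud, centroids))  # → lang_a
```

The paper's contribution is that short frames (2-18 s) of MFCC features carry enough information for a CNN to separate the languages; the framing step above is the part that determines how short those intervals can be.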


Author(s): Andrew McCullum

In 2015, Central Asia made some vital improvements in the environment for cross-border e-commerce: Kazakhstan's accession to the World Trade Organization (WTO) will help trade transparency, while the Kyrgyz Republic's membership in the Eurasian Customs Union expands its consumer base. Why e-commerce? Two reasons. First, e-commerce reduces the cost of distance. Central Asia is the highest trade-cost region in the world: vast distances from major markets make finding buyers challenging, shipping goods slow, and export costs high. Second, e-commerce can attract populations that are traditionally under-represented in export markets, such as women, small businesses, and rural entrepreneurs.


Author(s): Richard S. Segall

This chapter discusses what open source software is, its relationship to Big Data, how it differs from other types of software, and its software development cycle. Open source software (OSS) is a type of computer software whose source code is released under a license in which the copyright holder grants users the rights to study, change, and distribute the software to anyone and for any purpose. Big Data consists of data sets so voluminous and complex that traditional data processing application software is inadequate to deal with them. Big data can be discrete or continuous streaming data and is accessible using many types of computing devices, ranging from supercomputers and personal workstations to mobile devices and tablets. The chapter discusses how fog computing can be combined with cloud computing for visualization of Big Data, and also presents a summary of additional web-based Big Data visualization software.

