Decontextualizing Big Data for a Better Perception of Macroecology

2016 ◽  
Author(s):  
Christian Mulder ◽  
Giorgi Mancinelli

ABSTRACT Fish species are charismatic subjects widely used for ecological assessment and modelling. We investigated the influence of electrofishing to determine the extent to which datasets can be merged. Three regions in Ohio, in the American Midwest, were chosen as the study area, and the changes in the size-biomass spectra of more than 2000 fish assemblages were analysed. These communities behaved differently according to the sampling method in conjunction with the morphology of the investigated streams and rivers, as shown by a decrease in predatory species and a lowering of allometric slopes. Several lines of evidence indicate that the chosen sampling method, as determined by different habitats, acts as a pitfall and strongly influences the allometry of fish spectra. In the ongoing data-rich era, our results highlight that macroecological investigations, often performed by machine-learning systems without considering the different procedures adopted to collect the original data, can easily produce artefactual allometric scalings.
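
To make the notion of an allometric slope concrete, the following minimal sketch fits an ordinary least-squares line to log-transformed abundance against log-transformed body mass and compares two hypothetical sampling methods with their pooled data. The function name `allometric_slope` and all numbers are illustrative assumptions, not data or methods taken from the study.

```python
import math


def allometric_slope(masses, abundances):
    """Ordinary least-squares slope of log10(abundance) against log10(mass)."""
    xs = [math.log10(m) for m in masses]
    ys = [math.log10(a) for a in abundances]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


# Synthetic, illustrative values only (not data from the study).
masses = [1.0, 10.0, 100.0, 1000.0]
method_a = [1000.0, 100.0, 10.0, 1.0]  # exact power law with exponent -1
method_b = [1000.0, 100.0, 10.0, 0.1]  # largest size class under-sampled

slope_a = allometric_slope(masses, method_a)
slope_b = allometric_slope(masses, method_b)
# Pooling both methods yields a slope that matches neither dataset alone.
slope_pooled = allometric_slope(masses + masses, method_a + method_b)
```

In this toy case the pooled slope falls between the two method-specific slopes, which is one simple way a merged dataset can report a scaling that no single sampling procedure actually produced.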

2022 ◽  
pp. 1663-1702
Author(s):  
Ebru Aydindag Bayrak ◽  
Pinar Kirci

Intelligent big data analytics and machine learning systems have been introduced to support the early diagnosis of neurological disorders. A number of scholarly studies on intelligent big data analytics and machine learning systems used in healthcare are discussed. The authors define big data, give examples of big data, and explain big data analytics. The main goal, however, is to help researchers and specialists form an opinion on diagnosing or predicting neurological disorders using intelligent big data analytics and machine learning. They therefore focus in particular on healthcare systems that use these innovative methods. Platforms and tools for big data analytics in healthcare are examined, and numerous academic studies on the detection of neurological disorders using both machine learning methods and big data analytics are reviewed.


Author(s):  
Cate Dowd

Semantic news tags processed via cloud servers sit among big data and machine learning systems. The latter may have influenced Murdoch’s acquisition of a ‘social media news agency’, and other partnerships, as a mix of new roles across journalism, analytics, and search emerged. Some editing roles in journalism focus on SEO, but Murdoch’s Storyful, which started as a verification business, created jobs for cloud operations engineers, viral video editors, and trends editors. Data-mining techniques were a lure for news and social media partnerships circa 2013–2016. In the name of verification, access to big data was matched by social media gaining credibility, evident in Facebook Newswire and other journalism projects. Deep learning methods in search, referrals, and automated tagging have also produced mutual benefits, mostly via third-party agreements. However, neither data sharing for political ends, which targets particular users, nor verification projects have stopped fake news.


2021 ◽  
Vol 9 (12) ◽  
pp. 1351
Author(s):  
Zhi Yung Tay ◽  
Januwar Hadi ◽  
Favian Chow ◽  
De Jin Loh ◽  
Dimitrios Konovessis

Greenhouse gas emitted from shipping activities is one of the factors contributing to global warming; thus, there is an urgent need to mitigate the adverse effects of climate change. One of the key strategies is to build a vibrant maritime industry through the use of innovation, digital technologies and intelligent systems. The digitization of the shipping industry not only provides a competitive edge to the shipping business model but also enhances ship operational and energy efficiency. This review paper focuses on big data analytics and machine learning applied to harbour craft vessels with the aim of achieving fuel efficiency. The paper reviews the telemetry system required for the digitalization of harbour craft vessels, the challenges of installing it, and the vessel monitoring and data transmission system. The commonly used methods for data cleaning are also presented. Finally, the paper considers two types of machine learning systems, i.e., supervised and unsupervised machine learning systems. The multi-linear regression and hidden Markov model for supervised machine learning and the artificial neural network, grey box model and long short-term memory model for unsupervised machine learning are discussed, and their pros and cons are presented.


2018 ◽  
Vol 46 (3) ◽  
pp. 147-160 ◽  
Author(s):  
Laouni Djafri ◽  
Djamel Amar Bensaber ◽  
Reda Adjoudj

Purpose This paper aims to solve the problems of big data analytics for prediction, including volume, veracity and velocity, by improving the prediction result to an acceptable level in the shortest possible time. Design/methodology/approach This paper is divided into two parts. The first part aims to improve the result of the prediction. In this part, two ideas are proposed: the double pruning enhanced random forest algorithm, and extracting a shared learning base with the stratified random sampling method to obtain a learning base that is representative of all the original data. The second part proposes a distributed architecture supported by new technology solutions, which works in a coherent and efficient way with the sampling strategy under the supervision of the Map-Reduce algorithm. Findings The representative learning base, obtained by integrating two learning bases (the partial base and the shared base), represents the original data set very well and gives very good results in Big Data predictive analytics. Furthermore, these results were supported by the improved random forests supervised learning method, which played a key role in this context. Originality/value All companies are concerned, especially those that hold large amounts of information and want to screen it to improve their knowledge of the customer and optimize their campaigns.
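
The stratified random sampling step mentioned above can be sketched as follows. This is a minimal illustration using only the Python standard library; the function `stratified_sample`, its parameters and the toy records are assumptions made for the example and are not the authors' implementation.

```python
import random
from collections import defaultdict


def stratified_sample(records, label_of, fraction, seed=0):
    """Draw the same fraction from every class so the sample keeps class proportions."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    strata = defaultdict(list)
    for record in records:
        strata[label_of(record)].append(record)
    sample = []
    for label in sorted(strata):
        members = strata[label]
        # Keep at least one record per class so rare classes are not lost.
        k = max(1, round(len(members) * fraction))
        sample.extend(rng.sample(members, k))
    return sample


# Toy data set: 90 records of class "a" and 10 of class "b".
records = [("a", i) for i in range(90)] + [("b", i) for i in range(10)]
sample = stratified_sample(records, label_of=lambda r: r[0], fraction=0.1)
```

Sampling within each class separately is what keeps a small learning base representative: a plain random draw of ten records could easily miss the minority class altogether.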


2021 ◽  
Vol ahead-of-print (ahead-of-print) ◽  
Author(s):  
Laouni Djafri

Purpose This work can be used as a building block in other settings such as GPU, Map-Reduce, Spark or any other. Also, DDPML can be deployed on other distributed systems such as P2P networks, clusters, cloud computing or other technologies. Design/methodology/approach In the age of Big Data, all companies want to benefit from large amounts of data. These data can help them understand their internal and external environment and anticipate associated phenomena, as the data turn into knowledge that can be used for prediction later. Thus, this knowledge becomes a great asset in companies' hands. This is precisely the objective of data mining. But with data and knowledge being produced in large amounts and at a faster pace, one now speaks of Big Data mining. For this reason, the authors' proposed work mainly aims at solving the problems of volume, veracity, validity and velocity when classifying Big Data using distributed and parallel processing techniques. The problem raised in this work is how to make machine learning algorithms work in a distributed and parallel way at the same time without losing the accuracy of the classification results. To solve this problem, the authors propose a system called Dynamic Distributed and Parallel Machine Learning (DDPML) algorithms. To build it, the authors divided their work into two parts. In the first, the authors propose a distributed architecture that is controlled by the Map-Reduce algorithm, which in turn depends on a random sampling technique. The distributed architecture is designed specifically to handle big data processing and operates in a coherent and efficient manner with the sampling strategy proposed in this work. This architecture also helps the authors to verify the classification results obtained using the representative learning base (RLB). In the second part, the authors extract the representative learning base by sampling at two levels using the stratified random sampling method. This sampling method is also applied to extract the shared learning base (SLB), the partial learning base for the first level (PLBL1) and the partial learning base for the second level (PLBL2). The experimental results show the efficiency of the proposed solution, without significant loss in the quality of the classification results. Thus, in practical terms, the DDPML system is generally dedicated to big data mining processing, and works effectively in distributed systems with a simple structure, such as client-server networks. Findings The authors obtained very satisfactory classification results. Originality/value The DDPML system is specially designed to smoothly handle big data mining classification.
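
The Map-Reduce pattern that the abstract relies on can be illustrated with a small, single-process sketch: each partition is mapped to a local summary, and the summaries are then reduced into one global result. The names `map_partition` and `reduce_counts` and the toy partitions are hypothetical; this shows the general pattern only and is not the DDPML system.

```python
from collections import Counter
from functools import reduce


def map_partition(partition):
    """Map step: summarise one partition as per-class record counts."""
    return Counter(label for label, _ in partition)


def reduce_counts(left, right):
    """Reduce step: merge two partial summaries into one."""
    return left + right


# Three toy partitions, standing in for data held on separate nodes.
partitions = [
    [("spam", 1), ("ham", 2)],
    [("ham", 3), ("ham", 4)],
    [("spam", 5)],
]

class_totals = reduce(reduce_counts, map(map_partition, partitions), Counter())
```

Global per-class totals of this kind are the sort of information a stratified sampler would need in order to size each stratum before drawing records from the individual partitions.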


2018 ◽  
Vol 12 ◽  
pp. 85-98
Author(s):  
Bojan Kostadinov ◽  
Mile Jovanov ◽  
Emil STANKOV

Data collection and machine learning are changing the world. Whether it is medicine, sports or education, companies and institutions are investing a lot of time and money in systems that gather, process and analyse data. Likewise, to improve competitiveness, many countries are changing their educational policy to support STEM disciplines. Therefore, it’s important to put effort into using various data sources to help students succeed in STEM. In this paper, we present a platform that can analyse students’ activity on various contest and e-learning systems, combine and process the data, and then present it in various ways that are easy to understand. This in turn enables teachers and organizers to recognize talented and hardworking students, identify issues, and/or motivate students to practice and work on areas where they’re weaker.

