scholarly journals Four lectures on probabilistic methods for data science

Author(s):  
Roman Vershynin
Author(s):  
Greg Lawrance ◽  
Raphael Parra Hernandez ◽  
Khalegh Mamakani ◽  
Suraiya Khan ◽  
Brent Hills ◽  
...  

IntroductionLigo is an open source application that provides a framework for managing and executing administrative data linking projects. Ligo provides an easy-to-use web interface that lets analysts select among data linking methods including deterministic, probabilistic and machine learning approaches and use these in a documented, repeatable, tested, step-by-step process. Objectives and ApproachThe linking application has two primary functions: identifying common entities in datasets [de-duplication] and identifying common entities between datasets [linking]. The application is being built from the ground up in a partnership between the Province of British Columbia’s Data Innovation (DI) Program and Population Data BC, and with input from data scientists. The simple web interface allows analysts to streamline the processing of multiple datasets in a straight-forward and reproducible manner. ResultsBuilt in Python and implemented as a desktop-capable and cloud-deployable containerized application, Ligo includes many of the latest data-linking comparison algorithms with a plugin architecture that supports the simple addition of new formulae. Currently, deterministic approaches to linking have been implemented and probabilistic methods are in alpha testing. A fully functional alpha, including deterministic and probabilistic methods is expected to be ready in September, with a machine learning extension expected soon after. Conclusion/ImplicationsLigo has been designed with enterprise users in mind. The application is intended to make the processes of data de-duplication and linking simple, fast and reproducible. By making the application open source, we encourage feedback and collaboration from across the population research and data science community.


Author(s):  
Charles Bouveyron ◽  
Gilles Celeux ◽  
T. Brendan Murphy ◽  
Adrian E. Raftery

Author(s):  
Shaveta Bhatia

 The epoch of the big data presents many opportunities for the development in the range of data science, biomedical research cyber security, and cloud computing. Nowadays the big data gained popularity.  It also invites many provocations and upshot in the security and privacy of the big data. There are various type of threats, attacks such as leakage of data, the third party tries to access, viruses and vulnerability that stand against the security of the big data. This paper will discuss about the security threats and their approximate method in the field of biomedical research, cyber security and cloud computing.


Author(s):  
Natalia V. Vysotskaya ◽  
T. V. Kyrbatskaya

The article is devoted to the consideration of the main directions of digital transformation of the transport industry in Russia. It is proposed in the process of digital transformation to integrate the community approach into the company's business model using blockchain technology and methods and results of data science; complement the new digital culture with a digital team and new communities that help management solve business problems; focus the attention of the company's management on its employees and develop those competencies in them that robots and artificial intelligence systems cannot implement: develop algorithmic, computable and non-linear thinking in all employees of the company.


2019 ◽  
Vol 5 (30) ◽  
pp. 960-968
Author(s):  
Güner Gözde KILIÇ
Keyword(s):  

2020 ◽  
Author(s):  
Saeed Nosratabadi ◽  
Amir Mosavi ◽  
Puhong Duan ◽  
Pedram Ghamisi ◽  
Ferdinand Filip ◽  
...  

This paper provides a state-of-the-art investigation of advances in data science in emerging economic applications. The analysis was performed on novel data science methods in four individual classes of deep learning models, hybrid deep learning models, hybrid machine learning, and ensemble models. Application domains include a wide and diverse range of economics research from the stock market, marketing, and e-commerce to corporate banking and cryptocurrency. Prisma method, a systematic literature review methodology, was used to ensure the quality of the survey. The findings reveal that the trends follow the advancement of hybrid models, which, based on the accuracy metric, outperform other learning algorithms. It is further expected that the trends will converge toward the advancements of sophisticated hybrid deep learning models.


Sign in / Sign up

Export Citation Format

Share Document