A Graph Database Representation of Portuguese Criminal-Related Documents

Gonçalo Carnaz; Vitor Beires Nogueira; Mário Antunes

doi:10.3390/informatics8020037

Informatics ◽

10.3390/informatics8020037 ◽

2021 ◽

Vol 8 (2) ◽

pp. 37

Author(s):

Gonçalo Carnaz ◽

Vitor Beires Nogueira ◽

Mário Antunes

Keyword(s):

Information Extraction ◽

Name Entity Recognition ◽

Entity Recognition ◽

Automatic Extraction ◽

Graph Database ◽

Named Entities ◽

Vast Number ◽

Name Entity ◽

Manual Analysis ◽

F Measure

Organizations are challenged by the need to process a growing amount of structured and unstructured data retrieved from heterogeneous sources. Criminal investigation police are among these organizations, as they must manually process a vast number of criminal reports, crime-related news articles, occurrence and evidence reports, and other unstructured documents. Automatically extracting and representing the data and knowledge in such documents is essential to reduce the burden of manual analysis and to automate the discovery of names and entity relationships that may exist in a case. This paper presents SEMCrime, a framework that extracts and classifies named entities and relations in Portuguese criminal reports and documents and represents the retrieved data in a graph database. A 5W1H (Who, What, Why, Where, When, and How) information extraction method was applied, and a graph database representation was used to store and visualize the relations extracted from the documents. A prototype developed to evaluate the framework produced promising results: named-entity recognition with an F-measure of 0.73 and 5W1H information extraction with an F-measure of 0.65.
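As a rough illustration of how a 5W1H extraction could be laid out as graph data, the sketch below turns one record into labeled edges and an adjacency map. It is not SEMCrime's implementation: the record values, the `event:1` identifier, and the helper names `to_edges` and `build_graph` are all hypothetical, and plain Python structures stand in for a real graph database.

```python
from collections import defaultdict

# Hypothetical 5W1H record for a single event; every value is invented for illustration.
record = {
    "who": "John Doe",
    "what": "robbery",
    "where": "Lisbon",
    "when": "2020-01-15",
    "why": None,  # slot the extractor could not fill
    "how": "armed with a knife",
}


def to_edges(event_id, rec):
    """Turn one 5W1H record into (source, relation, target) edges, skipping empty slots."""
    return [
        (event_id, slot.upper(), value)
        for slot, value in rec.items()
        if value is not None
    ]


def build_graph(edges):
    """Build an adjacency map: node -> list of (relation, neighbour)."""
    graph = defaultdict(list)
    for src, rel, dst in edges:
        graph[src].append((rel, dst))
    return graph


edges = to_edges("event:1", record)
graph = build_graph(edges)

for rel, dst in graph["event:1"]:
    print(f"event:1 -[{rel}]-> {dst}")
```

In a graph database each edge would typically become a relationship between an event node and an entity node, so that entities shared across documents link their events together.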

ENCADEAr: ENCADEAmento automático de notícias

Oslo Studies in Language ◽

10.5617/osla.1457 ◽

2015 ◽

Vol 7 (1) ◽

Author(s):

Carla Abreu ◽

Jorge Teixeira ◽

Eugénio Oliveira

Keyword(s):

Natural Language Processing ◽

Natural Language ◽

Information Extraction ◽

Supervised Learning ◽

Language Processing ◽

Name Entity Recognition ◽

Entity Recognition ◽

Name Entity ◽

Supervised Learning Algorithms ◽

Processing Information

This work aims to define and evaluate different techniques for automatically building temporal news sequences. The proposed approach consists of three steps: (i) near-duplicate document detection; (ii) keyword extraction; (iii) news sequence creation. It is based on Natural Language Processing, Information Extraction, Name Entity Recognition, and supervised learning algorithms. The proposed methodology achieved a precision of 93.1% for news chain sequence creation.
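The abstract does not say how near-duplicate documents are detected. One common technique, shown here purely as an assumption, compares documents by the Jaccard similarity of their word shingles. The function names, the shingle size `k`, and the `threshold` value are illustrative choices, not taken from the paper.

```python
def shingles(text, k=3):
    """Return the set of k-word shingles of a text (lowercased, whitespace-split)."""
    words = text.lower().split()
    if len(words) < k:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def is_near_duplicate(doc_a, doc_b, threshold=0.8, k=3):
    """Flag two documents as near-duplicates when shingle overlap reaches the threshold."""
    return jaccard(shingles(doc_a, k), shingles(doc_b, k)) >= threshold


a = "the minister announced new measures today in lisbon"
c = "football club wins the national cup final"
print(is_near_duplicate(a, a), is_near_duplicate(a, c))
```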

Developing Name Entity Recognition for Structured and Unstructured Text Formatting Dataset

2020 Fifth International Conference on Informatics and Computing (ICIC) ◽

10.1109/icic50835.2020.9288566 ◽

2020 ◽

Author(s):

Nadhia Salsabila Azzahra ◽

Muhammad Okky Ibrohim ◽

Junaedi Fahmi ◽

Bagus Fajar Apriyanto ◽

Oskar Riandi

Keyword(s):

Name Entity Recognition ◽

Entity Recognition ◽

Unstructured Text ◽

Name Entity

Name Entity Recognition for Malay Texts Using Cross-Lingual Annotation Projection Approach

Computational Science and Its Applications -- ICCSA 2015 - Lecture Notes in Computer Science ◽

10.1007/978-3-319-21404-7_18 ◽

2015 ◽

pp. 242-256 ◽

Cited By ~ 1

Author(s):

Norshuhani Zamin ◽

Zainab Abu Bakar

Keyword(s):

Name Entity Recognition ◽

Entity Recognition ◽

Name Entity ◽

Projection Approach ◽

Cross Lingual

Referent graph embedding model for name entity recognition of Chinese car reviews

Knowledge-Based Systems ◽

10.1016/j.knosys.2021.107558 ◽

2021 ◽

pp. 107558

Author(s):

Zhao Fang ◽

Qiang Zhang ◽

Stanley Kok ◽

Ling Li ◽

Anning Wang ◽

...

Keyword(s):

Graph Embedding ◽

Name Entity Recognition ◽

Entity Recognition ◽

Name Entity

Exploiting the concept level feature for enhanced name entity recognition in Chinese EMRs

The Journal of Supercomputing ◽

10.1007/s11227-019-02917-3 ◽

2019 ◽

Vol 76 (8) ◽

pp. 6399-6420 ◽

Cited By ~ 3

Author(s):

Qing Zhao ◽

Dan Wang ◽

Jianqiang Li ◽

Faheem Akhtar

Keyword(s):

Name Entity Recognition ◽

Entity Recognition ◽

Name Entity

Clinical Name Entity Recognition Based on Recurrent Neural Networks

2018 18th International Conference on Computational Science and Applications (ICCSA) ◽

10.1109/iccsa.2018.8439147 ◽

2018 ◽

Cited By ~ 1

Author(s):

Thoai Man Luu ◽

Robert Phan ◽

Rachel Davey ◽

Girija Chetty

Keyword(s):

Neural Networks ◽

Recurrent Neural Networks ◽

Name Entity Recognition ◽

Entity Recognition ◽

Name Entity

Enhancing Clinical Name Entity Recognition Based on Hybrid Deep Learning Scheme

2019 International Conference on Data Mining Workshops (ICDMW) ◽

10.1109/icdmw.2019.00153 ◽

2019 ◽

Author(s):

Robert Phan ◽

Thoai Luu ◽

Rachel Davey ◽

Girija Chetty

Keyword(s):

Deep Learning ◽

Name Entity Recognition ◽

Entity Recognition ◽

Name Entity ◽

Learning Scheme

A Two-Phase Approach for Stance Classification in Twitter Using Name Entity Recognition and Term Frequency Feature

2019 IEEE/ACIS 18th International Conference on Computer and Information Science (ICIS) ◽

10.1109/icis46139.2019.8940282 ◽

2019 ◽

Author(s):

Yin Min Tun ◽

Phyu Hninn Myint

Keyword(s):

Name Entity Recognition ◽

Entity Recognition ◽

Two Phase ◽

Term Frequency ◽

Phase Approach ◽

Name Entity ◽

Frequency Feature

Accurate Name Entity Recognition for Biomedical Literatures: A Combined High-quality Manual Annotation and Deep-learning Natural Language Processing Study

10.1101/2021.09.15.460567 ◽

2021 ◽

Author(s):

Dao-Ling Huang ◽

Quanlei Zeng ◽

Yun Xiong ◽

Shuixia Liu ◽

Chaoqun Pang ◽

...

Keyword(s):

Deep Learning ◽

Natural Language Processing ◽

Language Processing ◽

Name Entity Recognition ◽

Entity Recognition ◽

Manual Annotation ◽

Gene Variant ◽

High Quality ◽

Name Entity ◽

Entity Annotation

This study combines high-quality manual annotation with deep-learning natural language processing to achieve accurate name entity recognition (NER) for biomedical literature. An in-house set of entity annotation guidelines for biomedical literature was constructed. Our manual annotations show over 92% overall consistency across all four entity types (gene, variant, disease, and species) with the same publicly available corpora previously annotated by other experts. A total of 400 full biomedical articles from PubMed were annotated following these in-house guidelines. Both a BERT-based large model and a DistilBERT-based simplified model were constructed, trained, and optimized for offline and online inference, respectively. The NER F1 scores for gene, variant, disease, and species are 97.28%, 93.52%, 92.54%, and 95.76%, respectively, for the BERT-based model, and 95.14%, 86.26%, 91.37%, and 89.92%, respectively, for the DistilBERT-based model. The DistilBERT-based NER model thus retains 97.8%, 92.2%, 98.7%, and 93.9% of the BERT-based model's F1 scores for gene, variant, disease, and species, respectively. Moreover, both the BERT-based and the DistilBERT-based NER models outperform the state-of-the-art model, BioBERT, indicating the value of training an NER model on biomedical-domain literature jointly with high-quality annotated datasets.
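The retention percentages follow directly from the reported F1 scores: each is the DistilBERT score divided by the BERT score for the same entity type. The short check below reproduces that arithmetic from the abstract's numbers. The dictionary and variable names are illustrative only.

```python
# F1 scores (in percent) as reported in the abstract.
bert = {"gene": 97.28, "variant": 93.52, "disease": 92.54, "species": 95.76}
distilbert = {"gene": 95.14, "variant": 86.26, "disease": 91.37, "species": 89.92}

# Share of the BERT-based F1 score retained by the DistilBERT-based model, per entity type.
retention = {
    entity: round(100 * distilbert[entity] / bert[entity], 1)
    for entity in bert
}

for entity, pct in retention.items():
    print(f"{entity}: {pct}%")
```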

Person name entity recognition for Arabic

10.3115/1654576.1654581 ◽

2007 ◽

Cited By ~ 20

Author(s):

Khaled Shaalan ◽

Hafsa Raza

Keyword(s):

Name Entity Recognition ◽

Entity Recognition ◽

Name Entity
