Thursday, 10 May 2018

Extracting cancer mortality statistics from death certificates: A hybrid machine learning and rule-based approach for common and rare cancers

Publication date: Available online 10 May 2018
Source: Artificial Intelligence in Medicine
Author(s): Bevan Koopman, Guido Zuccon, Anthony Nguyen, Anton Bergheim, Narelle Grayson
Objective
Death certificates are an invaluable source of cancer mortality statistics. However, this value can only be realised if accurate, quantitative data can be extracted from certificates, an aim hampered by both the volume and the variable quality of certificates written in natural language. This paper proposes an automatic classification system for identifying all cancer-related causes of death from death certificates.

Methods
Detailed features, including terms, n-grams and SNOMED CT concepts, were extracted from a collection of 447,336 death certificates. The features were used as input to two different classification sub-systems: a machine learning sub-system using Support Vector Machines (SVMs) and a rule-based sub-system. A fusion sub-system then combined the results from the SVMs and the rules into a single final classification. A held-out test set was used to evaluate the effectiveness of the classifiers according to precision, recall and F-measure.

Results
The system was highly effective at determining the type of cancer for both common cancers (F-measure of 0.85) and rare cancers (F-measure of 0.7). In general, rules outperformed SVMs; however, the fusion method that combined the two was the most effective.

Conclusion
The system proposed in this study provides automatic identification and characterisation of cancers from large collections of free-text death certificates. This allows organisations such as Cancer Registries to monitor and report on cancer mortality in a timely and accurate manner. In addition, the methods and findings are generally applicable beyond cancer classification and to other sources of medical text besides death certificates.
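The pipeline described in the abstract (n-gram features, a rule-based classifier, fusion with a machine learning prediction, and evaluation by precision, recall and F-measure) can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the abstract does not give the paper's rules, its SVM configuration, or its fusion policy. The `RULES` table, the function names, and the "rules win when they fire" fusion are hypothetical, and the machine learning label is passed in as a plain value rather than produced by a trained SVM.

```python
from typing import List, Optional, Sequence, Tuple


def ngrams(text: str, n: int) -> List[str]:
    """Return word n-grams from lower-cased, whitespace-tokenised text."""
    tokens = text.lower().split()
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


# Hypothetical keyword rules (n-gram -> cancer label); not taken from the paper.
RULES = {
    "lung cancer": "lung",
    "carcinoma of lung": "lung",
    "breast cancer": "breast",
}


def rule_classify(text: str) -> Optional[str]:
    """Return the label of the first matching rule, or None if no rule fires."""
    for gram in ngrams(text, 2) + ngrams(text, 3):
        if gram in RULES:
            return RULES[gram]
    return None


def fuse(rule_label: Optional[str], ml_label: Optional[str]) -> Optional[str]:
    """Assumed fusion policy: prefer the rule label, else fall back to the ML label."""
    return rule_label if rule_label is not None else ml_label


def precision_recall_f1(
    gold: Sequence[Optional[str]],
    predicted: Sequence[Optional[str]],
    label: str,
) -> Tuple[float, float, float]:
    """Compute precision, recall and F-measure (F1) for a single label."""
    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g == label and p == label)
    fp = sum(1 for g, p in pairs if g != label and p == label)
    fn = sum(1 for g, p in pairs if g == label and p != label)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1
```

In a fuller version, the label passed to `fuse` as `ml_label` would come from an SVM trained on the extracted features, and the per-label scores would be computed on a held-out test set, as the abstract describes.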



https://ift.tt/2KSrPHh
