Assessing the reliability of automatic sentiment analysis tools on rating the sentiment of reviews of NHS dental practices in England

Byrne, Matthew and O’Malley, Lucy and Glenny, Anne-Marie and Pretty, Iain and Tickle, Martin and Consoli, Sergio (2021) Assessing the reliability of automatic sentiment analysis tools on rating the sentiment of reviews of NHS dental practices in England. PLOS ONE, 16 (12). e0259797. ISSN 1932-6203


Abstract

Background
Online reviews may act as a rich source of data to assess the quality of dental practices. Assessing the content and sentiment of reviews on a large scale is time-consuming and expensive. Automating the assignment of sentiment to big data samples of reviews may allow reviews to be used as Patient Reported Experience Measures for primary care dentistry.

Aim
To assess the reliability of three different online sentiment analysis tools (Amazon Comprehend DetectSentiment API (ACDAPI), Google and Monkeylearn) at assessing the sentiment of reviews of dental practices working on National Health Service contracts in the United Kingdom.

Methods
A Python 3 script was used to mine 15800 reviews from 4803 unique dental practices on the NHS.uk websites between April 2018 and March 2019. A random sample of 270 reviews was rated by the three sentiment analysis tools. These reviews were also rated by 3 blinded independent human reviewers and a pooled sentiment score was assigned. Kappa statistics and polychoric evaluation were used to assess the level of agreement. Disagreements between the automated and human reviewers were qualitatively assessed.
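
For readers unfamiliar with the kappa statistic mentioned above, the following is a minimal sketch of unweighted Cohen's kappa for two raters. It is illustrative only: the abstract does not state which kappa variant the authors used, and the function name `cohen_kappa` and the toy labels are hypothetical, not taken from the study.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two equal-length sequences of labels.

    Assumption: nominal categories and no weighting; the study may have
    used a different (e.g. weighted) variant.
    """
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("need two non-empty sequences of equal length")
    n = len(rater_a)
    # Observed agreement: share of items given the same label by both raters.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of each rater's marginal label frequencies.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1.0:
        # Both raters used one identical label throughout; treat as full agreement.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Hypothetical toy data, not drawn from the 270 sampled reviews.
human = ["pos", "pos", "neg", "neg", "neu", "pos"]
tool = ["pos", "neg", "neg", "neg", "neu", "pos"]
kappa = cohen_kappa(human, tool)
print(round(kappa, 3))
```

Kappa corrects raw percentage agreement for the agreement expected by chance, which is why it is preferred over simple accuracy when one sentiment class dominates the sample.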

Results
There was good agreement between the sentiment assigned to reviews by the human reviewers and ACDAPI (k = 0.660). Google (k = 0.706) and Monkeylearn (k = 0.728) showed slightly better agreement at the expense of usability on a massive dataset. There were 33 disagreements in rating between ACDAPI and human reviewers, of which n = 16 were due to syntax errors, n = 10 were due to misappropriation of the strength of conflicting emotions and n = 7 were due to a lack of overtly emotive language in the text.

Conclusions
There is good agreement between the sentiment of an online review assigned by a group of humans and by cloud-based sentiment analysis. This may allow the use of automated sentiment analysis for quality assessment of dental service provision in the NHS.

Item Type: Article
Subjects: Impact Archive > Social Sciences and Humanities
Depositing User: Managing Editor
Date Deposited: 13 Mar 2023 05:57
Last Modified: 06 May 2024 06:02
URI: http://research.sdpublishers.net/id/eprint/795
