Title

Sentiment analysis of extremism in social media from textual information

Date of Publication

2020

Security Theme

Violent Extremism

Keywords

Extremism, Multilingual, Lexicons, Multinomial Naïve Bayes, Linear Support Vector Classifier, Social Media, Languages

Description

Uncertainty about political, religious, and social issues causes extremism among people, which is reflected in the sentiments they express on social media. Although English is the most common language used to share views on social media, locals also use regional languages. Views expressed in such languages therefore need to be incorporated alongside those in widely used languages to reveal better insights from the data. This research focuses on sentiment analysis of multilingual textual data from social media to discover the intensity of extremist sentiment. Our study classifies each textual view into one of four categories (high extreme, low extreme, moderate, and neutral) based on its level of extremism. First, a multilingual lexicon with intensity weights is created. This lexicon is validated by domain experts and attains 88% accuracy in validation. Subsequently, Multinomial Naïve Bayes and Linear Support Vector Classifier algorithms are employed for classification. Overall, on the underlying multilingual dataset, the Linear Support Vector Classifier performs better, with an accuracy of 82%.
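
As a rough illustration of how an intensity-weighted lexicon could map a post to one of the four categories, the sketch below scores a text by the strongest lexicon term it contains. It is not the authors' implementation: the lexicon entries, the weights, the cut-off values, and the function name classify_intensity are all hypothetical placeholders, and the description does not say how the lexicon is combined with the classifiers.

```python
import re
from typing import Dict

# Hypothetical lexicon: term -> intensity weight in [0, 1].
# Placeholder entries only; the study's actual multilingual lexicon is not reproduced here.
LEXICON: Dict[str, float] = {
    "attack": 0.9,
    "destroy": 0.8,
    "hate": 0.6,
    "protest": 0.3,
}

# Hypothetical cut-offs separating the categories (assumed, not taken from the study).
HIGH_CUTOFF = 0.75
LOW_CUTOFF = 0.5


def classify_intensity(text: str, lexicon: Dict[str, float] = LEXICON) -> str:
    """Label a text by the highest-weighted lexicon term it contains."""
    # \w+ matches Unicode word characters, so non-Latin scripts are tokenized too.
    tokens = re.findall(r"\w+", text.lower())
    weights = [lexicon[t] for t in tokens if t in lexicon]
    if not weights:
        return "neutral"  # no lexicon term found
    score = max(weights)
    if score >= HIGH_CUTOFF:
        return "high extreme"
    if score >= LOW_CUTOFF:
        return "low extreme"
    return "moderate"


if __name__ == "__main__":
    for post in ["They will attack tomorrow", "I hate this", "A peaceful protest", "Nice weather"]:
        print(post, "->", classify_intensity(post))
```

Labels produced this way could serve as training targets for a supervised classifier such as Multinomial Naïve Bayes or a Linear Support Vector Classifier, though that pairing is an assumption here.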
