Blog about Programming Languages & Coding

Contents for Computer Science, IT, B.Sc. CS & IT, M.Sc. CS & IT, MCA, BE CS & IT, ME CS & IT , Interview Questions, Books and Online Course Recommendations from Udemy, Coursera, etc

Twitter Hashtag Sentiment Analysis


Twitter is what's happening and what people are talking about right now. It is a global platform for public self-expression and conversation in real time. It provides a network that connects users to people, information, ideas, opinions, and news.

This project provides functions to analyze tweets and sort them by the sentiment of each tweet. We can search for any Twitter hashtag and fetch any number of tweets, which gives an overview of people’s reactions to many topics and events. It can also help data analysts extract information from a particular hashtag.

The project is written in Python.


Introduction:

Sentiment analysis is a natural language processing technique that involves the use of algorithms and machine learning models to identify, extract, and classify subjective information from text, such as opinions, attitudes, emotions, and feelings. It is used to analyze the sentiment or tone of a piece of text, typically by categorizing it as positive, negative, or neutral.

Sentiment analysis is commonly used in various applications, including customer feedback analysis, brand reputation management, social media monitoring, political analysis and market research.

Overall, sentiment analysis is a valuable tool for businesses, researchers, and individuals who want to gain insights into how people feel and think about a particular topic or brand.
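To make the positive/negative/neutral idea concrete, here is a minimal sketch of a lexicon-based classifier. The word lists and the function name toy_sentiment are assumptions made up for illustration; the project itself relies on NLTK's VADER analyzer, which is far more thorough.

```python
# Toy lexicon-based sentiment classifier (illustrative only).
# The word sets below are hypothetical and tiny; real lexicons such as
# VADER's contain thousands of scored entries.
POSITIVE_WORDS = {"good", "great", "love", "happy", "excellent"}
NEGATIVE_WORDS = {"bad", "terrible", "hate", "sad", "awful"}


def toy_sentiment(text):
    """Label text by counting positive and negative words."""
    words = text.lower().split()
    positives = sum(word in POSITIVE_WORDS for word in words)
    negatives = sum(word in NEGATIVE_WORDS for word in words)
    score = positives - negatives
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


print(toy_sentiment("I love this great phone"))  # positive
print(toy_sentiment("What a terrible day"))      # negative
print(toy_sentiment("The meeting is at noon"))   # neutral
```

A word-counting approach like this ignores negation, emphasis and emojis, which is one reason to prefer a dedicated analyzer for real tweets.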

On Twitter, a hashtag is a word or phrase preceded by the hash symbol (#) that is used to identify a specific topic or theme. When a user includes a hashtag in their tweet, it makes it easier for others to discover and join in on the conversation about that topic. Users can click on a hashtag to view all tweets that include that specific hashtag, even if they don't follow the user who tweeted it.

Hashtags can also be used to participate in or follow specific events, campaigns, or social movements on Twitter. They can help to increase the reach and visibility of a tweet and make it more likely to be seen by a larger audience.
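As a small illustration of the "# followed by a word" rule, the sketch below pulls hashtags out of a tweet's text with a regular expression. The pattern and the helper name extract_hashtags are assumptions for this example, and the pattern is a simplification of Twitter's actual hashtag rules.

```python
import re

# Simplified, hypothetical pattern: '#' followed by letters, digits or underscores.
HASHTAG_PATTERN = re.compile(r"#(\w+)")


def extract_hashtags(tweet_text):
    """Return the hashtags in a tweet, lower-cased and without the '#'."""
    return [tag.lower() for tag in HASHTAG_PATTERN.findall(tweet_text)]


print(extract_hashtags("Learning #Python and #NLP today!"))  # ['python', 'nlp']
```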

 

Libraries Used:

nltk: The Natural Language Toolkit (NLTK) is a platform for building Python programs that work with human language data, with a focus on statistical natural language processing (NLP). It contains text processing libraries for tokenization, parsing, classification, stemming, tagging and semantic reasoning.

snscrape: Snscrape is a scraper for social networking services (SNS). It scrapes things like user profiles, hashtags, or searches and returns the discovered items.

googletrans: Googletrans is a free and unlimited Python library that implements the Google Translate API. It uses the Google Translate Ajax API to call methods such as detect and translate.

string: The string module contains constants, utility functions, and classes for string manipulation.

re: A regular expression (RegEx) is a special sequence of characters that defines a search pattern for finding a string or set of strings. Python's re module provides regular expression support.

wordcloud: Word Cloud is a data visualization technique used for representing text data in which the size of each word indicates its frequency or importance.

matplotlib: Matplotlib is a comprehensive library for creating static, animated, and interactive visualizations in Python.

tkinter: Tkinter is the standard GUI library for Python. Python when combined with Tkinter provides a fast and easy way to create GUI applications.
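Before running the project it can help to check which of these libraries are available. The following is a minimal sketch using only the standard library; the helper name missing_modules is an assumption for illustration. Note also that NLTK needs its data packages fetched once with nltk.download (typically stopwords, punkt and vader_lexicon, depending on the NLTK version).

```python
import importlib.util

# Third-party modules imported by the project; tkinter, string and re
# ship with Python itself.
REQUIRED_MODULES = ["nltk", "snscrape", "googletrans", "wordcloud", "matplotlib"]


def missing_modules(names):
    """Return the module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_modules(REQUIRED_MODULES)
if missing:
    print("Missing modules:", ", ".join(missing))
else:
    print("All required modules are installed.")
```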

 

Code(sentimentAnalysis.py):

import snscrape.modules.twitter as sntwitter

from googletrans import Translator

import string

import re

from wordcloud import WordCloud

import matplotlib.pyplot as plt

from nltk.corpus import stopwords

from nltk.tokenize import word_tokenize

from nltk.sentiment.vader import SentimentIntensityAnalyzer

 

 

def stop_word_removal(sent):

    stop_words = set(stopwords.words("english"))

    word_tokens = word_tokenize(sent)

    swr = [word for word in word_tokens if word.lower() not in stop_words]

    clear = ' '.join(swr)

    return clear

 

 

# Extracting Tweets and translating them into English

def orig_tweets(hashtag, num):

    tweets = []

    translator = Translator()

 

    for tweet in sntwitter.TwitterSearchScraper(hashtag).get_items():

        if len(tweets) == num:

            break

        else:

            translated = translator.translate(tweet.rawContent, dest="en")

            tweets.append(translated.text)

 

    return tweets

 

 

# Cleaning the extracted tweets

def clean_tweets(tweets):

    cleared = []

    for sent in tweets:

        # Strip links, mentions, retweet markers and '#' first, while the

        # characters and upper-case letters these patterns rely on still exist

        cleaning = re.sub(r'https?:\/\/\S+', '', sent)

        cleaning = re.sub(r'@[A-Za-z0-9_]+', '', cleaning)

        cleaning = re.sub(r'\bRT[\s]+', '', cleaning)

        cleaning = re.sub(r'#', '', cleaning)

        # Then lower-case, drop stop words and remove punctuation

        lower_case = cleaning.lower()

        lower_case = stop_word_removal(lower_case)

        cleaning = lower_case.translate(str.maketrans('', '', string.punctuation))

        cleared.append(cleaning)

 

    return cleared

 

 

# Performing sentiment analysis on the cleaned Tweets

def senti(ctweets):

    sent_analyzer = SentimentIntensityAnalyzer()

    sentimentsList = []

    for text in ctweets:

        analysis = sent_analyzer.polarity_scores(text)

        sentiment = analysis["compound"]

        sentimentsList.append(sentiment)

 

    return sentimentsList

 

 

# Segregating tweets into Positive, Negative and Neutral

def segregate_tweets(sen, tweets):

    positive = []

    posiPol = []

    negative = []

    negPol = []

    neutral = []

    neutPol = []

    for i in range(len(sen)):

        if sen[i] >= 0.5:

            posiPol.append(sen[i])

            positive.append(tweets[i])

        elif sen[i] <= -0.5:

            negPol.append(sen[i])

            negative.append(tweets[i])

        else:

            neutPol.append(sen[i])

            neutral.append(tweets[i])

 

    return positive, posiPol, negative, negPol, neutral, neutPol

 

 

# For plotting a bar graph

def bar_graph(p, ne, n, h):

    values = [p, ne, n]

    attributes = ["Positive", "Neutral", "Negative"]

 

    plt.figure(figsize=(10, 5))

    plt.bar(attributes, values, color='maroon', width=0.4)

    plt.title("Bar Graph of #" + h)

    plt.xlabel("Emotions")

    plt.ylabel("Number of Tweets")

    plt.show()

 

 

# For plotting a pie chart

def pie_chart(p, ne, n, h):

    values = [p, ne, n]

    attributes = ["Positive", "Neutral", "Negative"]

 

    plt.title("Pie distribution of #" + h)

    plt.pie(values, labels=attributes, autopct='%2.1f%%')

    plt.show()

 

 

# For creating a word cloud

def word_cloud(text):

    wordcloud = WordCloud(width=800, height=400, random_state=21, max_font_size=110).generate(text)

 

    plt.figure(figsize=(10, 5))

    plt.imshow(wordcloud, interpolation="bilinear")

    plt.axis('off')

    plt.show()
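The cleaning steps can also be tried in isolation, without NLTK. The sketch below is a standalone approximation of clean_tweets for a single tweet: clean_tweet and the tiny STOP_WORDS set are hypothetical stand-ins for illustration, and the Twitter markup (links, mentions, the RT marker, '#') is stripped before punctuation so that the patterns can still match.

```python
import re
import string

# Hypothetical, deliberately tiny stop-word set; the project uses
# NLTK's English stop-word list instead.
STOP_WORDS = {"a", "an", "the", "is", "this", "to", "and", "of"}


def clean_tweet(text):
    """Clean one tweet: strip Twitter markup, then punctuation and stop words."""
    text = re.sub(r"https?://\S+", "", text)  # links
    text = re.sub(r"@\w+", "", text)          # @mentions
    text = re.sub(r"\bRT\s+", "", text)       # retweet marker
    text = text.replace("#", "")              # keep the hashtag word itself
    text = text.lower()
    text = text.translate(str.maketrans("", "", string.punctuation))
    words = [word for word in text.split() if word not in STOP_WORDS]
    return " ".join(words)


print(clean_tweet("RT @user: This is AMAZING! #Python https://t.co/abc"))
# amazing python
```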

 

Code(analysisWindow.py):

from tkinter import *

from tkinter import messagebox

import sentimentAnalysis as sa


window = Tk()

window.title("Twitter Sentiment Analysis")

window.geometry('530x250')


positive_tweets = []

negative_tweets = []

neutral_tweets = []

all_words = ""


hashtag_label = Label(window, text="Enter a Twitter Hashtag", fg="blue", font=("Arial", 16))

hashtag_label.grid(row=0, column=0, padx=10, pady=10)


hashtag = Entry(window, background="white", fg="black", font=("Arial", 16))

hashtag.grid(row=0, column=1, padx=10, pady=10)


num_label = Label(window, text="Number of Tweets", fg="blue", font=("Arial", 16))

num_label.grid(row=1, column=0, padx=10, pady=10)


tweet_num = Entry(window, background="white", fg="black", font=("Arial", 16))

tweet_num.grid(row=1, column=1, padx=10, pady=10)



def analyzing():

    try:

        global all_words, positive_tweets, neutral_tweets, negative_tweets

        hash_text = hashtag.get()

        num = int(tweet_num.get())

        tweets = sa.orig_tweets(hash_text, num)

        cleaned_tweets = sa.clean_tweets(tweets)

        all_words = ' '.join(cleaned_tweets)

        polarity = sa.senti(cleaned_tweets)

        positive_tweets, positive_polarity, negative_tweets, negative_polarity, neutral_tweets, neutral_polarity = sa.segregate_tweets(polarity, tweets)

        sa.bar_graph(len(positive_tweets), len(neutral_tweets), len(negative_tweets), hash_text)

        sa.pie_chart(len(positive_tweets), len(neutral_tweets), len(negative_tweets), hash_text)

    except Exception:

        messagebox.showerror("Error", "Wrong Input")



analyze = Button(window, text="Analyze", fg="red", background="yellow", font=("Arial", 16), command=analyzing)

analyze.grid(row=3, column=1, padx=10, pady=10)



def show_tweets():

    win = Toplevel(window)  # child window of the main app instead of a second Tk() root

    win.title("Tweets")

    win.geometry('1000x650')

    main_frame = Frame(win)

    main_frame.pack(fill=BOTH, expand=1)


    canvas = Canvas(main_frame)

    canvas.pack(side=LEFT, fill=BOTH, expand=1)


    scbar = Scrollbar(main_frame, orient=VERTICAL, command=canvas.yview)

    scbar.pack(side=RIGHT, fill=Y)


    canvas.configure(yscrollcommand=scbar.set)

    canvas.bind('<Configure>', lambda e: canvas.configure(scrollregion=canvas.bbox("all")))


    second_frame = Frame(canvas)

    canvas.create_window((0, 0), window=second_frame, anchor="nw")


    l1 = "Positive Tweets(" + str(len(positive_tweets)) + ")"

    l2 = "Negative Tweets(" + str(len(negative_tweets)) + ")"

    l3 = "Neutral Tweets(" + str(len(neutral_tweets)) + ")"


    posi_list = "Tweet--> " + '\nTweet--> '.join(positive_tweets)

    neg_list = "Tweet--> " + '\nTweet--> '.join(negative_tweets)

    neu_list = "Tweet--> " + '\nTweet--> '.join(neutral_tweets)


    p_label = Label(second_frame, text=l1, fg="green", font=("Arial", 18), background="yellow")

    p_label.grid(row=0, column=0, padx=10, pady=10)


    p_tweets = Text(second_frame, fg="black", font=("Arial", 16), background="white")

    p_tweets.grid(row=1, column=0, padx=10, pady=10)

    p_tweets.insert("1.0", posi_list)


    n_label = Label(second_frame, text=l2, fg="red", font=("Arial", 18), background="black")

    n_label.grid(row=2, column=0, padx=10, pady=10)


    n_tweets = Text(second_frame, fg="black", font=("Arial", 16), background="white")

    n_tweets.grid(row=3, column=0, padx=10, pady=10)

    n_tweets.insert("1.0", neg_list)


    nu_label = Label(second_frame, text=l3, fg="blue", font=("Arial", 18), background="lightblue")

    nu_label.grid(row=4, column=0, padx=10, pady=10)


    nu_tweets = Text(second_frame, fg="black", font=("Arial", 16), background="white")

    nu_tweets.grid(row=5, column=0, padx=10, pady=10)

    nu_tweets.insert("1.0", neu_list)



see_tweets = Button(window, text="Show Tweets", fg="black", background="green", font=("Arial", 16), command=show_tweets)

see_tweets.grid(row=4, column=0, padx=10, pady=10)



def show_word_cloud():

    if len(all_words) > 0:

        sa.word_cloud(all_words)

    else:

        messagebox.showwarning("Warning", "Analyze a hashtag first")



wordCloud = Button(window, text="Word Cloud", fg="black", background="lightblue", font=("Arial", 16), command=show_word_cloud)

wordCloud.grid(row=4, column=1, padx=10, pady=10)


window.mainloop()
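The analyzing function reports every failure as "Wrong Input". One possible refinement, sketched below, is to validate the two form fields separately before any tweets are fetched. The helper parse_inputs and the max_tweets cap of 500 are assumptions for illustration and are not part of the original code.

```python
def parse_inputs(hashtag_text, count_text, max_tweets=500):
    """Validate the two form fields and return (hashtag, count).

    Raises ValueError with a readable message when a field is invalid.
    """
    hashtag = hashtag_text.strip().lstrip("#")
    if not hashtag:
        raise ValueError("Please enter a hashtag.")
    try:
        count = int(count_text)
    except ValueError:
        raise ValueError("Number of tweets must be a whole number.") from None
    if not 1 <= count <= max_tweets:
        raise ValueError(f"Number of tweets must be between 1 and {max_tweets}.")
    return hashtag, count


print(parse_inputs(" #python ", "25"))  # ('python', 25)
```

Inside a Tkinter callback, the message of the ValueError could be passed to messagebox.showerror so the user sees which field needs correcting.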


 

Output:

After entering the hashtag and the number of tweets, press Analyze.

As soon as the analysis finishes, the bar graph and the pie chart are displayed.
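The numbers behind both charts are just the counts of tweets in each class. As a rough sketch, the function below buckets compound scores with the same 0.5 and -0.5 thresholds used in segregate_tweets and computes the percentage shares that the pie chart displays. The name summarize and the sample scores are made up for illustration.

```python
def summarize(scores, threshold=0.5):
    """Count positive, neutral and negative scores and their percentage shares."""
    counts = {"Positive": 0, "Neutral": 0, "Negative": 0}
    for score in scores:
        if score >= threshold:
            counts["Positive"] += 1
        elif score <= -threshold:
            counts["Negative"] += 1
        else:
            counts["Neutral"] += 1
    total = len(scores)
    shares = {
        label: (100.0 * count / total if total else 0.0)
        for label, count in counts.items()
    }
    return counts, shares


counts, shares = summarize([0.8, 0.1, -0.6, 0.55])
print(counts)  # {'Positive': 2, 'Neutral': 1, 'Negative': 1}
print(shares)  # {'Positive': 50.0, 'Neutral': 25.0, 'Negative': 25.0}
```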


To view all the tweets, press the Show Tweets button.
For a word cloud of the data, press the Word Cloud button.
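A word cloud sizes each word by how often it occurs. The sketch below shows that underlying idea with a plain frequency count from the standard library; the helper top_words is hypothetical, and the wordcloud library applies its own tokenization and scaling on top of such counts.

```python
from collections import Counter


def top_words(text, n=3):
    """Return the n most frequent words in already-cleaned text."""
    return Counter(text.split()).most_common(n)


print(top_words("python code python data code python"))
# [('python', 3), ('code', 2), ('data', 1)]
```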


 
Conclusion:

The code performs well and delivers the desired output. With a large number of tweets, processing becomes slow. Overall, it is suitable for small-scale sentiment analysis.

Twitter Hashtag Sentiment Analysis. Reviewed by Asst. Prof. Sunita Rai on April 08, 2023. Rating: 5
