
Fake News Detection

Started 02-15-2021
Modified 10-20-2022
Views 3,471
                                                                                                            
 
Team Name Nupeak Tachyon
Track START UP
Use Case Fake News Detection
Technology NLP, ML
Region India
Team lead Jatin Pithva @Jatin_P 
Team members @uttam631 @N_AK  @toshi @HITESH_MALI 

 

Introduction

The authenticity of information is a longstanding issue for businesses and society, in both print and digital media. On social networks, information spreads so quickly and is amplified so widely that distorted, inaccurate, or false information can cause real-world impact for millions of users within minutes. Public concern about this problem has grown recently, and several approaches to mitigate it have been proposed.

Fake news refers to misinformation or disinformation in a country that is spread through word of mouth and traditional media and, more recently, through digital forms of communication such as edited videos, memes, unverified advertisements, and rumors propagated on social media. In this project, we discuss the problem by grouping the proposed approaches into three categories:

  1. Content Based
  2. Source Based
  3. Diffusion Based

We describe two opposite approaches and propose an algorithmic solution that synthesizes the main concerns. We conclude by raising awareness of the concerns and opportunities for businesses that are currently working to detect fake news automatically.

 

Goal

The main objective is to detect fake news, which is a classic text classification problem with a straightforward proposition. We need to build a model that can differentiate between “Real” news and “Fake” news and, at the same time, identify the sources that publish fake news.

From different points of view:

  1. Citizens: Citizens can use this tool to identify fake news, see its source, and receive an email alert about the content of the fake news.
  2. Government: The concerned government authority can use this model to identify the source of fake news and take immediate action.
  3. Publishers: Publishers can identify whether someone is using their broadcaster name for unauthenticated propaganda.

 

Tools and Technology

  1. SAS Visual Text Analytics
  2. SAS VDMML
  3. Python
  4. SAS Viya
  5. SAS Studio
  6. SAS Visual Analytics
  7. Microsoft Azure

Process Flow Diagram

 

uttam631_0-1613051691277.png

 

 

Project Implementation Approach

uttam631_1-1613051691282.png

 

Description

  1. Crawl the news from the different source URLs.
  2. Store the crawled news data in tabular format.
  3. Data cleansing: Articles with no body text or with fewer than 10 words in the article body are removed. These operations are performed on all the datasets to achieve consistency of format and structure. The relevant attributes are selected after the data cleaning and exploration phase.
  4. Linguistic features: Certain textual characteristics are converted into numerical form so that they can be used as input for training the models.
  5. Feature selection: Select the correlated variables that are important for the model.
  6. The input features are used to train the different machine learning models. Each dataset is divided into training and testing data with a 70/30 split.
  7. The learning algorithms are trained with different hyperparameters to achieve maximum accuracy for a given dataset, with an optimal balance between variance and bias.
  8. Compare the output of all the models that we created.
  9. Identify the best-fit model.
  10. Based on the output of the best-fit model, give the final conclusion on whether the news is true or fake.
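
Steps 3, 4, and 6 above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the team's SAS implementation: the article layout (a dict with a `body` key), the function names, and the four example features are hypothetical. Only the 10-word threshold and the 70/30 split come from the description.

```python
import random
import string

MIN_WORDS = 10  # threshold taken from the data cleansing step


def clean_articles(articles):
    """Drop articles with no body text or fewer than MIN_WORDS words."""
    return [
        a for a in articles
        if a.get("body") and len(a["body"].split()) >= MIN_WORDS
    ]


def linguistic_features(text):
    """Convert text into a few numeric features (illustrative choice only)."""
    words = text.split()
    if not words:
        return {"word_count": 0, "avg_word_length": 0.0,
                "exclamation_count": 0, "uppercase_ratio": 0.0}
    n = len(words)
    return {
        "word_count": n,
        # Punctuation is stripped so "today!" counts as 5 characters.
        "avg_word_length": sum(len(w.strip(string.punctuation)) for w in words) / n,
        "exclamation_count": text.count("!"),
        # Share of words written entirely in capitals.
        "uppercase_ratio": sum(w.isupper() for w in words) / n,
    }


def train_test_split(rows, train_fraction=0.7, seed=42):
    """Shuffle a copy of rows and split it, 70/30 by default."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]
```

A typical call sequence would be `clean_articles`, then `linguistic_features` on each remaining body, then `train_test_split` on the resulting rows. The fixed seed is only there to make the split reproducible.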

Model Implementation Approach

The model output is shown below.

 

Confusion Matrix:

                        Predicted Class: REAL    Predicted Class: FALSE
    Actual Class: REAL  TRUE POSITIVE            FALSE NEGATIVE
    Actual Class: FALSE FALSE POSITIVE           TRUE NEGATIVE
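
The four cells of the confusion matrix above are enough to derive the usual performance metrics. The sketch below is an illustration only: the function names and the label strings are assumptions, and "REAL" is treated as the positive class to match the matrix layout.

```python
def confusion_counts(actual, predicted, positive="REAL"):
    """Count TP, FN, FP, TN, treating `positive` as the positive class."""
    tp = fn = fp = tn = 0
    for a, p in zip(actual, predicted):
        if a == positive and p == positive:
            tp += 1  # real news predicted as real
        elif a == positive:
            fn += 1  # real news predicted as fake
        elif p == positive:
            fp += 1  # fake news predicted as real
        else:
            tn += 1  # fake news predicted as fake
    return tp, fn, fp, tn


def metrics(tp, fn, fp, tn):
    """Derive accuracy, precision, recall, and F1 from the four counts."""
    total = tp + fn + fp + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}
```

For example, `metrics(*confusion_counts(actual, predicted))` returns all four scores for one model, which makes it easy to compare several models on the same test set.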

Visualization Report:

uttam631_2-1613051691284.png

 

Conclusion

With the increasing popularity of social media, more and more people consume news from social media instead of traditional news media. However, social media has also been used to spread fake news, which has strong negative impacts on individual users and broader society.

The task of classifying news manually requires in-depth knowledge of the domain and expertise to identify anomalies in the text. In this project, we discussed the problem of classifying fake news articles using machine learning models and ensemble techniques. The data we used in our work is collected from different source URLs and contains news articles from various domains. The primary aim of the project is to identify patterns in text that differentiate fake articles from true news. We extracted different textual features from the articles using different SAS tools and used the feature set as input to the models. The learning models were trained and parameter-tuned to obtain optimal accuracy. Some models achieved comparatively higher accuracy than others. We used multiple performance metrics to compare the results for each algorithm. The ensemble learners showed an overall better score on all performance metrics than the individual learners.
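
The write-up does not say which ensemble technique was used. As one common possibility, hard majority voting combines the labels predicted by several individual models into a single prediction. The sketch below assumes each model's output is a list of label strings. The function names are hypothetical.

```python
from collections import Counter


def majority_vote(predictions_per_model):
    """Combine per-model label lists into one list by hard majority voting.

    With an odd number of models and two labels there are no ties;
    otherwise the label seen first among the tied ones wins.
    """
    combined = []
    for votes in zip(*predictions_per_model):
        combined.append(Counter(votes).most_common(1)[0][0])
    return combined


def accuracy(actual, predicted):
    """Fraction of predictions that match the actual labels."""
    if not actual:
        return 0.0
    return sum(a == p for a, p in zip(actual, predicted)) / len(actual)


def best_model(actual, named_predictions):
    """Return the name of the model with the highest accuracy."""
    scores = {name: accuracy(actual, preds)
              for name, preds in named_predictions.items()}
    return max(scores, key=scores.get)
```

The ensemble output can be scored with the same `accuracy` function as the individual models, so it can be added to the comparison as just another entry in `named_predictions`.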

Fake news detection has many open issues that require further attention. For instance, in order to reduce the spread of fake news, identifying the key elements involved in the spread of news is an important step. Machine learning techniques can be employed to identify the key sources involved in the spread of fake news.

To detect fake news accurately, we check news with our model, identify the sources that continuously publish fake news, and identify the categories of fake news. The model will also help to estimate the probability that fake news will spread.
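
Identifying sources that repeatedly publish fake news can be sketched as a simple aggregation over the model's predictions. The record layout (pairs of source and predicted label), the function name, and both thresholds are assumptions made for illustration. The source does not specify how repeat publishers are defined.

```python
from collections import Counter


def flag_sources(records, min_articles=3, min_fake_rate=0.5, fake_label="FAKE"):
    """Return (source, article_count, fake_rate) for sources that publish
    at least `min_articles` articles with a fake rate of at least
    `min_fake_rate`, sorted by fake rate in descending order."""
    totals = Counter()
    fakes = Counter()
    for source, label in records:
        totals[source] += 1
        if label == fake_label:
            fakes[source] += 1

    flagged = []
    for source, total in totals.items():
        rate = fakes[source] / total
        if total >= min_articles and rate >= min_fake_rate:
            flagged.append((source, total, rate))
    flagged.sort(key=lambda item: item[2], reverse=True)
    return flagged
```

The minimum article count keeps a source from being flagged on the basis of a single misclassified article. Both thresholds would need to be tuned on real data.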

It will also help citizens and the government to identify whether news is true or not, which is helpful in meaningful ways:

  • Reduced media noise
  • More optimal use of resources
  • Improved public sentiment toward the government’s handling of the upsurge
  • Improved business sentiment, so that business can run smoothly

 

Comments

Team Name -  

Nupeak Tachyon

Fake News Detection

Team lead @Jatin_P

Team Members :    @toshi  @HITESH_MALI  @N_AK   @uttam631 

Hi Guys,

We have submitted our use case videos:

Short Video, Long Video

I hope this use case helps to strengthen the relationship between the government and the public, and we hope the jury members will consider our context.

Our team also thanks SAS India and the entire SAS hackathon team, who helped us to think in different ways and guided our approach to achieving our use case.

Great work team and all the very best 🙂

Thanks a Lot..🙂

Version history
Last update:
‎10-20-2022 12:25 PM