K.v.k Sasikanth, K. Samatha, N. Deshai, B. V. D. S. Sekhar, S. Venkatramana
Volume 31, Issue 3 (IJIEPR 2020)
Abstract
Today's interconnected world generates huge volumes of digital data, as millions of users share their opinions and feelings on various topics every day through popular applications such as social media, microblogging sites, and review sites. Sentiment analysis on Twitter data is now considered a very important problem, particularly for organizations and companies that want to know their customers' feelings and opinions about their products and services. Because of its nature, variety, and enormous size, this data is useful for several applications, ranging from decision making to product assessment. Tweets convey the sentiment of the tweeter on a specific topic, and companies survey millions of tweets on a given subject to evaluate actual opinion and understand customer feelings. The major goal of this paper is to collect, recognize, filter, reduce, and analyze all such relevant opinions, emotions, and feelings of people about different products or services, and to categorize them as positive, negative, or neutral, because such categorization helps improve sales growth for a company's products, films, etc. We observe that the Naïve Bayes classifier is the most widely used machine learning method for mining sentiment from large data sources such as Twitter and other popular social networks because of its higher accuracy rates. In this paper, we examine sentiment polarity analysis on Twitter data in a distributed environment, namely Apache Spark.
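As a rough illustration of the classification step named in this abstract, the following is a minimal single-machine sketch of a multinomial Naïve Bayes polarity classifier with add-one smoothing. It is not the authors' implementation and does not use Apache Spark; the function names (`tokenize`, `train`, `predict`) and the toy training tweets are assumptions made up for this example.

```python
import math
from collections import Counter, defaultdict

# Toy labeled tweets, invented for illustration only (not from the paper).
TRAIN = [
    ("love this phone great battery", "positive"),
    ("great service very happy", "positive"),
    ("terrible battery hate it", "negative"),
    ("awful service very disappointed", "negative"),
    ("the phone arrived today", "neutral"),
    ("store opens at nine", "neutral"),
]


def tokenize(text):
    # Lowercase and split on whitespace; real tweets would need more cleaning.
    return text.lower().split()


def train(examples):
    # Count documents per label and word occurrences per label.
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in examples:
        label_counts[label] += 1
        for tok in tokenize(text):
            word_counts[label][tok] += 1
            vocab.add(tok)
    return label_counts, word_counts, vocab


def predict(model, text):
    # Pick the label with the highest log prior plus summed log likelihoods.
    label_counts, word_counts, vocab = model
    total_docs = sum(label_counts.values())
    best_label, best_score = None, float("-inf")
    for label, n_docs in label_counts.items():
        score = math.log(n_docs / total_docs)
        total_words = sum(word_counts[label].values())
        for tok in tokenize(text):
            if tok not in vocab:
                continue  # ignore words never seen during training
            # Add-one (Laplace) smoothing avoids zero probabilities.
            likelihood = (word_counts[label][tok] + 1) / (total_words + len(vocab))
            score += math.log(likelihood)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


model = train(TRAIN)
print(predict(model, "great phone love it"))
print(predict(model, "terrible awful service"))
```

In a distributed setting such as the one the abstract describes, the per-label counting in `train` is the part that would typically be spread across workers, since the counts from separate partitions can simply be summed.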
A.k.v.k Sasikanthr, K. Samatha, N. Deshai, B.v.d.s Sekhar, S. Venkatramana
Volume 32, Issue 1 (IJIEPR 2021)
Abstract
Computation in today's digital world is tremendously demanding and requires the ability to process and store enormous datasets for a wide variety of applications. The volume of digital data is enormous; most of it is unstructured, it is generated at ever higher velocity, and it is doubling day by day. In the last decade, many organizations have faced major problems in handling and processing massive chunks of data, which could not be processed efficiently because existing conventional technologies had not been sufficiently enhanced. This paper addresses how to overcome these problems efficiently using one of the most recent and powerful data processing tools, Hadoop, which is fully open source, and one of its core components, MapReduce, which nevertheless has a few performance issues. The main goal of this paper is to address and overcome the limitations and weaknesses of MapReduce with Apache Spark.