We have loads of text data waiting to be examined and analysed. But we cannot feed raw text directly into our machine learning and deep learning models; it first needs to be cleaned and preprocessed.
When we say the “data needs to be cleaned”, we mean that it contains unnecessary elements that need to be removed, and that some information needs to be extracted from the raw data to help us build the model more effectively.
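As a rough illustration of what removing unnecessary elements can look like, here is a minimal sketch that uses only the Python standard library. The function name clean_text and the particular steps (lowercasing, dropping HTML tags, URLs and punctuation) are assumptions chosen for illustration, not a fixed recipe.

```python
import re
import string


def clean_text(text: str) -> str:
    """Lowercase the text and strip HTML tags, URLs, punctuation and extra spaces."""
    text = text.lower()
    text = re.sub(r"<[^>]+>", " ", text)                # drop HTML tags
    text = re.sub(r"https?://\S+|www\.\S+", " ", text)  # drop URLs
    # Remove every ASCII punctuation character.
    text = text.translate(str.maketrans("", "", string.punctuation))
    # Collapse runs of whitespace into single spaces.
    return re.sub(r"\s+", " ", text).strip()


print(clean_text("<p>Great movie!!  Visit https://example.com NOW.</p>"))
```

Which steps are worth applying depends on the task; for example, punctuation can carry signal in sentiment data, so it may be better kept there.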
We’ll see some of the most common methods to clean the text data and…
While working with text data, there comes a time when we need to correct the spelling of the words in our corpus.
Datasets such as customer reviews (movie reviews, hotel reviews, Amazon reviews, etc.) and conversation files contain many typographical errors that need fixing for better analysis.
We are going to use a Python library called textblob for our Python spell corrector. Using the textblob library, we can build a spelling corrector for this task. Detecting real-word spelling errors, where the mistake is itself a valid word, is a much more difficult task, as any word in the input text can be an error.
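To give a feel for how such a corrector can work, here is a minimal standard-library sketch of the edit-distance idea from Peter Norvig's "How to Write a Spelling Corrector", which the textblob documentation cites as the basis of its correct() method. This is not textblob's own code: the tiny WORD_COUNTS table and the function names are made up for illustration, and a real corrector would count words in a large corpus.

```python
from collections import Counter

# Toy word-frequency table (an assumption for this sketch).
WORD_COUNTS = Counter({"the": 500, "word": 80, "text": 60, "spelling": 50, "correct": 40})
LETTERS = "abcdefghijklmnopqrstuvwxyz"


def edits1(word: str) -> set:
    """Return every string that is one edit away from `word`."""
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [left + right[1:] for left, right in splits if right]
    transposes = [left + right[1] + right[0] + right[2:]
                  for left, right in splits if len(right) > 1]
    replaces = [left + c + right[1:] for left, right in splits if right for c in LETTERS]
    inserts = [left + c + right for left, right in splits for c in LETTERS]
    return set(deletes + transposes + replaces + inserts)


def correct(word: str) -> str:
    """Pick the most frequent known word within one edit, else return the input."""
    word = word.lower()
    if word in WORD_COUNTS:
        return word
    candidates = [w for w in edits1(word) if w in WORD_COUNTS]
    if not candidates:
        return word
    return max(candidates, key=WORD_COUNTS.__getitem__)


print(correct("speling"), correct("wrod"), correct("teh"))
```

Note that this only catches non-word errors: a misspelling that happens to be a valid word in the table is returned unchanged, which is exactly why real-word errors are the harder case.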
Before we start making a chatbot, let’s learn some basic information about chatbots.
A chatbot is a computer program anybody can talk to in normal language.
No matter what type of chatbot it is, they all have a similar purpose — to take regular human language input, understand what is being said and to provide a relevant, correct answer based on the knowledge it has.
Chatbots excel at completing repetitive tasks and work around the clock. They can work alone or alongside humans, and are effective at completing 60–90% of an average human team’s workload, depending on the use case.
With this type…
We have IMDB data containing movie reviews and their sentiment, positive or negative. We’ll use machine learning to create a binary classifier for the reviews.
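As a rough preview of the end result, the sketch below trains a tiny binary sentiment classifier with scikit-learn. The reviews are made-up toy sentences standing in for the IMDB data, and the choice of TF-IDF features with a random forest is an assumption for illustration, not necessarily the exact setup used later.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline

# Made-up toy reviews standing in for the IMDB data (1 = positive, 0 = negative).
reviews = [
    "a wonderful and moving film",
    "brilliant acting and a great story",
    "I loved every minute of it",
    "an excellent, heartfelt movie",
    "boring plot and terrible acting",
    "a dull, awful waste of time",
    "I hated the ending, very bad",
    "poor script and a weak cast",
]
labels = [1, 1, 1, 1, 0, 0, 0, 0]

# Turn each review into TF-IDF features, then fit a random forest on them.
model = make_pipeline(
    TfidfVectorizer(stop_words="english"),
    RandomForestClassifier(n_estimators=50, random_state=0),
)
model.fit(reviews, labels)

predictions = model.predict(["a great and wonderful story", "terrible and boring"])
print(predictions)
```

With only eight training sentences the predictions mean very little; on the real dataset one would hold out a test split and report accuracy and a confusion matrix.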
We’ll start by downloading the IMDB dataset:
Let’s start by importing the libraries:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import re
from nltk.stem import PorterStemmer
import spacy
from spacy.lang.en.stop_words import STOP_WORDS as stopwords
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, classification_report
from sklearn.metrics import confusion_matrix
from sklearn.ensemble import RandomForestClassifier
from…
As machine learning aspirants, we often read CSV/Excel files to perform data analysis, but there comes a time when we have to go deeper into loading data and fetch it from a SQL database or table.
We’ll perform basic data loading from a MySQL database table into a pandas DataFrame.
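The core of that loading step can be sketched with pandas' read_sql_query. The helper name load_query, the employees table, and the connection details in the comments are all hypothetical. So that the example runs without a MySQL server, it uses an in-memory SQLite database as a stand-in; with MySQL you would pass an SQLAlchemy engine instead, as the comment shows.

```python
import sqlite3

import pandas as pd


def load_query(connection, query: str) -> pd.DataFrame:
    """Run a SQL query on the given connection and return the rows as a DataFrame."""
    return pd.read_sql_query(query, connection)


# For MySQL, one would typically build an SQLAlchemy engine (hypothetical credentials):
#   from sqlalchemy import create_engine
#   engine = create_engine("mysql+pymysql://user:password@localhost:3306/mydb")
#   df = load_query(engine, "SELECT * FROM employees")

# Runnable stand-in: an in-memory SQLite table instead of a MySQL table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (id INTEGER, name TEXT)")
conn.executemany("INSERT INTO employees VALUES (?, ?)", [(1, "Asha"), (2, "Ravi")])
conn.commit()

df = load_query(conn, "SELECT * FROM employees")
print(df)
conn.close()
```

Keeping the query in one helper means the rest of the analysis only ever sees a DataFrame, whichever database sits behind it.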
You need the following installed on your machine, along with basic knowledge of working with pandas and MySQL:
We’ll start with installing packages…