Business Problem

Retail companies receive a large number of customer support requests every day through:

  1. support portals
  2. helpdesk systems
  3. chatbots
  4. mobile applications

Each request must be routed to the correct department, such as billing, delivery, or technical support.

Goal

The goal of this assignment is to build a machine learning system that automatically classifies customer support tickets into the correct department using Natural Language Processing (NLP).

Dataset

Click here to download the dataset.

Classification Problem

This is a multi-class text classification problem: the ML model must assign each ticket text to exactly one of the classes (departments).

Technologies and Tools

The following tools can be used to build the ML system.

Data Analysis

  1. Pandas
  2. NumPy

NLP Libraries

  1. spaCy
  2. NLTK

Vectorization Techniques

  1. TF-IDF using TfidfVectorizer
  2. Bag-of-Words using CountVectorizer
  3. spaCy

Models

  1. Logistic Regression
  2. Multinomial Naive Bayes
  3. Support Vector Machine (SVM)
  4. Random Forest

Data Processing

Before training the machine learning model, the dataset must be cleaned and processed using NLP techniques.

The data processing stage removes unwanted tokens from the dataset and may include:

  1. removing unwanted features
  2. converting text to lowercase
  3. removing punctuation and special characters
  4. removing stop words
  5. optional: lemmatization or stemming
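
The cleaning steps above can be sketched in plain Python as follows. This is a minimal illustration, not a prescribed implementation: the `clean_text` helper and the short `STOP_WORDS` set are assumptions made for the example, and a real solution would more likely take its stop-word list (and lemmatizer) from NLTK or spaCy.

```python
import re

# Illustrative stop-word subset (assumption); NLTK and spaCy ship full lists.
STOP_WORDS = {"a", "an", "the", "is", "are", "was", "my", "i", "to", "of", "and", "for"}


def clean_text(text: str) -> str:
    """Lowercase, strip punctuation/special characters, and drop stop words."""
    text = text.lower()
    # Replace anything that is not a lowercase letter, digit, or whitespace.
    text = re.sub(r"[^a-z0-9\s]", " ", text)
    tokens = [tok for tok in text.split() if tok not in STOP_WORDS]
    return " ".join(tokens)


print(clean_text("My order #123 was NOT delivered!!"))
```

Negations such as "not" are deliberately left out of the stop-word set here, since removing them can change the meaning of a ticket.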

Text Vectorization

Machine learning models cannot work with raw text, so the text data must be converted into numerical feature vectors using vectorization techniques.

ML Training

Split the dataset into training and testing sets, then train the ML models with different hyper-parameter settings.
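
One possible way to organise this step is a scikit-learn `Pipeline` searched with `GridSearchCV`, sketched below. The toy tickets, the label names, and the specific hyper-parameter values are assumptions chosen to keep the example small; they are not part of the assignment dataset.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import Pipeline

# Toy tickets and labels (assumption); replace with the assignment dataset.
texts = [
    "i was charged twice for my order", "refund has not reached my account",
    "invoice shows the wrong amount", "my card payment failed at checkout",
    "please cancel my subscription and refund me", "discount coupon was not applied to my bill",
    "my package has not arrived yet", "the courier delivered the wrong parcel",
    "tracking number shows no updates", "order is delayed by a week",
    "the box arrived damaged", "change the delivery address for my order",
    "the app crashes when i log in", "cannot reset my password on the website",
    "the checkout page does not load", "mobile app freezes on the home screen",
    "i get an error code when uploading a photo", "website login button is not working",
]
labels = ["billing"] * 6 + ["delivery"] * 6 + ["technical"] * 6

# Stratify so every department appears in both the training and test sets.
X_train, X_test, y_train, y_test = train_test_split(
    texts, labels, test_size=6, stratify=labels, random_state=42
)

pipeline = Pipeline([("vec", TfidfVectorizer()), ("clf", LogisticRegression())])

# Each dict is one search space: a candidate model plus its hyper-parameters.
param_grid = [
    {"clf": [LogisticRegression(max_iter=1000)], "clf__C": [0.1, 1.0, 10.0]},
    {"clf": [MultinomialNB()], "clf__alpha": [0.1, 1.0]},
]

search = GridSearchCV(pipeline, param_grid, cv=2, scoring="accuracy")
search.fit(X_train, y_train)

print("best parameters:", search.best_params_)
print("test accuracy:", search.score(X_test, y_test))
```

Keeping the vectorizer inside the pipeline means it is refitted on each cross-validation fold, which avoids leaking vocabulary from the validation data into training.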

ML Evaluation

Evaluate model performance using scikit-learn metrics, then save the best model together with the hyper-parameters and vectorization technique that gave the highest accuracy.
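
A minimal sketch of evaluation and saving is shown below, using `accuracy_score` and `classification_report` from scikit-learn and `joblib` for persistence. The toy data, the choice of TF-IDF with Multinomial Naive Bayes, and the file name `ticket_classifier.joblib` are assumptions for illustration; the temporary directory is used only so the example runs anywhere.

```python
import tempfile
from pathlib import Path

import joblib
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import accuracy_score, classification_report
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import Pipeline

# Toy tickets and labels (assumption); use the assignment dataset instead.
texts = [
    "charged twice for my order", "refund not received yet",
    "invoice amount is wrong", "card payment failed at checkout",
    "package has not arrived", "courier delivered wrong parcel",
    "tracking shows no updates", "order arrived damaged",
    "app crashes on login", "cannot reset my password",
    "checkout page does not load", "website shows an error code",
]
labels = ["billing"] * 4 + ["delivery"] * 4 + ["technical"] * 4

X_train, X_test, y_train, y_test = train_test_split(
    texts, labels, test_size=3, stratify=labels, random_state=42
)

model = Pipeline([("vec", TfidfVectorizer()), ("clf", MultinomialNB())])
model.fit(X_train, y_train)

# Score the held-out tickets; the report adds per-class precision and recall.
y_pred = model.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
print(f"accuracy: {accuracy:.2f}")
print(classification_report(y_test, y_pred, zero_division=0))

# Save vectorizer and classifier together so inference needs a single file.
model_path = Path(tempfile.gettempdir()) / "ticket_classifier.joblib"
joblib.dump(model, model_path)

# Reload to confirm the saved pipeline can be used for prediction.
reloaded = joblib.load(model_path)
```

Accuracy alone can hide weak classes when departments are imbalanced, so the per-class figures in the report are worth checking before picking the best model.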

Model Inference

Build a simple interface using Streamlit and use the saved model to classify text entered by the user.
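
The interface could look roughly like the sketch below, saved as a script and started with `streamlit run app.py`. The file name `app.py`, the model path `ticket_classifier.joblib`, and the `classify_ticket` helper are hypothetical names for this example; the saved model is assumed to be a scikit-learn pipeline that accepts raw text.

```python
MODEL_PATH = "ticket_classifier.joblib"  # hypothetical path to the saved pipeline


def classify_ticket(model, text: str) -> str:
    """Return the predicted department for a single ticket text."""
    return str(model.predict([text])[0])


def main() -> None:
    # Imported here so classify_ticket can be tested without Streamlit installed.
    import joblib
    import streamlit as st

    st.title("Customer Support Ticket Classifier")
    text = st.text_area("Enter the ticket text")

    if st.button("Classify"):
        if text.strip():
            model = joblib.load(MODEL_PATH)
            st.success(f"Predicted department: {classify_ticket(model, text)}")
        else:
            st.warning("Please enter some text first.")


if __name__ == "__main__":
    main()
```

If preprocessing was applied before training rather than inside the pipeline, the same cleaning must be applied to the user's text before calling `classify_ticket`.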

Note: Once the ML system is built and tested, share the GitHub link for validation and comment below on which model performed best.
