Steps in Building a Machine Learning Project


In this post, we will discuss the basic steps in building a machine learning (ML) project: collecting the dataset, cleaning it, preprocessing it, training and testing the model, and deploying it. Each step is explained briefly below.

Data collection

Data collection is the first step in developing any machine learning model. The source and type of the data largely determine whether the later steps will be easy or complex. Data with few missing values, little ambiguity, and only the features that actually influence the output is much easier to process in the steps that follow.
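
As a rough illustration of judging a freshly collected dataset, the sketch below counts the missing values in each column using only the Python standard library. The sample CSV, its column names, and the helper count_missing are all hypothetical and made up for this example.

```python
import csv
import io

# Hypothetical sample standing in for a collected dataset.
RAW_CSV = """age,city,income
34,Pune,52000
,Delhi,61000
29,,48000
41,Mumbai,
"""

def count_missing(csv_text):
    """Return {column: number of empty cells} for CSV text with a header row."""
    reader = csv.DictReader(io.StringIO(csv_text))
    counts = {name: 0 for name in reader.fieldnames}
    for row in reader:
        for name, value in row.items():
            # Treat absent or whitespace-only cells as missing.
            if value is None or value.strip() == "":
                counts[name] += 1
    return counts

missing = count_missing(RAW_CSV)
print(missing)
```

A column with many missing cells is an early hint that the cleaning step will need more work.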

Data cleaning

In data cleaning, we check for missing values, duplicate rows, features that have little influence on the output, and other unwanted values, and we remove them to get a clean dataset. The relationships between the remaining input features and the output are then studied through plots and graphs.
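
A minimal sketch of the cleaning described above, assuming the dataset is a list of dictionaries in which None marks a missing value. The rows and the helper clean are hypothetical. In practice a library such as pandas would usually do this work.

```python
# Hypothetical rows; None marks a missing value.
rows = [
    {"age": 34, "city": "Pune", "income": 52000},
    {"age": 34, "city": "Pune", "income": 52000},     # exact duplicate
    {"age": None, "city": "Delhi", "income": 61000},  # missing age
    {"age": 29, "city": "Mumbai", "income": 48000},
]

def clean(records):
    """Drop rows with missing values and exact duplicate rows, keeping order."""
    seen = set()
    cleaned = []
    for record in records:
        if any(value is None for value in record.values()):
            continue  # skip incomplete rows
        key = tuple(sorted(record.items()))
        if key in seen:
            continue  # skip duplicates of a row already kept
        seen.add(key)
        cleaned.append(record)
    return cleaned

cleaned_rows = clean(rows)
print(cleaned_rows)
```

Dropping incomplete rows is only one option. Filling the gaps is another, and which is better depends on the data.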

Data preprocessing

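Preprocessing brings the input features onto a common scale, for example by standardizing them (zero mean, unit variance) or normalizing them to a fixed range. Missing values that were kept can be filled with a summary statistic such as the mean, median, or mode. Categorical features, if any, should be encoded as numerical values.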
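
The post does not name a specific technique, so the sketch below assumes two common choices: standardization for numeric features and one-hot encoding for categorical ones. The helpers standardize and one_hot and the sample values are hypothetical.

```python
from statistics import mean, pstdev

def standardize(values):
    """Rescale numbers to zero mean and unit (population) standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]  # a constant feature has no spread
    return [(v - mu) / sigma for v in values]

def one_hot(labels):
    """Encode each label as a 0/1 vector, one position per distinct category."""
    categories = sorted(set(labels))
    vectors = [[1 if label == c else 0 for c in categories] for label in labels]
    return categories, vectors

ages = [20, 30, 40]                 # hypothetical numeric feature
cities = ["Pune", "Delhi", "Pune"]  # hypothetical categorical feature

scaled_ages = standardize(ages)
categories, encoded_cities = one_hot(cities)
print(scaled_ages)
print(categories, encoded_cities)
```

After this step every feature is numeric and on a comparable scale, which many models expect.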

Model training

Once the data is cleaned and preprocessed, we split the dataset into training data and test data. The training data is used to fit the model, while the test data is held back to measure how well the model performs on data it has not seen. The train-to-test ratio is commonly 70:30 or 80:20, but the right choice depends on the use case.
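
The 80:20 split mentioned above can be sketched as follows. The helper split_dataset is a hypothetical, homemade function (not a library API), and the fixed seed is an assumption made so that the split is reproducible.

```python
import random

def split_dataset(samples, test_ratio=0.2, seed=42):
    """Shuffle a copy of samples and split it into (train, test) lists."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # shuffle so the split is not ordered
    test_size = round(len(shuffled) * test_ratio)
    return shuffled[test_size:], shuffled[:test_size]

data = list(range(10))  # hypothetical dataset of 10 samples
train, test = split_dataset(data, test_ratio=0.2)
print(len(train), len(test))
```

Passing test_ratio=0.3 instead gives the 70:30 split.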


Model testing

In this step, the test data is fed into the trained model, and its predictions are compared with the known outputs. The quality of the model is judged by the scores obtained. Which metric to use depends on the task: accuracy and F1-score are common for classification, while the R² score is common for regression.
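
To make the three scores concrete, here is a small sketch that computes them by hand from true and predicted values. The labels and predictions are hypothetical, and the functions assume binary labels for the F1-score and non-constant targets for the R² score.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)

def f1_score(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def r2_score(y_true, y_pred):
    """1 minus residual sum of squares over total sum of squares."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot  # assumes y_true is not constant

# Hypothetical classification results
labels = [1, 0, 1, 1, 0]
predicted = [1, 0, 0, 1, 1]
acc = accuracy(labels, predicted)
f1 = f1_score(labels, predicted)

# Hypothetical regression results
targets = [1.0, 2.0, 3.0]
estimates = [1.0, 2.0, 4.0]
r2 = r2_score(targets, estimates)

print(acc, f1, r2)
```

Accuracy and F1-score apply to classification, while the R² score applies to regression, so a single project would normally report only the ones that fit its task.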


Model deployment

This is the final stage of a machine learning project. Once the model meets the standard we have set, it is time to deploy it. Hosting platforms such as Heroku or Netlify can be chosen according to the requirements of the project.
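
Whatever platform is chosen, deployment usually comes down to loading the saved model and wrapping its prediction in a request handler. The sketch below shows that idea with the standard library only. The "model" is a hypothetical single threshold stored as JSON, and handle_request stands in for the route a web framework would call. None of this is specific to any hosting platform.

```python
import json

# Hypothetical "trained model": one threshold, saved as JSON after training.
saved_model = json.dumps({"threshold": 0.5})

def load_model(serialized):
    """Rebuild a predict function from the saved parameters."""
    threshold = json.loads(serialized)["threshold"]

    def predict(values):
        return [1 if v > threshold else 0 for v in values]

    return predict

# On the server, the model is loaded once at startup.
predict = load_model(saved_model)

def handle_request(body):
    """Turn a JSON request body into a JSON response with predictions."""
    payload = json.loads(body)
    return json.dumps({"predictions": predict(payload["values"])})

response = handle_request('{"values": [0.2, 0.9]}')
print(response)
```

A real deployment would add input validation and error handling around the handler.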

So that is it for this post. Hope you learnt something new out of it. Thank you for reading.😊
