Deploying a Machine Learning Model Using Streamlit

Streamlit is an open-source Python library for building interactive web apps around machine learning models with very little front-end code. This step-by-step guide shows how to train a model, wrap it in a Streamlit app, and deploy it.

Step 1: Prepare Your Environment

Before we begin, ensure you have the required libraries installed.

bash
pip install streamlit pandas scikit-learn joblib

Streamlit → For building the web app
Pandas → For handling data
Scikit-learn → For training the machine learning model
Joblib → For saving and loading the trained model

Step 2: Train and Save the Model

First, train and save your machine learning model. This guide assumes a regression model that predicts "Request Age" (in days) from five categorical columns.

📌 Train the Model (Example: Random Forest)

python
import pandas as pd
import joblib
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error

# Load dataset
df = pd.read_csv("your_dataset.csv")

# Features (X) and target (y)
X = df[['LOB', 'Region', 'Catalog Item', 'Bus Sector', 'Bus Segment']]  # Categorical columns
y = df['Request Age']  # Target variable

# Encoding categorical variables
X = pd.get_dummies(X)

# Split dataset
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Train model
model = RandomForestRegressor(n_estimators=100, random_state=42)
model.fit(X_train, y_train)

# Evaluate model
y_pred = model.predict(X_test)
print(f"Mean Absolute Error: {mean_absolute_error(y_test, y_pred)}")

# Save model
joblib.dump(model, "request_age_model.pkl")
joblib.dump(list(X.columns), "model_columns.pkl")  # Save column names for later use
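The Mean Absolute Error printed by the training script is easier to judge next to a naive baseline that always predicts the average target. The sketch below computes both by hand; all numbers, including `train_mean`, are hypothetical stand-ins rather than results from the real dataset.

```python
from statistics import mean

def mae(y_true, y_pred):
    """Mean absolute error: the average of |actual - predicted|."""
    return mean(abs(t - p) for t, p in zip(y_true, y_pred))

# Hypothetical held-out targets and model predictions (days)
y_test = [20.0, 50.0, 35.0, 15.0]
y_pred = [25.0, 45.0, 30.0, 20.0]

# Baseline: always predict the mean Request Age of the training set.
# 30.0 is an assumed value; in the real script this would be y_train.mean().
train_mean = 30.0
baseline_pred = [train_mean] * len(y_test)

print(f"Model MAE:    {mae(y_test, y_pred)}")
print(f"Baseline MAE: {mae(y_test, baseline_pred)}")
```

If the model's MAE is not clearly below the baseline's, the features are adding little and the deployed app will not be much better than quoting the average.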

Step 3: Create a Streamlit App

Now, create a Streamlit app to load the trained model and make predictions.

📌 Create a Python Script for Streamlit

Create a new file app.py and write the following code:

python
import streamlit as st
import pandas as pd
import joblib

# Load trained model and column names
model = joblib.load("request_age_model.pkl")
model_columns = joblib.load("model_columns.pkl")

# Streamlit UI
st.title("📊 Request Age Prediction App")
st.write("Enter the details below to predict the expected Request Age.")

# Input fields (the options below are placeholders; use the category values from your dataset)
lob = st.selectbox("Select Line of Business", ["LOB1", "LOB2", "LOB3", "LOB4", "LOB5", "LOB6", "LOB7"])
region = st.selectbox("Select Region", ["Region1", "Region2", "Region3", "Region4"])
catalog_item = st.selectbox("Select Catalog Item", ["Item1", "Item2", "Item3", "Item4", "Item5", "Item6", "Item7", "Item8", "Item9", "Item10"])
bus_sector = st.text_input("Enter Business Sector")
bus_segment = st.text_input("Enter Business Segment")

# Convert user inputs to DataFrame
input_data = pd.DataFrame([[lob, region, catalog_item, bus_sector, bus_segment]], 
                          columns=['LOB', 'Region', 'Catalog Item', 'Bus Sector', 'Bus Segment'])

# Encode categorical variables like in training
input_data = pd.get_dummies(input_data)

# Align with the training columns: add missing ones as 0, drop unseen ones, fix the order
input_data = input_data.reindex(columns=model_columns, fill_value=0)

# Predict button
if st.button("Predict Request Age"):
    prediction = model.predict(input_data)[0]
    st.success(f"📌 Predicted Request Age: {round(prediction, 2)} days")
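The column-alignment step matters because `pd.get_dummies` on a single input row only creates columns for the values in that row, not for every category seen in training. The following is a minimal sketch of that step in isolation; the `model_columns` list and the helper name `align_to_training` are made up for illustration.

```python
import pandas as pd

# Hypothetical training-time columns, as saved with joblib.dump(list(X.columns), ...)
model_columns = ["LOB_LOB1", "LOB_LOB2", "Region_Region1", "Region_Region2"]

def align_to_training(row: pd.DataFrame, columns: list) -> pd.DataFrame:
    """One-hot encode a single input row and match the training column layout."""
    encoded = pd.get_dummies(row)
    # reindex adds missing columns filled with 0, drops columns the model
    # never saw, and puts everything in the training order
    return encoded.reindex(columns=columns, fill_value=0).astype(int)

# "Region9" was never seen in training, so it contributes no active column
row = pd.DataFrame([{"LOB": "LOB2", "Region": "Region9"}])
aligned = align_to_training(row, model_columns)
print(aligned.to_dict(orient="records")[0])
```

A value the model never saw (here `Region9`, or any free-text entry in the sector and segment fields) ends up as all zeros for that feature, so the prediction silently falls back to whatever the model does with no region set. Restricting inputs to known values with `st.selectbox` avoids that.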

Step 4: Run the Streamlit App

To start the app, run the following command in the terminal:

bash
streamlit run app.py

This will launch a web interface where users can input data and get predictions.

Step 5: Deploy the App Online

Once your app works locally, you can deploy it on Streamlit Community Cloud (often just called Streamlit Cloud) or on another platform such as Heroku.

📌 Deploy on Streamlit Cloud (Easy Way)

1. Push app.py, the saved .pkl files, and a requirements.txt listing your dependencies to a GitHub repository.
2. Go to Streamlit Cloud.
3. Click "New App" and connect your GitHub repo.
4. Select the app.py file and click "Deploy".
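Streamlit Community Cloud installs Python dependencies from a requirements.txt file in the repository. Because pickled scikit-learn models can warn or fail when loaded under a different library version, it is usually safer to pin the versions you trained with. The sketch below prints pinned lines for the packages from Step 1; the helper name `pinned_requirements` is an assumption for illustration.

```python
from importlib.metadata import PackageNotFoundError, version

# Packages installed in Step 1 of this guide
PACKAGES = ["streamlit", "pandas", "scikit-learn", "joblib"]

def pinned_requirements(packages):
    """Return requirements.txt lines pinned to the locally installed versions."""
    lines = []
    for name in packages:
        try:
            lines.append(f"{name}=={version(name)}")
        except PackageNotFoundError:
            # Not installed in this environment: leave it unpinned
            lines.append(name)
    return lines

print("\n".join(pinned_requirements(PACKAGES)))
```

Running the script with its output redirected to requirements.txt in the training environment gives a file you can commit alongside app.py.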

Done! 🎉 Your model is now live!

🚀 Summary of Steps

✅ Step 1: Install dependencies
✅ Step 2: Train and save the model
✅ Step 3: Create a Streamlit app (app.py)
✅ Step 4: Run the app locally using streamlit run app.py
✅ Step 5: Deploy the app on Streamlit Cloud

💡 Next Steps

Customize the UI with images, charts, and better styling.
Allow users to upload CSV files for batch predictions.
Deploy on Heroku or AWS for more flexibility.
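For the batch-prediction idea, one possible shape is a helper that takes an uploaded CSV, encodes it the same way as the single-row form, and returns the rows with a prediction column. This is a sketch under assumptions: `predict_batch` and `ConstantModel` are hypothetical names, and the stand-in model only exists so the example runs without the saved .pkl files.

```python
import io
import pandas as pd

FEATURES = ["LOB", "Region", "Catalog Item", "Bus Sector", "Bus Segment"]

def predict_batch(csv_file, model, model_columns):
    """Read raw feature rows from a CSV and append a prediction column."""
    df = pd.read_csv(csv_file)
    missing = [c for c in FEATURES if c not in df.columns]
    if missing:
        raise ValueError(f"CSV is missing required columns: {missing}")
    # Same encoding and column alignment as the single-row form
    encoded = pd.get_dummies(df[FEATURES]).reindex(columns=model_columns, fill_value=0)
    result = df.copy()
    result["Predicted Request Age"] = model.predict(encoded)
    return result

class ConstantModel:
    """Stand-in for the trained regressor (assumption, not the real model)."""
    def predict(self, X):
        return [10.0] * len(X)

sample = io.StringIO(
    "LOB,Region,Catalog Item,Bus Sector,Bus Segment\n"
    "LOB1,Region1,Item1,Retail,SMB\n"
    "LOB2,Region2,Item2,Energy,Enterprise\n"
)
out = predict_batch(sample, ConstantModel(), ["LOB_LOB1", "LOB_LOB2"])
print(out[["LOB", "Predicted Request Age"]])
```

In the app, the file-like object returned by `st.file_uploader("Upload CSV", type="csv")` could be passed in place of `sample`, with the loaded model and `model_columns` in place of the stand-ins.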

---------------------------------------------------------------------------------------------------------------

To use OneHotEncoder and TargetEncoder instead of pd.get_dummies in your Streamlit app, follow these steps:

🔹 Steps to Implement:

1. Fit the OneHotEncoder and TargetEncoder during model training
2. Save the encoders using joblib
3. Load the encoders in the Streamlit app
4. Apply the encoders to user inputs before prediction

✅ Updated Streamlit Code with OneHotEncoder & TargetEncoder

python
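# Save this file as app.py. In a Jupyter notebook, put %%writefile app.py
# on the first line of the cell instead; the magic is not valid inside a .py file.
import streamlit as st
import pandas as pd
import joblib
# category_encoders must be installed so joblib can unpickle the TargetEncoder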

# Load trained model
model = joblib.load("request_age_model.pkl")

# Load saved encoders
onehot_encoder = joblib.load("onehot_encoder.pkl")  # OneHotEncoder
target_encoder = joblib.load("target_encoder.pkl")  # TargetEncoder
model_columns = joblib.load("model_columns.pkl")  # Column order for prediction

st.title("📊 Request Age Prediction App")
st.write("Enter details to predict the expected Request Age.")

# Input fields
lob = st.selectbox("Select Line of Business", ["LOB1", "LOB2", "LOB3"])
region = st.selectbox("Select Region", ["Region1", "Region2"])
catalog_item = st.selectbox("Select Catalog Item", ["Item1", "Item2"])
bus_sector = st.text_input("Enter Business Sector")
bus_segment = st.text_input("Enter Business Segment")

# Create input dataframe
input_data = pd.DataFrame([[lob, region, catalog_item, bus_sector, bus_segment]], 
                          columns=['LOB', 'Region', 'Catalog Item', 'Bus Sector', 'Bus Segment'])

# Apply OneHotEncoding to LOB, Region, and Catalog Item
onehot_encoded = onehot_encoder.transform(input_data[['LOB', 'Region', 'Catalog Item']])

# Apply TargetEncoding to Bus Sector and Bus Segment
target_encoded = target_encoder.transform(input_data[['Bus Sector', 'Bus Segment']])

# Convert OneHotEncoded output to dataframe
onehot_encoded_df = pd.DataFrame(onehot_encoded, columns=onehot_encoder.get_feature_names_out())

# Combine both encoded data
final_encoded_df = pd.concat([onehot_encoded_df, target_encoded], axis=1)

# Ensure all model columns are present
for col in model_columns:
    if col not in final_encoded_df.columns:
        final_encoded_df[col] = 0

# Reorder columns as per training model
final_encoded_df = final_encoded_df[model_columns]

# Prediction
if st.button("Predict"):
    prediction = model.predict(final_encoded_df)[0]
    st.success(f"📌 Predicted Request Age: {round(prediction, 2)} days")

🔹 Saving the Encoders During Model Training

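Before using this Streamlit app, fit and save the OneHotEncoder and TargetEncoder during model training, and train the model on the encoded features so that request_age_model.pkl and model_columns.pkl match what the app produces. TargetEncoder comes from the category_encoders package (pip install category_encoders).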

python
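import pandas as pd
import joblib
from sklearn.preprocessing import OneHotEncoder
from sklearn.ensemble import RandomForestRegressor
from category_encoders import TargetEncoder

# Sample dataframe (replace with your actual dataset)
df = pd.DataFrame({
    'LOB': ['LOB1', 'LOB2', 'LOB3'],
    'Region': ['Region1', 'Region2', 'Region1'],
    'Catalog Item': ['Item1', 'Item2', 'Item1'],
    'Bus Sector': ['Sector1', 'Sector2', 'Sector3'],
    'Bus Segment': ['Segment1', 'Segment2', 'Segment3'],
    'Request Age': [20, 50, 35]
})

onehot_cols = ['LOB', 'Region', 'Catalog Item']
target_cols = ['Bus Sector', 'Bus Segment']

# OneHotEncoder for low-cardinality features
# (sparse_output needs scikit-learn >= 1.2; older versions use sparse=False)
onehot_encoder = OneHotEncoder(handle_unknown='ignore', sparse_output=False)
onehot_encoder.fit(df[onehot_cols])

# TargetEncoder for high-cardinality features
target_encoder = TargetEncoder()
target_encoder.fit(df[target_cols], df['Request Age'])

# Build the encoded training matrix the same way the app builds its input
onehot_df = pd.DataFrame(onehot_encoder.transform(df[onehot_cols]),
                         columns=onehot_encoder.get_feature_names_out(),
                         index=df.index)
X = pd.concat([onehot_df, target_encoder.transform(df[target_cols])], axis=1)

# Train the model on the encoded features
model = RandomForestRegressor(n_estimators=100, random_state=42)
model.fit(X, df['Request Age'])

# Save the model, the encoders, and the column order
joblib.dump(model, "request_age_model.pkl")
joblib.dump(onehot_encoder, "onehot_encoder.pkl")
joblib.dump(target_encoder, "target_encoder.pkl")
joblib.dump(list(X.columns), "model_columns.pkl")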
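To make the idea behind target encoding concrete, here is a minimal sketch in plain pandas: each category is replaced by the mean target observed for it during training, and unseen categories fall back to the overall mean. This is a simplified illustration, not the category_encoders implementation, which also smooths each category mean toward the global mean, so its numbers will differ. The function names and the data are hypothetical.

```python
import pandas as pd

def fit_mean_encoding(categories: pd.Series, target: pd.Series):
    """Learn the mean target per category plus a global fallback."""
    means = target.groupby(categories).mean().to_dict()
    return means, float(target.mean())

def apply_mean_encoding(categories: pd.Series, means: dict, fallback: float) -> pd.Series:
    """Replace each category by its learned mean; unseen values get the fallback."""
    return categories.map(means).fillna(fallback)

# Hypothetical training data: sector "A" averages 30 days, "B" averages 50
sectors = pd.Series(["A", "A", "B"])
ages = pd.Series([20.0, 40.0, 50.0])
means, global_mean = fit_mean_encoding(sectors, ages)

# "Z" never appeared in training, so it receives the global mean
encoded = apply_mean_encoding(pd.Series(["A", "B", "Z"]), means, global_mean)
print(encoded.tolist())
```

This is why the encoder has to be fitted once on the training data and saved: the app needs the learned per-category means, which it cannot recompute from a single user input.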

🔹 How to Run Streamlit App

After saving the encoders and model, run the Streamlit app using:

bash
streamlit run app.py

🔹 Summary

✔️ OneHotEncoder applied to LOB, Region, and Catalog Item
✔️ TargetEncoder applied to Bus Sector and Bus Segment
✔️ All encoders are pre-trained & loaded in Streamlit
✔️ Ensured same column structure as training for prediction
