Building My Churn Prediction App: What Worked, What Didn’t, and What I Learned
May 25, 2025 • 3 min read

1. Why I Built This #
Churn is one of those things companies hate but don’t always act on until it’s too late. I wanted to build something that could actually help with that — a tool that predicts when a customer might leave and gives some insight into why.
I’ve been learning data science and decided this would be a great project to push myself a bit. I didn’t just want to build a model in a notebook. I wanted to go all the way — data cleaning, model building, tuning, and actually deploying it as something people could interact with.
2. How I Built It #
I started with a telecom churn dataset and built three models:
- Logistic Regression
- Random Forest
- Tuned Random Forest (using `GridSearchCV`; sketched below)
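Here's a minimal sketch of how the three models might be wired up in scikit-learn. The grid values, `X_train`/`y_train`, and the `random_state` are placeholders, not my exact setup:

```python
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

# Baseline models; class_weight="balanced" offsets the minority
# churn class (more on that below).
log_reg = LogisticRegression(max_iter=1000, class_weight="balanced")
rf = RandomForestClassifier(class_weight="balanced", random_state=42)

# Tuned Random Forest: cross-validated grid search, scoring on
# recall since missed churners are the expensive mistake.
param_grid = {
    "n_estimators": [100, 300],
    "max_depth": [None, 10, 20],
    "min_samples_leaf": [1, 5],
}
tuned_rf = GridSearchCV(rf, param_grid, scoring="recall", cv=5)

for model in (log_reg, rf, tuned_rf):
    model.fit(X_train, y_train)  # X_train/y_train: preprocessed telecom data
```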
I didn’t just look at accuracy — churn is an imbalanced problem, so I focused more on:
- Recall for churners (catching them early matters)
- Precision (avoiding false alarms)
- ROC-AUC for overall separation
I used class weights to help balance things out and made sure to visualize ROC and Precision-Recall curves so I could actually understand how each model was behaving.
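In scikit-learn terms, the evaluation looked roughly like this (variable names are illustrative):

```python
import matplotlib.pyplot as plt
from sklearn.metrics import (classification_report, roc_auc_score,
                             RocCurveDisplay, PrecisionRecallDisplay)

# Per-class precision/recall plus ROC-AUC; plain accuracy hides how
# many churners a model misses.
y_pred = model.predict(X_test)
print(classification_report(y_test, y_pred, target_names=["stayed", "churned"]))
print("ROC-AUC:", roc_auc_score(y_test, model.predict_proba(X_test)[:, 1]))

# ROC and Precision-Recall curves show the trade-off at every threshold.
RocCurveDisplay.from_estimator(model, X_test, y_test)
PrecisionRecallDisplay.from_estimator(model, X_test, y_test)
plt.show()
```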
3. Trade-Offs and Decisions #
Logistic Regression had better recall — it flagged more potential churners, but with more false positives. Random Forest was more precise — fewer wrong guesses, but it missed some churners. The tuned version helped balance that out a bit.
Rather than trying to pick one “best” model, I built the app to let users choose between the three. That way, they can decide based on what matters more for their situation — catching every churner or being more cautious.
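The post doesn't name the app framework, but assuming a Streamlit front end, the model picker can be as small as this (file names are hypothetical):

```python
import joblib
import streamlit as st

# Map a human-readable trade-off description to a saved model.
MODELS = {
    "Logistic Regression (catches more churners)": "log_reg.pkl",
    "Random Forest (fewer false alarms)": "random_forest.pkl",
    "Tuned Random Forest (middle ground)": "tuned_random_forest.pkl",
}

choice = st.selectbox("Choose a model", list(MODELS))
model = joblib.load(MODELS[choice])
```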
4. What Went Wrong (and How I Fixed It) #
This wasn’t all smooth sailing. Two big challenges came up:
Model Prediction Issues #
At first, the app’s predictions didn’t match what I was getting in my notebook. After a bit of head-scratching, I realized I hadn’t included all the `.pkl` files needed for the app to work properly. Once I added the trained models, transformers, and encoders, things finally lined up.
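As a sketch of the fix, every fitted preprocessing artifact gets loaded alongside the model so the app transforms inputs exactly the way the notebook did (file and variable names are illustrative):

```python
import joblib

# Loading only the model reproduces the mismatch: the app must also
# ship the encoder and scaler that were fitted during training.
encoder = joblib.load("encoder.pkl")  # categorical encoding
scaler = joblib.load("scaler.pkl")    # numeric scaling
model = joblib.load("tuned_random_forest.pkl")

def churn_probability(raw_df):
    # Same transform order as the notebook: encode, then scale.
    X = scaler.transform(encoder.transform(raw_df))
    return model.predict_proba(X)[:, 1]
```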
SHAP Limitations #
I originally wanted to include SHAP explanations for all three models. Unfortunately, I could only get it to work reliably with Logistic Regression. For the Random Forest models, SHAP just wasn’t playing nice in the app.
So I pivoted. Instead of live explanations, I added static feature importance images that show the most influential features for those models. Not ideal, but still helpful.
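For the one model where SHAP did cooperate, the explanation code was roughly this shape; `shap.Explainer` dispatches to a linear explainer for a logistic regression, and the inputs here are placeholders:

```python
import shap

# Background data lets SHAP estimate expected feature contributions.
explainer = shap.Explainer(log_reg, X_train)
shap_values = explainer(X_input)  # X_input: the customer(s) to explain

# Bar chart of the top feature contributions for a single prediction.
shap.plots.bar(shap_values[0])
```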
5. What I Took Away from This #
Here’s what I learned:
- Just because something works in a notebook doesn’t mean it’ll work in an app — especially if you forget to package everything.
- SHAP is powerful, but also picky. Sometimes you have to adjust your plan.
- Explaining why a model made a decision helps a lot — even if it’s just a bar chart.
More than anything, I learned how to build something real from start to finish — and deploy it for others to try.
6. Give the App a Try #
If you’re curious to see how it works:
If you have questions or just want to connect, feel free to message me on LinkedIn.