Overview
This project focuses on predicting delivery delays in e-commerce orders using machine learning. By analyzing historical order data, we can identify patterns and factors that contribute to delayed deliveries, enabling proactive interventions and improved customer satisfaction.
Shipping Cost Impact: 37.6%
Problem Statement
E-commerce platforms face significant challenges with delivery delays, which directly impact customer satisfaction and retention. The goal of this project was to build a predictive model that flags orders at risk of delay before the delay occurs, so that proactive measures can be taken.
Key Questions:
- What factors contribute most to delivery delays?
- Can we predict delays with high accuracy?
- How can we use these insights to improve operations?
Data & Methodology
The dataset contains 100,756 Brazilian e-commerce orders with features including order details, payment information, customer location, and product characteristics. After cleaning, 96,478 delivered orders were used for analysis.
Feature Engineering:
- Order value and payment installments
- Shipping cost and delivery distance
- Seller history and ratings
- Time-based features (season, day of week)
- Geographic indicators
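A minimal sketch of how a few of the features above could be derived for a single order. The field names (`price`, `freight_value`, `purchase_timestamp`, and so on) and the use of haversine distance between seller and customer coordinates are assumptions for illustration, not the project's actual schema. Seller history and ratings are left out because the source does not describe how they were computed.

```python
import math
from datetime import datetime


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points."""
    r = 6371.0  # mean Earth radius in km
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * r * math.asin(math.sqrt(a))


def southern_season(month):
    """Map a month number to a Southern Hemisphere season (Brazil)."""
    if month in (12, 1, 2):
        return "summer"
    if month in (3, 4, 5):
        return "autumn"
    if month in (6, 7, 8):
        return "winter"
    return "spring"


def build_features(order):
    """Turn one raw order record (hypothetical field names) into model features."""
    purchased = datetime.fromisoformat(order["purchase_timestamp"])
    return {
        "order_value": order["price"],
        "installments": order["payment_installments"],
        "shipping_cost": order["freight_value"],
        "distance_km": haversine_km(
            order["seller_lat"], order["seller_lng"],
            order["customer_lat"], order["customer_lng"],
        ),
        "day_of_week": purchased.weekday(),  # Monday = 0
        "season": southern_season(purchased.month),
        # Simple geographic indicator: does the order stay within one state?
        "same_state": int(order["seller_state"] == order["customer_state"]),
    }


# Made-up example order shipped from São Paulo to Rio de Janeiro.
example_order = {
    "price": 120.0,
    "payment_installments": 3,
    "freight_value": 18.5,
    "purchase_timestamp": "2018-07-02 10:15:00",
    "seller_lat": -23.55, "seller_lng": -46.63, "seller_state": "SP",
    "customer_lat": -22.91, "customer_lng": -43.17, "customer_state": "RJ",
}
features = build_features(example_order)
```

Categorical outputs such as `season` would still need encoding before being passed to a model.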
Models Tested:
- XGBoost (best performance: 92.7% accuracy)
- Random Forest (91.4%)
- Logistic Regression (Baseline)
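One way the candidate models might be compared on a shared held-out set is sketched below. The source reports accuracy. Precision and recall for the delayed class are included here only as an illustration, and the labels and predictions are made up rather than taken from the project.

```python
def evaluate(y_true, y_pred):
    """Accuracy, plus precision and recall for the positive (delayed = 1) class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        # Guard against division by zero when a model never predicts a delay.
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


# Hypothetical held-out labels (1 = delayed) and predictions from two models.
y_true = [1, 0, 0, 1, 0, 0, 0, 1, 0, 0]
predictions = {
    "always_on_time": [0] * 10,
    "candidate_model": [1, 0, 0, 1, 0, 1, 0, 0, 0, 0],
}

scores = {name: evaluate(y_true, y_pred) for name, y_pred in predictions.items()}
for name, metrics in scores.items():
    print(name, metrics)
```

Scoring every model with the same function on the same split keeps the comparison between XGBoost, Random Forest, and the logistic regression baseline like for like.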
Key Insights
Top Predictors of Delivery Delay:
- Shipping cost (37.6%) - Higher shipping costs correlate with faster delivery
- Order value (34.0%) - More expensive orders tend to arrive sooner
- Payment installments (28.3%) - More installments often indicate higher-value purchases
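The three percentages above sum to roughly 100%, which suggests they are shares of importance among the top three features, although the source does not say how they were computed. Under that assumption, the sketch below shows how raw importance scores could be rescaled into such shares. The raw scores are invented for illustration.

```python
def importance_shares(raw_scores, top_n=3):
    """Rescale the top_n raw importance scores so they sum to about 100%."""
    top = sorted(raw_scores.items(), key=lambda kv: kv[1], reverse=True)[:top_n]
    total = sum(score for _, score in top)
    if total == 0:
        raise ValueError("all selected importance scores are zero")
    return {name: round(100 * score / total, 1) for name, score in top}


# Made-up raw scores, e.g. as a tree-based model might report them.
raw = {
    "shipping_cost": 30,
    "order_value": 20,
    "payment_installments": 10,
    "day_of_week": 5,
}
shares = importance_shares(raw)
```

Note that importance scores of this kind rank features by how much the model relies on them. The direction of each effect (for example, higher shipping cost going with faster delivery) has to be read from the data separately.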
Geographic Patterns:
- Orders in the most populous states (SP, RJ, MG) show faster delivery times
- Remote areas experience more delays
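A geographic pattern like this can be checked by computing the share of delayed orders per customer state. The following is a small sketch with invented records. The `(state, delayed)` tuple layout is an assumption, not the project's data format.

```python
from collections import defaultdict


def delay_rate_by_state(orders):
    """Return {state: fraction of orders delayed} from (state, delayed) pairs."""
    counts = defaultdict(lambda: [0, 0])  # state -> [delayed, total]
    for state, delayed in orders:
        counts[state][0] += int(delayed)
        counts[state][1] += 1
    return {state: late / total for state, (late, total) in counts.items()}


# Made-up records: (customer state, was the order delayed?)
sample_orders = [
    ("SP", False), ("SP", False), ("SP", True), ("SP", False),
    ("AM", True), ("AM", False),
]
rates = delay_rate_by_state(sample_orders)

# List states from highest to lowest delay rate.
for state, rate in sorted(rates.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{state}: {rate:.0%}")
```

With real data, states with very few orders would produce noisy rates, so a minimum order count per state may be worth applying before ranking.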
Results & Impact
The final XGBoost model achieved 92.7% accuracy in predicting delivery delays, with the following business implications:
- Proactive Alerts: The system can flag at-risk orders 48 hours in advance
- Resource Allocation: Optimize logistics for high-risk regions
- Customer Communication: Set realistic expectations for delivery times
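The proactive alert step could look roughly like the sketch below. It assumes that "48 hours in advance" means at least 48 hours before the promised delivery date, that the model outputs a delay probability per order, and that 0.5 is a reasonable cut-off. None of these details are stated in the source, and the field names are hypothetical.

```python
from datetime import datetime, timedelta

ALERT_LEAD = timedelta(hours=48)  # assumed meaning of "48 hours in advance"
RISK_THRESHOLD = 0.5              # hypothetical probability cut-off


def flag_at_risk(orders, now, threshold=RISK_THRESHOLD, lead=ALERT_LEAD):
    """Return IDs of orders that are likely to be late and can still be acted on."""
    flagged = []
    for order in orders:
        time_left = order["estimated_delivery"] - now
        if order["delay_probability"] >= threshold and time_left >= lead:
            flagged.append(order["order_id"])
    return flagged


# Made-up open orders with model-predicted delay probabilities.
now = datetime(2018, 7, 2, 12, 0)
open_orders = [
    {"order_id": "A", "delay_probability": 0.8,
     "estimated_delivery": datetime(2018, 7, 6)},
    {"order_id": "B", "delay_probability": 0.2,
     "estimated_delivery": datetime(2018, 7, 6)},
    {"order_id": "C", "delay_probability": 0.9,
     "estimated_delivery": datetime(2018, 7, 3)},  # under 48 hours left
]
alerts = flag_at_risk(open_orders, now)
```

The threshold trades missed delays against false alarms, so in practice it would be tuned to how costly each kind of error is for the logistics team.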
Conclusion
This project demonstrates how machine learning can optimize e-commerce logistics by predicting delivery delays, enabling better resource allocation and improved customer satisfaction.