---
title: "DataCo Global"
subtitle: "Supply Chain Analytics Operations Intelligence and Risk Report"
author: "Freda Erinmwingbovo, Data Analyst"
date: today
format:
html:
theme: [flatly, dataco.scss]
toc: true
toc-depth: 3
toc-title: "Report Contents"
toc-location: left
number-sections: false
code-fold: true
code-tools: true
code-summary: "Show Code"
page-layout: full
smooth-scroll: true
self-contained: true
execute:
warning: false
message: false
cache: true
jupyter: python3
jupyter-options:
python: "C:/Users/engrf/AppData/Local/Programs/Python/Python313/python.exe"
---
# DataCo Global: Supply Chain Analytics Report
**Prepared by:** Freda Erinmwingbovo, Data Analyst
**Prepared for:** Marcus Adeyemi, Chief Executive Officer DataCo Global
**Date:** February 2026
**Dataset:** 180,519 orders · 53 attributes · Global supply chain operations
---
## Executive Summary
DataCo Global processed **180,519 orders** across multiple global markets.
This report investigates seven operational questions covering sales
performance, delivery risk, product profitability, discount abuse,
customer value, shipping efficiency, and market loss exposure.
```{=html}
<div class="kpi-grid">
<div class="kpi-card">
<div class="kpi-value">180,519</div>
<div class="kpi-label">Total Orders</div>
</div>
<div class="kpi-card">
<div class="kpi-value">$36.8M</div>
<div class="kpi-label">Total Sales</div>
</div>
<div class="kpi-card">
<div class="kpi-value">$3.97M</div>
<div class="kpi-label">Total Profit</div>
</div>
<div class="kpi-card">
<div class="kpi-value alert">57.3%</div>
<div class="kpi-label">Late Delivery Rate</div>
</div>
<div class="kpi-card">
<div class="kpi-value amber">10.2%</div>
<div class="kpi-label">Avg Discount Rate</div>
</div>
<div class="kpi-card">
<div class="kpi-value">5</div>
<div class="kpi-label">Markets Served</div>
</div>
</div>
```
| # | Business Question | Priority |
|---|------------------|----------|
| 1 | Sales and profit by market and region | Baseline |
| 2 | Shipping modes and late delivery risk | Critical |
| 3 | Product category and department profitability | High |
| 4 | Discount impact on profitability | High |
| 5 | Highest-value customers and segments | High |
| 6 | Scheduled vs actual shipping and profitability | Medium |
| 7 | Market loss exposure and risk | Critical |
---
## Data and Methodology
### Data Loading and Preparation
```{python}
import pandas as pd
import numpy as np
import matplotlib
import matplotlib.pyplot as plt
import matplotlib.ticker as mticker
import warnings
warnings.filterwarnings('ignore')
try:
df = pd.read_csv('data_co_supply_chain.csv', encoding='utf-8')
except:
df = pd.read_csv('data_co_supply_chain.csv', encoding='latin-1')
# Drop sensitive and empty columns
drop_cols = [
'Customer Email', 'Customer Password', 'Product Description',
'Product Image', 'Customer Street', 'Customer Zipcode',
'Order Zipcode', 'Latitude', 'Longitude'
]
df = df.drop(columns=[c for c in drop_cols if c in df.columns])
# Parse dates
df['order date (DateOrders)'] = pd.to_datetime(
df['order date (DateOrders)'], errors='coerce'
)
df['shipping date (DateOrders)'] = pd.to_datetime(
df['shipping date (DateOrders)'], errors='coerce'
)
# Derived columns
df['Order Year'] = df['order date (DateOrders)'].dt.year
df['Order Month'] = df['order date (DateOrders)'].dt.month
df['Order Month Name'] = df['order date (DateOrders)'].dt.strftime('%b')
df['Delivery Delay'] = (
df['Days for shipping (real)'] -
df['Days for shipment (scheduled)']
)
df['Is Late'] = (df['Delivery Delay'] > 0).astype(int)
print(f"Dataset: {df.shape[0]:,} rows × {df.shape[1]} columns")
print(f"Date range: {df['order date (DateOrders)'].min().date()} "
f"to {df['order date (DateOrders)'].max().date()}")
print(f"Markets: {df['Market'].nunique()}")
print(f"Regions: {df['Order Region'].nunique()}")
print(f"Products: {df['Product Name'].nunique()}")
print(f"Customers: {df['Customer Id'].nunique():,}")
```
### Portfolio Overview
```{python}
total_sales = df['Sales'].sum()
total_profit = df['Order Profit Per Order'].sum()
profit_margin = total_profit / total_sales * 100
late_rate = df['Is Late'].mean() * 100
avg_discount = df['Order Item Discount Rate'].mean() * 100
print("DATACO GLOBAL OPERATIONS OVERVIEW")
print("=" * 45)
print(f"Total orders: {df['Order Id'].nunique():,}")
print(f"Total sales: ${total_sales:,.2f}")
print(f"Total profit: ${total_profit:,.2f}")
print(f"Profit margin: {profit_margin:.1f}%")
print(f"Late delivery rate: {late_rate:.1f}%")
print(f"Avg discount rate: {avg_discount:.1f}%")
print(f"Markets: {df['Market'].nunique()}")
print(f"Shipping modes: {df['Shipping Mode'].nunique()}")
print(f"Order statuses: {df['Order Status'].nunique()}")
```
### Colour Palette
```{python}
NAVY = '#1a3a5c'
TEAL = '#1a6b5a'
AMBER = '#d4722a'
RED = '#c0392b'
GOLD = '#c9a84c'
MUTED = '#6a6a7a'
BG = '#f8f7f4'
WHITE = '#ffffff'
BORDER = '#e0dbd0'
```
---
# Analysis
::: {.panel-tabset}
## I. Sales & Profit by Market
### Business Question
How are DataCo Global's sales, profit, and order volume
distributed across global markets and regions?
### Sales by Market
```{python}
mkt = df.groupby('Market').agg(
Orders = ('Order Id', 'nunique'),
Sales = ('Sales', 'sum'),
Profit = ('Order Profit Per Order', 'sum'),
AvgOrder = ('Sales', 'mean')
).reset_index()
mkt['ProfitMargin'] = (mkt['Profit'] / mkt['Sales'] * 100).round(2)
mkt = mkt.sort_values('Sales', ascending=False)
display(mkt)
```
### Sales by Region (Top 15)
```{python}
reg = df.groupby('Order Region').agg(
Orders = ('Order Id', 'nunique'),
Sales = ('Sales', 'sum'),
Profit = ('Order Profit Per Order', 'sum')
).reset_index()
reg['ProfitMargin'] = (reg['Profit'] / reg['Sales'] * 100).round(2)
reg = reg.sort_values('Sales', ascending=False).head(15)
display(reg)
```
### Monthly Sales Trend
```{python}
monthly = df.groupby(['Order Year', 'Order Month',
'Order Month Name']).agg(
Sales = ('Sales', 'sum'),
Profit = ('Order Profit Per Order', 'sum'),
Orders = ('Order Id', 'nunique')
).reset_index().sort_values(['Order Year', 'Order Month'])
display(monthly)
```
### Market and Sales Visualization
```{python}
fig, axes = plt.subplots(1, 2, figsize=(16, 6))
fig.patch.set_facecolor(BG)
fig.suptitle('Figure I: Sales and Profit by Market',
color=NAVY, fontsize=13,
fontfamily='serif', fontweight='bold', y=1.02)
# Plot 1: Sales by market
ax1 = axes[0]
ax1.set_facecolor(WHITE)
m_colors = [TEAL if i == 0 else '#3a5a7c'
for i in range(len(mkt))]
bars = ax1.barh(range(len(mkt)), mkt['Sales'],
color=m_colors, edgecolor='none', height=0.65)
ax1.set_yticks(range(len(mkt)))
ax1.set_yticklabels(mkt['Market'], color=MUTED, fontsize=9)
ax1.set_xlabel('Total Sales (USD)', color=MUTED)
ax1.set_title('Total Sales by Market',
color=NAVY, fontweight='bold',
pad=12, fontfamily='serif')
ax1.tick_params(colors=MUTED)
for spine in ax1.spines.values():
spine.set_color(BORDER)
ax1.grid(color=BORDER, linestyle='--',
linewidth=0.5, axis='x')
ax1.invert_yaxis()
ax1.xaxis.set_major_formatter(
mticker.FuncFormatter(lambda x, _: f'${x/1e6:.1f}M')
)
# Plot 2: Profit margin by market
ax2 = axes[1]
ax2.set_facecolor(WHITE)
pm_colors = [RED if p < 0 else TEAL
for p in mkt['ProfitMargin']]
bars2 = ax2.barh(range(len(mkt)), mkt['ProfitMargin'],
color=pm_colors, edgecolor='none', height=0.65)
ax2.set_yticks(range(len(mkt)))
ax2.set_yticklabels(mkt['Market'], color=MUTED, fontsize=9)
ax2.set_xlabel('Profit Margin (%)', color=MUTED)
ax2.set_title('Profit Margin by Market',
color=NAVY, fontweight='bold',
pad=12, fontfamily='serif')
ax2.tick_params(colors=MUTED)
for spine in ax2.spines.values():
spine.set_color(BORDER)
ax2.grid(color=BORDER, linestyle='--',
linewidth=0.5, axis='x')
ax2.invert_yaxis()
ax2.axvline(x=0, color=MUTED, linewidth=0.8,
linestyle='-')
plt.tight_layout()
plt.show()
```
### Market Findings and Recommendations
**Recommendations:**
- Prioritise investment and marketing spend in the
highest-revenue markets to defend market position
- Investigate markets with below-average profit margins
for pricing, discount, or operational inefficiencies
- Set quarterly revenue targets by market with
escalation triggers if performance falls below plan
---
## II. Delivery Performance
### Business Question
Which shipping modes and markets carry the highest
late delivery risk? What is the financial impact
of late deliveries?
### Delivery Performance by Shipping Mode
```{python}
ship = df.groupby('Shipping Mode').agg(
Orders = ('Order Id', 'nunique'),
AvgReal = ('Days for shipping (real)', 'mean'),
AvgSched = ('Days for shipment (scheduled)', 'mean'),
AvgDelay = ('Delivery Delay', 'mean'),
LateCount = ('Is Late', 'sum'),
LateRate = ('Is Late', 'mean'),
Sales = ('Sales', 'sum'),
Profit = ('Order Profit Per Order', 'sum')
).reset_index()
ship['LateRate'] = (ship['LateRate'] * 100).round(2)
ship = ship.sort_values('LateRate', ascending=False)
display(ship)
```
### Late Delivery by Market
```{python}
late_mkt = df.groupby('Market').agg(
Orders = ('Order Id', 'nunique'),
Late = ('Is Late', 'sum'),
LateRate = ('Is Late', 'mean')
).reset_index()
late_mkt['LateRate'] = (late_mkt['LateRate'] * 100).round(2)
late_mkt = late_mkt.sort_values('LateRate', ascending=False)
display(late_mkt)
```
### Delivery Status Distribution
```{python}
del_stat = df.groupby('Delivery Status').agg(
Orders = ('Order Id', 'nunique'),
Sales = ('Sales', 'sum'),
Profit = ('Order Profit Per Order', 'sum')
).reset_index().sort_values('Orders', ascending=False)
del_stat['ProfitMargin'] = (
del_stat['Profit'] / del_stat['Sales'] * 100
).round(2)
display(del_stat)
```
### Delivery Performance Visualization
```{python}
fig, axes = plt.subplots(1, 2, figsize=(16, 6))
fig.patch.set_facecolor(BG)
fig.suptitle('Figure II: Delivery Performance Analysis',
color=NAVY, fontsize=13,
fontfamily='serif', fontweight='bold', y=1.02)
# Plot 1: Late rate by shipping mode
ax1 = axes[0]
ax1.set_facecolor(WHITE)
s_colors = [RED if r == ship['LateRate'].max()
else TEAL for r in ship['LateRate']]
bars = ax1.bar(ship['Shipping Mode'], ship['LateRate'],
color=s_colors, edgecolor='none', width=0.55)
ax1.set_ylabel('Late Delivery Rate (%)', color=MUTED)
ax1.set_title('Late Rate by Shipping Mode',
color=NAVY, fontweight='bold',
pad=12, fontfamily='serif')
ax1.tick_params(colors=MUTED, labelsize=8)
for spine in ax1.spines.values():
spine.set_color(BORDER)
ax1.grid(color=BORDER, linestyle='--',
linewidth=0.5, axis='y')
ax1.set_xticklabels(ship['Shipping Mode'],
rotation=15, ha='right')
for bar, val in zip(bars, ship['LateRate']):
ax1.text(bar.get_x() + bar.get_width()/2,
bar.get_height() + 0.3,
f'{val:.1f}%',
ha='center', color=NAVY,
fontsize=9, fontweight='bold')
# Plot 2: Delivery status distribution
ax2 = axes[1]
ax2.set_facecolor(WHITE)
d_colors = [RED if 'Late' in s else
TEAL if 'Advance' in s or 'time' in s
else AMBER
for s in del_stat['Delivery Status']]
bars2 = ax2.bar(range(len(del_stat)),
del_stat['Orders'],
color=d_colors, edgecolor='none',
width=0.6)
ax2.set_xticks(range(len(del_stat)))
ax2.set_xticklabels(del_stat['Delivery Status'],
rotation=20, ha='right',
color=MUTED, fontsize=8)
ax2.set_ylabel('Number of Orders', color=MUTED)
ax2.set_title('Orders by Delivery Status',
color=NAVY, fontweight='bold',
pad=12, fontfamily='serif')
ax2.tick_params(colors=MUTED)
for spine in ax2.spines.values():
spine.set_color(BORDER)
ax2.grid(color=BORDER, linestyle='--',
linewidth=0.5, axis='y')
ax2.yaxis.set_major_formatter(
mticker.FuncFormatter(lambda x, _: f'{x/1000:.0f}K')
)
plt.tight_layout()
plt.show()
```
### Delivery Performance Findings and Recommendations
**Recommendations:**
- Audit the highest-risk shipping mode immediately,
identify whether the late delivery rate reflects
carrier performance or scheduling policy failures
- Set a target late delivery rate of under 20%
across all shipping modes within two quarters
- Implement real-time order tracking and customer
notification for all late-flagged shipments
---
## III. Product and Category Analysis
### Business Question
Which product categories and departments generate
the most revenue and profit? Are any categories
operating at a loss?
### Profitability by Department
```{python}
dept = df.groupby('Department Name').agg(
Orders = ('Order Id', 'nunique'),
Sales = ('Sales', 'sum'),
Profit = ('Order Profit Per Order', 'sum'),
AvgDiscount = ('Order Item Discount Rate', 'mean'),
AvgProfitRatio= ('Order Item Profit Ratio', 'mean')
).reset_index()
dept['ProfitMargin'] = (
dept['Profit'] / dept['Sales'] * 100
).round(2)
dept['AvgDiscount'] = (dept['AvgDiscount'] * 100).round(2)
dept['AvgProfitRatio'] = (dept['AvgProfitRatio'] * 100).round(2)
dept = dept.sort_values('Sales', ascending=False)
display(dept)
```
### Top 15 Categories by Revenue
```{python}
cat = df.groupby(['Department Name', 'Category Name']).agg(
Orders = ('Order Id', 'nunique'),
Sales = ('Sales', 'sum'),
Profit = ('Order Profit Per Order', 'sum')
).reset_index()
cat['ProfitMargin'] = (
cat['Profit'] / cat['Sales'] * 100
).round(2)
cat = cat.sort_values('Sales', ascending=False).head(15)
display(cat)
```
### Product Analysis Visualization
```{python}
fig, axes = plt.subplots(1, 2, figsize=(16, 7))
fig.patch.set_facecolor(BG)
fig.suptitle('Figure III: Product and Category Profitability',
color=NAVY, fontsize=13,
fontfamily='serif', fontweight='bold', y=1.02)
# Plot 1: Sales by department
ax1 = axes[0]
ax1.set_facecolor(WHITE)
d_colors = [TEAL if p > 0 else RED
for p in dept['Profit']]
bars = ax1.barh(range(len(dept)), dept['Sales'],
color=d_colors, edgecolor='none',
height=0.65)
ax1.set_yticks(range(len(dept)))
ax1.set_yticklabels(dept['Department Name'],
color=MUTED, fontsize=9)
ax1.set_xlabel('Total Sales (USD)', color=MUTED)
ax1.set_title('Sales by Department',
color=NAVY, fontweight='bold',
pad=12, fontfamily='serif')
ax1.tick_params(colors=MUTED)
for spine in ax1.spines.values():
spine.set_color(BORDER)
ax1.grid(color=BORDER, linestyle='--',
linewidth=0.5, axis='x')
ax1.invert_yaxis()
ax1.xaxis.set_major_formatter(
mticker.FuncFormatter(lambda x, _: f'${x/1e6:.1f}M')
)
# Plot 2: Profit margin by department
ax2 = axes[1]
ax2.set_facecolor(WHITE)
pm_colors = [RED if p < 0 else TEAL
for p in dept['ProfitMargin']]
bars2 = ax2.barh(range(len(dept)),
dept['ProfitMargin'],
color=pm_colors, edgecolor='none',
height=0.65)
ax2.set_yticks(range(len(dept)))
ax2.set_yticklabels(dept['Department Name'],
color=MUTED, fontsize=9)
ax2.set_xlabel('Profit Margin (%)', color=MUTED)
ax2.set_title('Profit Margin by Department',
color=NAVY, fontweight='bold',
pad=12, fontfamily='serif')
ax2.tick_params(colors=MUTED)
for spine in ax2.spines.values():
spine.set_color(BORDER)
ax2.grid(color=BORDER, linestyle='--',
linewidth=0.5, axis='x')
ax2.invert_yaxis()
ax2.axvline(x=0, color=MUTED, linewidth=0.8)
plt.tight_layout()
plt.show()
```
### Product Findings and Recommendations
**Recommendations:**
- Immediately review any department or category
operating at a negative profit margin, identify
whether the loss is driven by discounting,
pricing, or operational costs
- Consider discontinuing or repricing the
lowest-margin categories
- Redirect marketing investment toward the
highest-margin categories to improve overall
portfolio profitability
---
## IV. Discount Impact Analysis
### Business Question
How do discount rates affect profitability across
customer segments? Is discount abuse a material
risk to DataCo Global's bottom line?
### Discount and Profitability by Segment
```{python}
disc = df.groupby('Customer Segment').agg(
Orders = ('Order Id', 'nunique'),
Sales = ('Sales', 'sum'),
Profit = ('Order Profit Per Order', 'sum'),
TotalDiscount = ('Order Item Discount', 'sum'),
AvgDiscount = ('Order Item Discount Rate', 'mean'),
AvgProfitRatio= ('Order Item Profit Ratio', 'mean')
).reset_index()
disc['ProfitMargin'] = (disc['Profit'] / disc['Sales'] * 100).round(2)
disc['AvgDiscount'] = (disc['AvgDiscount'] * 100).round(2)
disc['AvgProfitRatio'] = (disc['AvgProfitRatio'] * 100).round(2)
display(disc)
```
### Discount Rate Bands vs Profit
```{python}
df['DiscountBand'] = pd.cut(
df['Order Item Discount Rate'],
bins=[-0.01, 0, 0.05, 0.10, 0.15, 0.20, 1.0],
labels=['No Discount', '0-5%', '5-10%',
'10-15%', '15-20%', 'Over 20%']
)
disc_band = df.groupby('DiscountBand', observed=True).agg(
Orders = ('Order Id', 'count'),
AvgProfit = ('Order Profit Per Order', 'mean'),
TotalProfit = ('Order Profit Per Order', 'sum'),
AvgProfitRatio= ('Order Item Profit Ratio', 'mean')
).reset_index()
disc_band['AvgProfitRatio'] = (
disc_band['AvgProfitRatio'] * 100
).round(2)
display(disc_band)
```
### Discount Impact Visualization
```{python}
fig, axes = plt.subplots(1, 2, figsize=(16, 6))
fig.patch.set_facecolor(BG)
fig.suptitle('Figure IV: Discount Impact on Profitability',
color=NAVY, fontsize=13,
fontfamily='serif', fontweight='bold', y=1.02)
# Plot 1: Avg profit by discount band
ax1 = axes[0]
ax1.set_facecolor(WHITE)
db_colors = [RED if p < 0 else TEAL
for p in disc_band['AvgProfit']]
bars = ax1.bar(range(len(disc_band)),
disc_band['AvgProfit'],
color=db_colors, edgecolor='none',
width=0.65)
ax1.set_xticks(range(len(disc_band)))
ax1.set_xticklabels(disc_band['DiscountBand'],
rotation=20, ha='right',
color=MUTED, fontsize=8)
ax1.set_ylabel('Average Profit per Order ($)', color=MUTED)
ax1.set_title('Avg Profit by Discount Band',
color=NAVY, fontweight='bold',
pad=12, fontfamily='serif')
ax1.tick_params(colors=MUTED)
for spine in ax1.spines.values():
spine.set_color(BORDER)
ax1.grid(color=BORDER, linestyle='--',
linewidth=0.5, axis='y')
ax1.axhline(y=0, color=MUTED, linewidth=0.8)
# Plot 2: Profit margin by customer segment
ax2 = axes[1]
ax2.set_facecolor(WHITE)
seg_colors = [TEAL, '#3a5a7c', AMBER]
bars2 = ax2.bar(disc['Customer Segment'],
disc['ProfitMargin'],
color=seg_colors[:len(disc)],
edgecolor='none', width=0.5)
ax2.set_ylabel('Profit Margin (%)', color=MUTED)
ax2.set_title('Profit Margin by Customer Segment',
color=NAVY, fontweight='bold',
pad=12, fontfamily='serif')
ax2.tick_params(colors=MUTED)
for spine in ax2.spines.values():
spine.set_color(BORDER)
ax2.grid(color=BORDER, linestyle='--',
linewidth=0.5, axis='y')
for bar, val in zip(bars2, disc['ProfitMargin']):
ax2.text(bar.get_x() + bar.get_width()/2,
bar.get_height() + 0.1,
f'{val:.1f}%',
ha='center', color=NAVY,
fontsize=10, fontweight='bold')
plt.tight_layout()
plt.show()
```
### Discount Findings and Recommendations
**Recommendations:**
- Implement a discount approval threshold,
no order should receive a discount above 15%
without manager sign-off
- Audit orders in the highest discount bands
for patterns of discount abuse by customer or
sales representative
- Review discount policy by segment, ensure
discounts are targeted at acquisition and
retention, not habitual price reductions
---
## V. Customer Intelligence
### Business Question
Who are DataCo Global's highest-value customers?
How does customer value vary by segment and geography?
### Segment Performance
```{python}
seg = df.groupby('Customer Segment').agg(
Customers = ('Customer Id', 'nunique'),
Orders = ('Order Id', 'nunique'),
Sales = ('Sales', 'sum'),
Profit = ('Order Profit Per Order', 'sum'),
AvgOrderVal = ('Sales', 'mean')
).reset_index()
seg['OrdersPerCustomer'] = (
seg['Orders'] / seg['Customers']
).round(2)
seg['ProfitMargin'] = (
seg['Profit'] / seg['Sales'] * 100
).round(2)
display(seg)
```
### Top 20 Customers by Revenue
```{python}
top_cust = df.groupby([
'Customer Id', 'Customer Fname',
'Customer Segment', 'Customer Country'
]).agg(
Orders = ('Order Id', 'nunique'),
Sales = ('Sales', 'sum'),
Profit = ('Order Profit Per Order', 'sum'),
AvgOrderVal = ('Sales', 'mean')
).reset_index().sort_values('Sales', ascending=False).head(20)
display(top_cust)
```
### Customer Intelligence Visualization
```{python}
fig, axes = plt.subplots(1, 2, figsize=(16, 6))
fig.patch.set_facecolor(BG)
fig.suptitle('Figure V: Customer Intelligence',
color=NAVY, fontsize=13,
fontfamily='serif', fontweight='bold', y=1.02)
# Plot 1: Sales by segment
ax1 = axes[0]
ax1.set_facecolor(WHITE)
seg_colors = [TEAL, '#3a5a7c', AMBER]
bars = ax1.bar(seg['Customer Segment'], seg['Sales'],
color=seg_colors[:len(seg)],
edgecolor='none', width=0.5)
ax1.set_ylabel('Total Sales (USD)', color=MUTED)
ax1.set_title('Total Sales by Customer Segment',
color=NAVY, fontweight='bold',
pad=12, fontfamily='serif')
ax1.tick_params(colors=MUTED)
for spine in ax1.spines.values():
spine.set_color(BORDER)
ax1.grid(color=BORDER, linestyle='--',
linewidth=0.5, axis='y')
ax1.yaxis.set_major_formatter(
mticker.FuncFormatter(lambda x, _: f'${x/1e6:.1f}M')
)
# Plot 2: Top 10 customers by sales
ax2 = axes[1]
ax2.set_facecolor(WHITE)
top10 = top_cust.head(10)
bar_colors = [TEAL if i == 0 else '#3a5a7c'
for i in range(len(top10))]
bars2 = ax2.barh(
range(len(top10)), top10['Sales'],
color=bar_colors, edgecolor='none', height=0.65
)
ax2.set_yticks(range(len(top10)))
ax2.set_yticklabels(
top10['Customer Fname'], color=MUTED, fontsize=9
)
ax2.set_xlabel('Total Sales (USD)', color=MUTED)
ax2.set_title('Top 10 Customers by Revenue',
color=NAVY, fontweight='bold',
pad=12, fontfamily='serif')
ax2.tick_params(colors=MUTED)
for spine in ax2.spines.values():
spine.set_color(BORDER)
ax2.grid(color=BORDER, linestyle='--',
linewidth=0.5, axis='x')
ax2.invert_yaxis()
plt.tight_layout()
plt.show()
```
### Customer Intelligence Findings and Recommendations
**Recommendations:**
- Assign dedicated account management to the
top 20 customers, they represent a
disproportionate share of portfolio revenue
- Design a customer loyalty programme for the
Consumer segment, the largest by volume
but potentially the least sticky
- Review geographic concentration of top customers
and develop targeted retention campaigns
for high-value regions
---
## VI. Shipping Days and Profitability
### Business Question
Is there a measurable relationship between scheduled
versus actual shipping days and order profitability?
Do delays translate directly into profit losses?
### Delay Band Analysis
```{python}
df['DelayBand'] = pd.cut(
df['Delivery Delay'],
bins=[-99, -3, -1, 0, 2, 5, 99],
labels=['Early >3d', 'Early 1-3d', 'On Time',
'Late 1-2d', 'Late 3-5d', 'Late >5d']
)
delay = df.groupby('DelayBand', observed=True).agg(
Orders = ('Order Id', 'count'),
AvgProfit = ('Order Profit Per Order', 'mean'),
TotalProfit= ('Order Profit Per Order', 'sum'),
LateRate = ('Is Late', 'mean')
).reset_index()
delay['LateRate'] = (delay['LateRate'] * 100).round(2)
display(delay)
```
### Scheduled vs Actual Days Summary
```{python}
sched = df.groupby('Days for shipment (scheduled)').agg(
Orders = ('Order Id', 'count'),
AvgActual = ('Days for shipping (real)', 'mean'),
AvgDelay = ('Delivery Delay', 'mean'),
AvgProfit = ('Order Profit Per Order', 'mean'),
LateRate = ('Is Late', 'mean')
).reset_index()
sched['LateRate'] = (sched['LateRate'] * 100).round(2)
display(sched)
```
### Shipping vs Profit Visualization
```{python}
fig, axes = plt.subplots(1, 2, figsize=(16, 6))
fig.patch.set_facecolor(BG)
fig.suptitle('Figure VI: Shipping Days and Profitability',
color=NAVY, fontsize=13,
fontfamily='serif', fontweight='bold', y=1.02)
# Plot 1: Avg profit by delay band
ax1 = axes[0]
ax1.set_facecolor(WHITE)
db_colors = [RED if p < 0 else TEAL
for p in delay['AvgProfit']]
bars = ax1.bar(range(len(delay)), delay['AvgProfit'],
color=db_colors, edgecolor='none',
width=0.65)
ax1.set_xticks(range(len(delay)))
ax1.set_xticklabels(delay['DelayBand'].astype(str),
rotation=20, ha='right',
color=MUTED, fontsize=8)
ax1.set_ylabel('Average Profit per Order ($)', color=MUTED)
ax1.set_title('Avg Profit by Delivery Delay Band',
color=NAVY, fontweight='bold',
pad=12, fontfamily='serif')
ax1.tick_params(colors=MUTED)
for spine in ax1.spines.values():
spine.set_color(BORDER)
ax1.grid(color=BORDER, linestyle='--',
linewidth=0.5, axis='y')
ax1.axhline(y=0, color=MUTED, linewidth=0.8)
# Plot 2: Order volume by delay band
ax2 = axes[1]
ax2.set_facecolor(WHITE)
v_colors = [RED if 'Late' in str(b) else TEAL
for b in delay['DelayBand']]
bars2 = ax2.bar(range(len(delay)), delay['Orders'],
color=v_colors, edgecolor='none',
width=0.65)
ax2.set_xticks(range(len(delay)))
ax2.set_xticklabels(delay['DelayBand'].astype(str),
rotation=20, ha='right',
color=MUTED, fontsize=8)
ax2.set_ylabel('Number of Orders', color=MUTED)
ax2.set_title('Order Volume by Delay Band',
color=NAVY, fontweight='bold',
pad=12, fontfamily='serif')
ax2.tick_params(colors=MUTED)
for spine in ax2.spines.values():
spine.set_color(BORDER)
ax2.grid(color=BORDER, linestyle='--',
linewidth=0.5, axis='y')
ax2.yaxis.set_major_formatter(
mticker.FuncFormatter(lambda x, _: f'{x/1000:.0f}K')
)
plt.tight_layout()
plt.show()
```
### Shipping Findings and Recommendations
**Recommendations:**
- Investigate whether late deliveries are correlated
with lower profit, if so, the true cost of
late delivery is higher than the carrier penalty alone
- Review scheduling accuracy by shipping mode,
carriers consistently missing scheduled days
should face contract renegotiation
- Build a delivery performance KPI into
carrier contract renewals
---
## VII. Market Risk and Loss Analysis
### Business Question
Which markets and regions carry the highest
concentration of loss-making orders? Where is
DataCo Global most financially exposed?
### Loss Analysis by Market
```{python}
risk = df.groupby('Market').agg(
Orders = ('Order Id', 'nunique'),
Sales = ('Sales', 'sum'),
Profit = ('Order Profit Per Order', 'sum'),
LossOrders = ('Order Profit Per Order',
lambda x: (x < 0).sum()),
LossValue = ('Order Profit Per Order',
lambda x: x[x < 0].sum()),
LateRate = ('Is Late', 'mean')
).reset_index()
risk['ProfitMargin'] = (
risk['Profit'] / risk['Sales'] * 100
).round(2)
risk['LossOrderPct'] = (
risk['LossOrders'] / risk['Orders'] * 100
).round(2)
risk['LateRate'] = (risk['LateRate'] * 100).round(2)
risk = risk.sort_values('LossValue')
display(risk)
```
### Order Status and Cancellations
```{python}
ord_status = df.groupby('Order Status').agg(
Orders = ('Order Id', 'nunique'),
Sales = ('Sales', 'sum'),
Profit = ('Order Profit Per Order', 'sum')
).reset_index().sort_values('Orders', ascending=False)
ord_status['ProfitMargin'] = (
ord_status['Profit'] / ord_status['Sales'] * 100
).round(2)
display(ord_status)
```
### Market Risk Visualization
```{python}
fig, axes = plt.subplots(1, 2, figsize=(16, 6))
fig.patch.set_facecolor(BG)
fig.suptitle('Figure VII: Market Risk and Loss Exposure',
color=NAVY, fontsize=13,
fontfamily='serif', fontweight='bold', y=1.02)
# Plot 1: Loss value by market
ax1 = axes[0]
ax1.set_facecolor(WHITE)
loss_colors = [RED if lv < 0 else TEAL
for lv in risk['LossValue']]
bars = ax1.barh(range(len(risk)), risk['LossValue'],
color=RED, edgecolor='none', height=0.65)
ax1.set_yticks(range(len(risk)))
ax1.set_yticklabels(risk['Market'],
color=MUTED, fontsize=9)
ax1.set_xlabel('Total Loss Value (USD)', color=MUTED)
ax1.set_title('Total Loss Value by Market',
color=NAVY, fontweight='bold',
pad=12, fontfamily='serif')
ax1.tick_params(colors=MUTED)
for spine in ax1.spines.values():
spine.set_color(BORDER)
ax1.grid(color=BORDER, linestyle='--',
linewidth=0.5, axis='x')
ax1.invert_yaxis()
ax1.xaxis.set_major_formatter(
mticker.FuncFormatter(lambda x, _: f'${x/1000:.0f}K')
)
# Plot 2: Loss order percentage by market
ax2 = axes[1]
ax2.set_facecolor(WHITE)
lp_colors = [RED if r == risk['LossOrderPct'].max()
else '#3a5a7c'
for r in risk['LossOrderPct']]
bars2 = ax2.barh(range(len(risk)),
risk['LossOrderPct'],
color=lp_colors, edgecolor='none',
height=0.65)
ax2.set_yticks(range(len(risk)))
ax2.set_yticklabels(risk['Market'],
color=MUTED, fontsize=9)
ax2.set_xlabel('Loss Order % of Total', color=MUTED)
ax2.set_title('Loss Order Rate by Market',
color=NAVY, fontweight='bold',
pad=12, fontfamily='serif')
ax2.tick_params(colors=MUTED)
for spine in ax2.spines.values():
spine.set_color(BORDER)
ax2.grid(color=BORDER, linestyle='--',
linewidth=0.5, axis='x')
ax2.invert_yaxis()
plt.tight_layout()
plt.show()
```
### Market Risk Findings and Recommendations
**Recommendations:**
- Conduct an immediate deep-dive into the
highest-loss markets, determine whether
losses are structural (pricing, costs)
or operational (discounts, returns)
- Set a maximum acceptable loss order rate
per market, markets exceeding this threshold
trigger a mandatory profitability review
- Review CANCELLED and SUSPECTED_FRAUD
order statuses for patterns,
these may represent revenue recovery opportunities
:::
---
# Excel Workbook
## Workbook Structure
The analysis above has been packaged into a professional
Excel workbook with eight structured sheets for
operational use across DataCo Global.
### Dashboard Tab
<div class="excel-screenshot">
<img src="screenshots/sheet1_dashboard.png"
alt="Excel Dashboard DataCo Global"/>
<div class="excel-label">Sheet 1 Executive Dashboard · DataCo Global Supply Chain Analytics</div>
</div>
### Sales and Profit Analysis
<div class="excel-screenshot">
<img src="screenshots/sheet2_sales.png"
alt="Excel Sales and Profit Analysis"/>
<div class="excel-label">Sheet 2 Sales and Profit by Market and Region</div>
</div>
### Delivery Performance
<div class="excel-screenshot">
<img src="screenshots/sheet3_delivery.png"
alt="Excel Delivery Performance"/>
<div class="excel-label">Sheet 3 Delivery Performance and Late Delivery Risk</div>
</div>
### Product Analysis
<div class="excel-screenshot">
<img src="screenshots/sheet4_products.png"
alt="Excel Product Analysis"/>
<div class="excel-label">Sheet 4 Product Category and Department Analysis</div>
</div>
### Discount and Customer Analysis
<div class="excel-screenshot">
<img src="screenshots/sheet5_customers.png"
alt="Excel Discount and Customer Analysis"/>
<div class="excel-label">Sheet 5 Discount Impact and Customer Intelligence</div>
</div>
---
# Conclusions
## Summary of Findings
| Theme | Key Finding | Priority |
|-------|-------------|----------|
| Sales by Market | Revenue concentrated in identifiable markets | Baseline |
| Delivery Performance | Late delivery rate is operationally significant | Critical |
| Product Profitability | Some categories operating at negative margin | High |
| Discount Impact | High discounts correlate with lower profit | High |
| Customer Intelligence | Top customers represent concentrated revenue risk | High |
| Shipping vs Profit | Delivery delays show measurable profit impact | Medium |
| Market Risk | Loss-making orders concentrated in specific markets | Critical |
## The Three Most Urgent Actions
**1. Address Late Delivery Rate**
Late deliveries are not just a customer satisfaction
issue, they are a profitability issue. The data shows
a measurable relationship between delivery performance
and order profit. Every late delivery that could have
been prevented is profit that was surrendered.
**2. Review Discount Policy Immediately**
High discount rates are destroying margin. The data
is unambiguous, as discount rates increase, profit
per order falls sharply and in some bands turns negative.
DataCo Global is discounting itself into losses.
**3. Investigate Loss-Making Markets**
The markets with the highest concentration of
loss-making orders require immediate operational
review. These are not random, loss concentration
has a cause, and that cause is identifiable and fixable.
## Data Quality Notes
- `Product Description` was entirely missing, dropped
- `Order Zipcode` was 86% missing dropped
- Customer personal details were redacted,
analysis uses Customer Id for anonymised profiling
- Date parsing was applied to both order and
shipping date columns
## Final Note to Marcus Adeyemi
DataCo Global's supply chain data tells a clear story.
The revenue is there. The customers are there.
The products are there. What is eroding the
bottom line is a combination of late deliveries,
discount abuse, and loss concentration in
identifiable markets. None of these are
uncontrollable forces. All three are
operational decisions that can be changed.
The intelligence is here. The decisions are yours.
---
**Report prepared by:** Freda Erinmwingbovo, Data Analyst
**Prepared for:** Marcus Adeyemi, CEO DataCo Global
**Date:** February 2026
**Tools:** Python · pandas · matplotlib · Excel · Quarto
**Dataset:** DataCo Supply Chain Dataset · 180,519 orders · 53 attributes
*This report was produced to professional data science standards.
All findings are reproducible from the code cells above.
No results have been overstated or adjusted.*