Supply Chain Analytics Operations Intelligence and Risk Report

Author

Freda Erinmwingbovo, Data Analyst

Published

March 3, 2026

DataCo Global: Supply Chain Analytics Report

Prepared by: Freda Erinmwingbovo, Data Analyst
Prepared for: Marcus Adeyemi, Chief Executive Officer DataCo Global
Date: February 2026
Dataset: 180,519 orders · 53 attributes · Global supply chain operations


Executive Summary

DataCo Global processed 180,519 orders across multiple global markets. This report investigates seven operational questions covering sales performance, delivery risk, product profitability, discount abuse, customer value, shipping efficiency, and market loss exposure.

180,519
Total Orders
$36.8M
Total Sales
$3.97M
Total Profit
57.3%
Late Delivery Rate
10.2%
Avg Discount Rate
5
Markets Served
# Business Question Priority
1 Sales and profit by market and region Baseline
2 Shipping modes and late delivery risk Critical
3 Product category and department profitability High
4 Discount impact on profitability High
5 Highest-value customers and segments High
6 Scheduled vs actual shipping and profitability Medium
7 Market loss exposure and risk Critical

Data and Methodology

Data Loading and Preparation

Show Code
import pandas as pd
import numpy as np
import matplotlib
import matplotlib.pyplot as plt
import matplotlib.ticker as mticker
import warnings
warnings.filterwarnings('ignore')

try:
    df = pd.read_csv('data_co_supply_chain.csv', encoding='utf-8')
except:
    df = pd.read_csv('data_co_supply_chain.csv', encoding='latin-1')

# Drop sensitive and empty columns
drop_cols = [
    'Customer Email', 'Customer Password', 'Product Description',
    'Product Image', 'Customer Street', 'Customer Zipcode',
    'Order Zipcode', 'Latitude', 'Longitude'
]
df = df.drop(columns=[c for c in drop_cols if c in df.columns])

# Parse dates
df['order date (DateOrders)'] = pd.to_datetime(
    df['order date (DateOrders)'], errors='coerce'
)
df['shipping date (DateOrders)'] = pd.to_datetime(
    df['shipping date (DateOrders)'], errors='coerce'
)

# Derived columns
df['Order Year']       = df['order date (DateOrders)'].dt.year
df['Order Month']      = df['order date (DateOrders)'].dt.month
df['Order Month Name'] = df['order date (DateOrders)'].dt.strftime('%b')
df['Delivery Delay']   = (
    df['Days for shipping (real)'] -
    df['Days for shipment (scheduled)']
)
df['Is Late'] = (df['Delivery Delay'] > 0).astype(int)

print(f"Dataset: {df.shape[0]:,} rows × {df.shape[1]} columns")
print(f"Date range: {df['order date (DateOrders)'].min().date()} "
      f"to {df['order date (DateOrders)'].max().date()}")
print(f"Markets: {df['Market'].nunique()}")
print(f"Regions: {df['Order Region'].nunique()}")
print(f"Products: {df['Product Name'].nunique()}")
print(f"Customers: {df['Customer Id'].nunique():,}")
Dataset: 180,519 rows × 49 columns
Date range: 2015-01-01 to 2018-01-31
Markets: 5
Regions: 23
Products: 118
Customers: 20,652

Portfolio Overview

Show Code
total_sales   = df['Sales'].sum()
total_profit  = df['Order Profit Per Order'].sum()
profit_margin = total_profit / total_sales * 100
late_rate     = df['Is Late'].mean() * 100
avg_discount  = df['Order Item Discount Rate'].mean() * 100

print("DATACO GLOBAL OPERATIONS OVERVIEW")
print("=" * 45)
print(f"Total orders:          {df['Order Id'].nunique():,}")
print(f"Total sales:           ${total_sales:,.2f}")
print(f"Total profit:          ${total_profit:,.2f}")
print(f"Profit margin:         {profit_margin:.1f}%")
print(f"Late delivery rate:    {late_rate:.1f}%")
print(f"Avg discount rate:     {avg_discount:.1f}%")
print(f"Markets:               {df['Market'].nunique()}")
print(f"Shipping modes:        {df['Shipping Mode'].nunique()}")
print(f"Order statuses:        {df['Order Status'].nunique()}")
DATACO GLOBAL OPERATIONS OVERVIEW
=============================================
Total orders:          65,752
Total sales:           $36,784,735.01
Total profit:          $3,966,902.97
Profit margin:         10.8%
Late delivery rate:    57.3%
Avg discount rate:     10.2%
Markets:               5
Shipping modes:        4
Order statuses:        9

Colour Palette

Show Code
NAVY   = '#1a3a5c'
TEAL   = '#1a6b5a'
AMBER  = '#d4722a'
RED    = '#c0392b'
GOLD   = '#c9a84c'
MUTED  = '#6a6a7a'
BG     = '#f8f7f4'
WHITE  = '#ffffff'
BORDER = '#e0dbd0'

Analysis

Business Question

How are DataCo Global’s sales, profit, and order volume distributed across global markets and regions?

Sales by Market

Show Code
mkt = df.groupby('Market').agg(
    Orders    = ('Order Id', 'nunique'),
    Sales     = ('Sales', 'sum'),
    Profit    = ('Order Profit Per Order', 'sum'),
    AvgOrder  = ('Sales', 'mean')
).reset_index()
mkt['ProfitMargin'] = (mkt['Profit'] / mkt['Sales'] * 100).round(2)
mkt = mkt.sort_values('Sales', ascending=False)
display(mkt)
Market Orders Sales Profit AvgOrder ProfitMargin
1 Europe 18561 1.087240e+07 1.169443e+06 216.357494 10.76
2 LATAM 17181 1.027761e+07 1.123322e+06 199.201706 10.93
3 Pacific Asia 17577 8.273744e+06 8.577534e+05 200.526993 10.37
4 USCA 8579 5.066529e+06 5.643138e+05 196.384694 11.14
0 Africa 3854 2.294453e+06 2.520712e+05 197.559232 10.99

Sales by Region (Top 15)

Show Code
reg = df.groupby('Order Region').agg(
    Orders  = ('Order Id', 'nunique'),
    Sales   = ('Sales', 'sum'),
    Profit  = ('Order Profit Per Order', 'sum')
).reset_index()
reg['ProfitMargin'] = (reg['Profit'] / reg['Sales'] * 100).round(2)
reg = reg.sort_values('Sales', ascending=False).head(15)
display(reg)
Order Region Orders Sales Profit ProfitMargin
22 Western Europe 10010 5.894381e+06 625446.080548 10.61
3 Central America 9396 5.665712e+06 616341.570651 10.88
12 South America 4979 2.960881e+06 335154.400817 11.32
10 Northern Europe 3716 2.155831e+06 233450.600647 10.83
17 Southern Europe 3543 2.047919e+06 230829.229883 11.27
11 Oceania 4362 2.016654e+06 201478.020484 9.99
15 Southeast Asia 4356 1.932496e+06 211342.819786 10.94
1 Caribbean 2806 1.651019e+06 171825.640024 10.41
21 West of USA 2667 1.571416e+06 164940.660455 10.50
13 South Asia 3335 1.553681e+06 165703.900124 10.67
7 Eastern Asia 3318 1.486401e+06 147368.010614 9.91
6 East of USA 2323 1.371112e+06 156263.300194 11.40
20 West Asia 2022 1.174672e+06 118815.410047 10.11
18 US Center 1935 1.151356e+06 131094.229875 11.39
14 South of USA 1345 7.857839e+05 88114.879898 11.21

Monthly Sales Trend

Show Code
monthly = df.groupby(['Order Year', 'Order Month',
                       'Order Month Name']).agg(
    Sales  = ('Sales', 'sum'),
    Profit = ('Order Profit Per Order', 'sum'),
    Orders = ('Order Id', 'nunique')
).reset_index().sort_values(['Order Year', 'Order Month'])
display(monthly)
Order Year Order Month Order Month Name Sales Profit Orders
0 2015 1 Jan 1.051590e+06 111660.740132 1787
1 2015 2 Feb 9.270099e+05 99140.660196 1585
2 2015 3 Mar 1.051254e+06 113778.210191 1781
3 2015 4 Apr 1.014463e+06 108083.679957 1710
4 2015 5 May 1.050478e+06 112147.900143 1776
5 2015 6 Jun 1.024006e+06 110147.160313 1725
6 2015 7 Jul 1.038081e+06 115624.059879 1763
7 2015 8 Aug 1.029495e+06 117979.770302 1762
8 2015 9 Sep 1.018339e+06 113467.940118 1706
9 2015 10 Oct 1.049154e+06 101757.870040 1775
10 2015 11 Nov 1.029120e+06 104427.360120 1729
11 2015 12 Dec 1.057841e+06 110641.549881 1805
12 2016 1 Jan 1.046308e+06 106780.950229 1772
13 2016 2 Feb 9.685428e+05 86809.490642 1650
14 2016 3 Mar 1.025853e+06 101259.990250 1764
15 2016 4 Apr 1.001212e+06 114976.619907 1706
16 2016 5 May 1.029400e+06 110716.120111 1763
17 2016 6 Jun 1.003060e+06 104196.070108 1690
18 2016 7 Jul 1.045716e+06 119941.720183 1758
19 2016 8 Aug 1.048200e+06 110729.540117 1766
20 2016 9 Sep 1.002397e+06 123384.709977 1730
21 2016 10 Oct 1.048131e+06 116781.970048 1776
22 2016 11 Nov 1.047590e+06 109539.199962 1721
23 2016 12 Dec 1.037408e+06 105002.689937 1763
24 2017 1 Jan 1.029698e+06 114843.850277 1745
25 2017 2 Feb 9.925349e+05 115111.070097 1614
26 2017 3 Mar 1.048005e+06 118688.900211 1782
27 2017 4 Apr 1.038322e+06 115961.060193 1739
28 2017 5 May 1.105485e+06 115014.640014 1763
29 2017 6 Jun 1.032086e+06 110399.290305 1676
30 2017 7 Jul 1.104373e+06 113026.700038 1776
31 2017 8 Aug 1.109337e+06 131501.160211 1768
32 2017 9 Sep 1.143775e+06 122462.390153 1723
33 2017 10 Oct 1.073994e+06 113447.169883 2101
34 2017 11 Nov 6.269144e+05 67791.250205 2055
35 2017 12 Dec 5.039108e+05 65837.629745 2124
36 2018 1 Jan 3.316501e+05 33841.889977 2123

Market and Sales Visualization

Show Code
fig, axes = plt.subplots(1, 2, figsize=(16, 6))
fig.patch.set_facecolor(BG)
fig.suptitle('Figure I: Sales and Profit by Market',
             color=NAVY, fontsize=13,
             fontfamily='serif', fontweight='bold', y=1.02)

# Plot 1: Sales by market
ax1 = axes[0]
ax1.set_facecolor(WHITE)
m_colors = [TEAL if i == 0 else '#3a5a7c'
            for i in range(len(mkt))]
bars = ax1.barh(range(len(mkt)), mkt['Sales'],
                color=m_colors, edgecolor='none', height=0.65)
ax1.set_yticks(range(len(mkt)))
ax1.set_yticklabels(mkt['Market'], color=MUTED, fontsize=9)
ax1.set_xlabel('Total Sales (USD)', color=MUTED)
ax1.set_title('Total Sales by Market',
              color=NAVY, fontweight='bold',
              pad=12, fontfamily='serif')
ax1.tick_params(colors=MUTED)
for spine in ax1.spines.values():
    spine.set_color(BORDER)
ax1.grid(color=BORDER, linestyle='--',
         linewidth=0.5, axis='x')
ax1.invert_yaxis()
ax1.xaxis.set_major_formatter(
    mticker.FuncFormatter(lambda x, _: f'${x/1e6:.1f}M')
)

# Plot 2: Profit margin by market
ax2 = axes[1]
ax2.set_facecolor(WHITE)
pm_colors = [RED if p < 0 else TEAL
             for p in mkt['ProfitMargin']]
bars2 = ax2.barh(range(len(mkt)), mkt['ProfitMargin'],
                 color=pm_colors, edgecolor='none', height=0.65)
ax2.set_yticks(range(len(mkt)))
ax2.set_yticklabels(mkt['Market'], color=MUTED, fontsize=9)
ax2.set_xlabel('Profit Margin (%)', color=MUTED)
ax2.set_title('Profit Margin by Market',
              color=NAVY, fontweight='bold',
              pad=12, fontfamily='serif')
ax2.tick_params(colors=MUTED)
for spine in ax2.spines.values():
    spine.set_color(BORDER)
ax2.grid(color=BORDER, linestyle='--',
         linewidth=0.5, axis='x')
ax2.invert_yaxis()
ax2.axvline(x=0, color=MUTED, linewidth=0.8,
            linestyle='-')

plt.tight_layout()
plt.show()

Market Findings and Recommendations

Recommendations:

  • Prioritise investment and marketing spend in the highest-revenue markets to defend market position
  • Investigate markets with below-average profit margins for pricing, discount, or operational inefficiencies
  • Set quarterly revenue targets by market with escalation triggers if performance falls below plan

Business Question

Which shipping modes and markets carry the highest late delivery risk? What is the financial impact of late deliveries?

Delivery Performance by Shipping Mode

Show Code
ship = df.groupby('Shipping Mode').agg(
    Orders     = ('Order Id', 'nunique'),
    AvgReal    = ('Days for shipping (real)', 'mean'),
    AvgSched   = ('Days for shipment (scheduled)', 'mean'),
    AvgDelay   = ('Delivery Delay', 'mean'),
    LateCount  = ('Is Late', 'sum'),
    LateRate   = ('Is Late', 'mean'),
    Sales      = ('Sales', 'sum'),
    Profit     = ('Order Profit Per Order', 'sum')
).reset_index()
ship['LateRate'] = (ship['LateRate'] * 100).round(2)
ship = ship.sort_values('LateRate', ascending=False)
display(ship)
Shipping Mode Orders AvgReal AvgSched AvgDelay LateCount LateRate Sales Profit
0 First Class 10079 2.000000 1.0 1.000000 27814 100.00 5.674370e+06 6.431219e+05
2 Second Class 12778 3.990828 2.0 1.990828 28078 79.73 7.145445e+06 7.503082e+05
1 Same Day 3571 0.478279 0.0 0.478279 4657 47.83 1.942529e+06 2.030184e+05
3 Standard Class 39324 3.995907 4.0 -0.004093 42851 39.77 2.202239e+07 2.370454e+06

Late Delivery by Market

Show Code
late_mkt = df.groupby('Market').agg(
    Orders    = ('Order Id', 'nunique'),
    Late      = ('Is Late', 'sum'),
    LateRate  = ('Is Late', 'mean')
).reset_index()
late_mkt['LateRate'] = (late_mkt['LateRate'] * 100).round(2)
late_mkt = late_mkt.sort_values('LateRate', ascending=False)
display(late_mkt)
Market Orders Late LateRate
1 Europe 18561 28989 57.69
3 Pacific Asia 17577 23649 57.32
4 USCA 8579 14744 57.15
2 LATAM 17181 29420 57.02
0 Africa 3854 6598 56.81

Delivery Status Distribution

Show Code
del_stat = df.groupby('Delivery Status').agg(
    Orders = ('Order Id', 'nunique'),
    Sales  = ('Sales', 'sum'),
    Profit = ('Order Profit Per Order', 'sum')
).reset_index().sort_values('Orders', ascending=False)
del_stat['ProfitMargin'] = (
    del_stat['Profit'] / del_stat['Sales'] * 100
).round(2)
display(del_stat)
Delivery Status Orders Sales Profit ProfitMargin
1 Late delivery 36048 2.012640e+07 2.140052e+06 10.63
0 Advance shipping 15127 8.518008e+06 9.352253e+05 10.98
3 Shipping on time 11722 6.570026e+06 7.311437e+05 11.13
2 Shipping canceled 2855 1.570305e+06 1.604823e+05 10.22

Delivery Performance Visualization

Show Code
fig, axes = plt.subplots(1, 2, figsize=(16, 6))
fig.patch.set_facecolor(BG)
fig.suptitle('Figure II: Delivery Performance Analysis',
             color=NAVY, fontsize=13,
             fontfamily='serif', fontweight='bold', y=1.02)

# Plot 1: Late rate by shipping mode
ax1 = axes[0]
ax1.set_facecolor(WHITE)
s_colors = [RED if r == ship['LateRate'].max()
            else TEAL for r in ship['LateRate']]
bars = ax1.bar(ship['Shipping Mode'], ship['LateRate'],
               color=s_colors, edgecolor='none', width=0.55)
ax1.set_ylabel('Late Delivery Rate (%)', color=MUTED)
ax1.set_title('Late Rate by Shipping Mode',
              color=NAVY, fontweight='bold',
              pad=12, fontfamily='serif')
ax1.tick_params(colors=MUTED, labelsize=8)
for spine in ax1.spines.values():
    spine.set_color(BORDER)
ax1.grid(color=BORDER, linestyle='--',
         linewidth=0.5, axis='y')
ax1.set_xticklabels(ship['Shipping Mode'],
                    rotation=15, ha='right')
for bar, val in zip(bars, ship['LateRate']):
    ax1.text(bar.get_x() + bar.get_width()/2,
             bar.get_height() + 0.3,
             f'{val:.1f}%',
             ha='center', color=NAVY,
             fontsize=9, fontweight='bold')

# Plot 2: Delivery status distribution
ax2 = axes[1]
ax2.set_facecolor(WHITE)
d_colors = [RED if 'Late' in s else
            TEAL if 'Advance' in s or 'time' in s
            else AMBER
            for s in del_stat['Delivery Status']]
bars2 = ax2.bar(range(len(del_stat)),
                del_stat['Orders'],
                color=d_colors, edgecolor='none',
                width=0.6)
ax2.set_xticks(range(len(del_stat)))
ax2.set_xticklabels(del_stat['Delivery Status'],
                    rotation=20, ha='right',
                    color=MUTED, fontsize=8)
ax2.set_ylabel('Number of Orders', color=MUTED)
ax2.set_title('Orders by Delivery Status',
              color=NAVY, fontweight='bold',
              pad=12, fontfamily='serif')
ax2.tick_params(colors=MUTED)
for spine in ax2.spines.values():
    spine.set_color(BORDER)
ax2.grid(color=BORDER, linestyle='--',
         linewidth=0.5, axis='y')
ax2.yaxis.set_major_formatter(
    mticker.FuncFormatter(lambda x, _: f'{x/1000:.0f}K')
)

plt.tight_layout()
plt.show()

Delivery Performance Findings and Recommendations

Recommendations:

  • Audit the highest-risk shipping mode immediately, identify whether the late delivery rate reflects carrier performance or scheduling policy failures
  • Set a target late delivery rate of under 20% across all shipping modes within two quarters
  • Implement real-time order tracking and customer notification for all late-flagged shipments

Business Question

Which product categories and departments generate the most revenue and profit? Are any categories operating at a loss?

Profitability by Department

Show Code
dept = df.groupby('Department Name').agg(
    Orders        = ('Order Id', 'nunique'),
    Sales         = ('Sales', 'sum'),
    Profit        = ('Order Profit Per Order', 'sum'),
    AvgDiscount   = ('Order Item Discount Rate', 'mean'),
    AvgProfitRatio= ('Order Item Profit Ratio', 'mean')
).reset_index()
dept['ProfitMargin'] = (
    dept['Profit'] / dept['Sales'] * 100
).round(2)
dept['AvgDiscount']    = (dept['AvgDiscount'] * 100).round(2)
dept['AvgProfitRatio'] = (dept['AvgProfitRatio'] * 100).round(2)
dept = dept.sort_values('Sales', ascending=False)
display(dept)
Department Name Orders Sales Profit AvgDiscount AvgProfitRatio ProfitMargin
3 Fan Shop 41303 1.711387e+07 1.834155e+06 10.16 12.00 10.72
0 Apparel 35190 7.976255e+06 8.818829e+05 10.17 12.28 11.06
6 Golf 25889 4.609028e+06 4.975236e+05 10.16 11.88 10.79
5 Footwear 13009 4.006499e+06 4.102225e+05 10.17 11.92 10.24
8 Outdoors 9066 1.253351e+06 1.452515e+05 10.21 12.58 11.59
10 Technology 1465 1.039599e+06 1.131700e+05 10.18 12.58 10.89
4 Fitness 2437 3.970509e+05 4.653806e+04 10.16 13.12 11.72
2 Discs Shop 2026 2.288877e+05 2.419312e+04 10.18 10.97 10.57
7 Health and Beauty 362 1.060805e+05 9.493630e+03 10.14 9.56 8.95
9 Pet Shop 492 4.152480e+04 3.589260e+03 10.16 9.41 8.64
1 Book Shop 405 1.258740e+04 8.830100e+02 10.23 7.91 7.02

Top 15 Categories by Revenue

Show Code
cat = df.groupby(['Department Name', 'Category Name']).agg(
    Orders  = ('Order Id', 'nunique'),
    Sales   = ('Sales', 'sum'),
    Profit  = ('Order Profit Per Order', 'sum')
).reset_index()
cat['ProfitMargin'] = (
    cat['Profit'] / cat['Sales'] * 100
).round(2)
cat = cat.sort_values('Sales', ascending=False).head(15)
display(cat)
Department Name Category Name Orders Sales Profit ProfitMargin
13 Fan Shop Fishing 15164 6.929654e+06 756220.767190 10.91
2 Apparel Cleats 20386 4.431943e+06 494636.919791 11.16
12 Fan Shop Camping & Hiking 12299 4.118426e+06 427455.568106 10.38
27 Footwear Cardio Equipment 11355 3.694843e+06 383011.098485 10.37
33 Golf Women's Apparel 17869 3.147800e+06 350421.029567 11.13
17 Fan Shop Water Sports 13758 3.113845e+06 325146.960038 10.44
5 Apparel Men's Footwear 18783 2.891758e+06 311902.820214 10.79
15 Fan Shop Indoor/Outdoor Games 16623 2.888994e+06 318451.430554 11.02
32 Golf Shop By Sport 10136 1.309522e+06 129813.960315 9.91
49 Technology Computers 442 6.630000e+05 69656.810171 10.51
48 Technology Cameras 592 2.676077e+05 30289.799946 11.32
37 Outdoors Garden 484 2.577687e+05 33443.010106 12.97
36 Outdoors Electronics 1994 2.556794e+05 28626.479871 11.20
1 Apparel Children's Clothing 652 2.328292e+05 27178.099597 11.67
3 Apparel Crafts 484 2.233563e+05 25531.170060 11.43

Product Analysis Visualization

Show Code
fig, axes = plt.subplots(1, 2, figsize=(16, 7))
fig.patch.set_facecolor(BG)
fig.suptitle('Figure III: Product and Category Profitability',
             color=NAVY, fontsize=13,
             fontfamily='serif', fontweight='bold', y=1.02)

# Plot 1: Sales by department
ax1 = axes[0]
ax1.set_facecolor(WHITE)
d_colors = [TEAL if p > 0 else RED
            for p in dept['Profit']]
bars = ax1.barh(range(len(dept)), dept['Sales'],
                color=d_colors, edgecolor='none',
                height=0.65)
ax1.set_yticks(range(len(dept)))
ax1.set_yticklabels(dept['Department Name'],
                    color=MUTED, fontsize=9)
ax1.set_xlabel('Total Sales (USD)', color=MUTED)
ax1.set_title('Sales by Department',
              color=NAVY, fontweight='bold',
              pad=12, fontfamily='serif')
ax1.tick_params(colors=MUTED)
for spine in ax1.spines.values():
    spine.set_color(BORDER)
ax1.grid(color=BORDER, linestyle='--',
         linewidth=0.5, axis='x')
ax1.invert_yaxis()
ax1.xaxis.set_major_formatter(
    mticker.FuncFormatter(lambda x, _: f'${x/1e6:.1f}M')
)

# Plot 2: Profit margin by department
ax2 = axes[1]
ax2.set_facecolor(WHITE)
pm_colors = [RED if p < 0 else TEAL
             for p in dept['ProfitMargin']]
bars2 = ax2.barh(range(len(dept)),
                 dept['ProfitMargin'],
                 color=pm_colors, edgecolor='none',
                 height=0.65)
ax2.set_yticks(range(len(dept)))
ax2.set_yticklabels(dept['Department Name'],
                    color=MUTED, fontsize=9)
ax2.set_xlabel('Profit Margin (%)', color=MUTED)
ax2.set_title('Profit Margin by Department',
              color=NAVY, fontweight='bold',
              pad=12, fontfamily='serif')
ax2.tick_params(colors=MUTED)
for spine in ax2.spines.values():
    spine.set_color(BORDER)
ax2.grid(color=BORDER, linestyle='--',
         linewidth=0.5, axis='x')
ax2.invert_yaxis()
ax2.axvline(x=0, color=MUTED, linewidth=0.8)

plt.tight_layout()
plt.show()

Product Findings and Recommendations

Recommendations:

  • Immediately review any department or category operating at a negative profit margin, identify whether the loss is driven by discounting, pricing, or operational costs
  • Consider discontinuing or repricing the lowest-margin categories
  • Redirect marketing investment toward the highest-margin categories to improve overall portfolio profitability

Business Question

How do discount rates affect profitability across customer segments? Is discount abuse a material risk to DataCo Global’s bottom line?

Discount and Profitability by Segment

Show Code
disc = df.groupby('Customer Segment').agg(
    Orders        = ('Order Id', 'nunique'),
    Sales         = ('Sales', 'sum'),
    Profit        = ('Order Profit Per Order', 'sum'),
    TotalDiscount = ('Order Item Discount', 'sum'),
    AvgDiscount   = ('Order Item Discount Rate', 'mean'),
    AvgProfitRatio= ('Order Item Profit Ratio', 'mean')
).reset_index()
disc['ProfitMargin']   = (disc['Profit'] / disc['Sales'] * 100).round(2)
disc['AvgDiscount']    = (disc['AvgDiscount'] * 100).round(2)
disc['AvgProfitRatio'] = (disc['AvgProfitRatio'] * 100).round(2)
display(disc)
Customer Segment Orders Sales Profit TotalDiscount AvgDiscount AvgProfitRatio ProfitMargin
0 Consumer 34119 1.909579e+07 2.073488e+06 1.931536e+06 10.16 12.12 10.86
1 Corporate 19856 1.116841e+07 1.202575e+06 1.137771e+06 10.16 12.09 10.77
2 Home Office 11777 6.520538e+06 6.908403e+05 6.610720e+05 10.19 11.86 10.59

Discount Rate Bands vs Profit

Show Code
df['DiscountBand'] = pd.cut(
    df['Order Item Discount Rate'],
    bins=[-0.01, 0, 0.05, 0.10, 0.15, 0.20, 1.0],
    labels=['No Discount', '0-5%', '5-10%',
            '10-15%', '15-20%', 'Over 20%']
)
disc_band = df.groupby('DiscountBand', observed=True).agg(
    Orders        = ('Order Id', 'count'),
    AvgProfit     = ('Order Profit Per Order', 'mean'),
    TotalProfit   = ('Order Profit Per Order', 'sum'),
    AvgProfitRatio= ('Order Item Profit Ratio', 'mean')
).reset_index()
disc_band['AvgProfitRatio'] = (
    disc_band['AvgProfitRatio'] * 100
).round(2)
display(disc_band)
DiscountBand Orders AvgProfit TotalProfit AvgProfitRatio
0 No Discount 10028 26.666574 267412.402338 12.75
1 0-5% 40114 23.264044 933213.852532 11.96
2 5-10% 40116 23.508290 943058.541596 12.45
3 10-15% 30087 21.473808 646082.469403 11.89
4 15-20% 40116 20.139751 807926.249038 11.79
5 Over 20% 20058 18.407092 369209.459143 11.97

Discount Impact Visualization

Show Code
fig, axes = plt.subplots(1, 2, figsize=(16, 6))
fig.patch.set_facecolor(BG)
fig.suptitle('Figure IV: Discount Impact on Profitability',
             color=NAVY, fontsize=13,
             fontfamily='serif', fontweight='bold', y=1.02)

# Plot 1: Avg profit by discount band
ax1 = axes[0]
ax1.set_facecolor(WHITE)
db_colors = [RED if p < 0 else TEAL
             for p in disc_band['AvgProfit']]
bars = ax1.bar(range(len(disc_band)),
               disc_band['AvgProfit'],
               color=db_colors, edgecolor='none',
               width=0.65)
ax1.set_xticks(range(len(disc_band)))
ax1.set_xticklabels(disc_band['DiscountBand'],
                    rotation=20, ha='right',
                    color=MUTED, fontsize=8)
ax1.set_ylabel('Average Profit per Order ($)', color=MUTED)
ax1.set_title('Avg Profit by Discount Band',
              color=NAVY, fontweight='bold',
              pad=12, fontfamily='serif')
ax1.tick_params(colors=MUTED)
for spine in ax1.spines.values():
    spine.set_color(BORDER)
ax1.grid(color=BORDER, linestyle='--',
         linewidth=0.5, axis='y')
ax1.axhline(y=0, color=MUTED, linewidth=0.8)

# Plot 2: Profit margin by customer segment
ax2 = axes[1]
ax2.set_facecolor(WHITE)
seg_colors = [TEAL, '#3a5a7c', AMBER]
bars2 = ax2.bar(disc['Customer Segment'],
                disc['ProfitMargin'],
                color=seg_colors[:len(disc)],
                edgecolor='none', width=0.5)
ax2.set_ylabel('Profit Margin (%)', color=MUTED)
ax2.set_title('Profit Margin by Customer Segment',
              color=NAVY, fontweight='bold',
              pad=12, fontfamily='serif')
ax2.tick_params(colors=MUTED)
for spine in ax2.spines.values():
    spine.set_color(BORDER)
ax2.grid(color=BORDER, linestyle='--',
         linewidth=0.5, axis='y')
for bar, val in zip(bars2, disc['ProfitMargin']):
    ax2.text(bar.get_x() + bar.get_width()/2,
             bar.get_height() + 0.1,
             f'{val:.1f}%',
             ha='center', color=NAVY,
             fontsize=10, fontweight='bold')

plt.tight_layout()
plt.show()

Discount Findings and Recommendations

Recommendations:

  • Implement a discount approval threshold, no order should receive a discount above 15% without manager sign-off
  • Audit orders in the highest discount bands for patterns of discount abuse by customer or sales representative
  • Review discount policy by segment, ensure discounts are targeted at acquisition and retention, not habitual price reductions

Business Question

Who are DataCo Global’s highest-value customers? How does customer value vary by segment and geography?

Segment Performance

Show Code
seg = df.groupby('Customer Segment').agg(
    Customers   = ('Customer Id', 'nunique'),
    Orders      = ('Order Id', 'nunique'),
    Sales       = ('Sales', 'sum'),
    Profit      = ('Order Profit Per Order', 'sum'),
    AvgOrderVal = ('Sales', 'mean')
).reset_index()
seg['OrdersPerCustomer'] = (
    seg['Orders'] / seg['Customers']
).round(2)
seg['ProfitMargin'] = (
    seg['Profit'] / seg['Sales'] * 100
).round(2)
display(seg)
Customer Segment Customers Orders Sales Profit AvgOrderVal OrdersPerCustomer ProfitMargin
0 Consumer 10695 34119 1.909579e+07 2.073488e+06 204.224313 3.19 10.86
1 Corporate 6239 19856 1.116841e+07 1.202575e+06 203.843962 3.18 10.77
2 Home Office 3718 11777 6.520538e+06 6.908403e+05 202.337802 3.17 10.59

Top 20 Customers by Revenue

Show Code
top_cust = df.groupby([
    'Customer Id', 'Customer Fname',
    'Customer Segment', 'Customer Country'
]).agg(
    Orders      = ('Order Id', 'nunique'),
    Sales       = ('Sales', 'sum'),
    Profit      = ('Order Profit Per Order', 'sum'),
    AvgOrderVal = ('Sales', 'mean')
).reset_index().sort_values('Sales', ascending=False).head(20)
display(top_cust)
Customer Id Customer Fname Customer Segment Customer Country Orders Sales Profit AvgOrderVal
782 791 Mary Corporate EE. UU. 13 10524.170178 -866.380008 244.748144
9295 9371 Mary Consumer EE. UU. 11 9299.030207 1346.579992 211.341596
8694 8766 Mary Corporate Puerto Rico 11 9296.140186 1495.159984 244.635268
1636 1657 Betty Consumer Puerto Rico 11 9223.710152 2196.919992 219.612146
2612 2641 Betty Consumer EE. UU. 11 9130.920224 2441.970003 212.346982
1271 1288 Evelyn Home Office Puerto Rico 12 9019.110199 403.109983 214.740719
3672 3710 Ashley Consumer EE. UU. 14 9019.100208 1055.130010 214.740481
4209 4249 Mary Consumer Puerto Rico 13 8918.850134 439.709982 228.688465
5603 5654 Jerry Home Office Puerto Rico 15 8904.950211 1045.359995 189.467026
5573 5624 Mary Consumer Puerto Rico 12 8761.980183 1178.150009 230.578426
5664 5715 Kelly Corporate Puerto Rico 12 8595.130157 -177.449985 195.343867
1446 1464 Amber Consumer Puerto Rico 12 8409.220196 797.249997 200.219528
10155 10235 Joseph Corporate Puerto Rico 12 8404.220181 1383.079983 200.100480
656 664 Bobby Home Office EE. UU. 12 8394.260208 -68.450004 215.237441
10269 10351 Teresa Home Office Puerto Rico 7 8339.260166 696.619998 297.830720
2266 2292 Ashley Corporate EE. UU. 9 8293.040165 1329.220006 218.237899
4619 4663 Paul Corporate EE. UU. 12 8279.890162 881.039995 206.997254
3670 3708 Judy Consumer EE. UU. 12 8275.090179 1652.739974 188.070231
12182 12284 Mary Consumer EE. UU. 14 8258.620171 1462.459999 192.060934
7140 7200 Katherine Consumer Puerto Rico 11 8229.010204 1409.630016 235.114577

Customer Intelligence Visualization

Show Code
fig, axes = plt.subplots(1, 2, figsize=(16, 6))
fig.patch.set_facecolor(BG)
fig.suptitle('Figure V: Customer Intelligence',
             color=NAVY, fontsize=13,
             fontfamily='serif', fontweight='bold', y=1.02)

# Plot 1: Sales by segment
ax1 = axes[0]
ax1.set_facecolor(WHITE)
seg_colors = [TEAL, '#3a5a7c', AMBER]
bars = ax1.bar(seg['Customer Segment'], seg['Sales'],
               color=seg_colors[:len(seg)],
               edgecolor='none', width=0.5)
ax1.set_ylabel('Total Sales (USD)', color=MUTED)
ax1.set_title('Total Sales by Customer Segment',
              color=NAVY, fontweight='bold',
              pad=12, fontfamily='serif')
ax1.tick_params(colors=MUTED)
for spine in ax1.spines.values():
    spine.set_color(BORDER)
ax1.grid(color=BORDER, linestyle='--',
         linewidth=0.5, axis='y')
ax1.yaxis.set_major_formatter(
    mticker.FuncFormatter(lambda x, _: f'${x/1e6:.1f}M')
)

# Plot 2: Top 10 customers by sales
ax2 = axes[1]
ax2.set_facecolor(WHITE)
top10 = top_cust.head(10)
bar_colors = [TEAL if i == 0 else '#3a5a7c'
              for i in range(len(top10))]
bars2 = ax2.barh(
    range(len(top10)), top10['Sales'],
    color=bar_colors, edgecolor='none', height=0.65
)
ax2.set_yticks(range(len(top10)))
ax2.set_yticklabels(
    top10['Customer Fname'], color=MUTED, fontsize=9
)
ax2.set_xlabel('Total Sales (USD)', color=MUTED)
ax2.set_title('Top 10 Customers by Revenue',
              color=NAVY, fontweight='bold',
              pad=12, fontfamily='serif')
ax2.tick_params(colors=MUTED)
for spine in ax2.spines.values():
    spine.set_color(BORDER)
ax2.grid(color=BORDER, linestyle='--',
         linewidth=0.5, axis='x')
ax2.invert_yaxis()

plt.tight_layout()
plt.show()

Customer Intelligence Findings and Recommendations

Recommendations:

  • Assign dedicated account management to the top 20 customers, they represent a disproportionate share of portfolio revenue
  • Design a customer loyalty programme for the Consumer segment, the largest by volume but potentially the least sticky
  • Review geographic concentration of top customers and develop targeted retention campaigns for high-value regions

Business Question

Is there a measurable relationship between scheduled versus actual shipping days and order profitability? Do delays translate directly into profit losses?

Delay Band Analysis

Show Code
df['DelayBand'] = pd.cut(
    df['Delivery Delay'],
    bins=[-99, -3, -1, 0, 2, 5, 99],
    labels=['Early >3d', 'Early 1-3d', 'On Time',
            'Late 1-2d', 'Late 3-5d', 'Late >5d']
)
delay = df.groupby('DelayBand', observed=True).agg(
    Orders     = ('Order Id', 'count'),
    AvgProfit  = ('Order Profit Per Order', 'mean'),
    TotalProfit= ('Order Profit Per Order', 'sum'),
    LateRate   = ('Is Late', 'mean')
).reset_index()
delay['LateRate'] = (delay['LateRate'] * 100).round(2)
display(delay)
DelayBand Orders AvgProfit TotalProfit LateRate
0 Early 1-3d 43366 22.469515 9.744130e+05 0.0
1 On Time 33753 22.532021 7.605233e+05 0.0
2 Late 1-2d 89365 21.717450 1.940780e+06 100.0
3 Late 3-5d 14035 20.747183 2.911867e+05 100.0

Scheduled vs Actual Days Summary

Show Code
sched = df.groupby('Days for shipment (scheduled)').agg(
    Orders    = ('Order Id', 'count'),
    AvgActual = ('Days for shipping (real)', 'mean'),
    AvgDelay  = ('Delivery Delay', 'mean'),
    AvgProfit = ('Order Profit Per Order', 'mean'),
    LateRate  = ('Is Late', 'mean')
).reset_index()
sched['LateRate'] = (sched['LateRate'] * 100).round(2)
display(sched)
Days for shipment (scheduled) Orders AvgActual AvgDelay AvgProfit LateRate
0 0 9737 0.478279 0.478279 20.850203 47.83
1 1 27814 2.000000 1.000000 23.122238 100.00
2 2 35216 3.990828 1.990828 21.305889 79.73
3 4 107752 3.995907 -0.004093 21.999169 39.77

Shipping vs Profit Visualization

Show Code
fig, axes = plt.subplots(1, 2, figsize=(16, 6))
fig.patch.set_facecolor(BG)
fig.suptitle('Figure VI: Shipping Days and Profitability',
             color=NAVY, fontsize=13,
             fontfamily='serif', fontweight='bold', y=1.02)

# Plot 1: Avg profit by delay band
ax1 = axes[0]
ax1.set_facecolor(WHITE)
db_colors = [RED if p < 0 else TEAL
             for p in delay['AvgProfit']]
bars = ax1.bar(range(len(delay)), delay['AvgProfit'],
               color=db_colors, edgecolor='none',
               width=0.65)
ax1.set_xticks(range(len(delay)))
ax1.set_xticklabels(delay['DelayBand'].astype(str),
                    rotation=20, ha='right',
                    color=MUTED, fontsize=8)
ax1.set_ylabel('Average Profit per Order ($)', color=MUTED)
ax1.set_title('Avg Profit by Delivery Delay Band',
              color=NAVY, fontweight='bold',
              pad=12, fontfamily='serif')
ax1.tick_params(colors=MUTED)
for spine in ax1.spines.values():
    spine.set_color(BORDER)
ax1.grid(color=BORDER, linestyle='--',
         linewidth=0.5, axis='y')
ax1.axhline(y=0, color=MUTED, linewidth=0.8)

# Plot 2: Order volume by delay band
ax2 = axes[1]
ax2.set_facecolor(WHITE)
v_colors = [RED if 'Late' in str(b) else TEAL
            for b in delay['DelayBand']]
bars2 = ax2.bar(range(len(delay)), delay['Orders'],
                color=v_colors, edgecolor='none',
                width=0.65)
ax2.set_xticks(range(len(delay)))
ax2.set_xticklabels(delay['DelayBand'].astype(str),
                    rotation=20, ha='right',
                    color=MUTED, fontsize=8)
ax2.set_ylabel('Number of Orders', color=MUTED)
ax2.set_title('Order Volume by Delay Band',
              color=NAVY, fontweight='bold',
              pad=12, fontfamily='serif')
ax2.tick_params(colors=MUTED)
for spine in ax2.spines.values():
    spine.set_color(BORDER)
ax2.grid(color=BORDER, linestyle='--',
         linewidth=0.5, axis='y')
ax2.yaxis.set_major_formatter(
    mticker.FuncFormatter(lambda x, _: f'{x/1000:.0f}K')
)

plt.tight_layout()
plt.show()

Shipping Findings and Recommendations

Recommendations:

  • Investigate whether late deliveries are correlated with lower profit, if so, the true cost of late delivery is higher than the carrier penalty alone
  • Review scheduling accuracy by shipping mode, carriers consistently missing scheduled days should face contract renegotiation
  • Build a delivery performance KPI into carrier contract renewals

Business Question

Which markets and regions carry the highest concentration of loss-making orders? Where is DataCo Global most financially exposed?

Loss Analysis by Market

Show Code
risk = df.groupby('Market').agg(
    Orders     = ('Order Id', 'nunique'),
    Sales      = ('Sales', 'sum'),
    Profit     = ('Order Profit Per Order', 'sum'),
    LossOrders = ('Order Profit Per Order',
                  lambda x: (x < 0).sum()),
    LossValue  = ('Order Profit Per Order',
                  lambda x: x[x < 0].sum()),
    LateRate   = ('Is Late', 'mean')
).reset_index()
risk['ProfitMargin'] = (
    risk['Profit'] / risk['Sales'] * 100
).round(2)
risk['LossOrderPct'] = (
    risk['LossOrders'] / risk['Orders'] * 100
).round(2)
risk['LateRate'] = (risk['LateRate'] * 100).round(2)
risk = risk.sort_values('LossValue')
display(risk)
Market Orders Sales Profit LossOrders LossValue LateRate ProfitMargin LossOrderPct
1 Europe 18561 1.087240e+07 1.169443e+06 9420 -1.147236e+06 57.69 10.76 50.75
2 LATAM 17181 1.027761e+07 1.123322e+06 9588 -1.078166e+06 57.02 10.93 55.81
3 Pacific Asia 17577 8.273744e+06 8.577534e+05 7806 -8.930119e+05 57.32 10.37 44.41
4 USCA 8579 5.066529e+06 5.643138e+05 4812 -5.291920e+05 57.15 11.14 56.09
0 Africa 3854 2.294453e+06 2.520712e+05 2158 -2.359412e+05 56.81 10.99 55.99

Order Status and Cancellations

Show Code
ord_status = df.groupby('Order Status').agg(
    Orders = ('Order Id', 'nunique'),
    Sales  = ('Sales', 'sum'),
    Profit = ('Order Profit Per Order', 'sum')
).reset_index().sort_values('Orders', ascending=False)
ord_status['ProfitMargin'] = (
    ord_status['Profit'] / ord_status['Sales'] * 100
).round(2)
display(ord_status)
Order Status Orders Sales Profit ProfitMargin
2 COMPLETE 21716 1.209531e+07 1.321736e+06 10.93
6 PENDING_PAYMENT 14382 8.106698e+06 8.438102e+05 10.41
7 PROCESSING 7901 4.504064e+06 4.948259e+05 10.99
5 PENDING 7321 4.120533e+06 4.357259e+05 10.57
1 CLOSED 7249 4.022624e+06 4.579811e+05 11.39
3 ON_HOLD 3624 1.981543e+06 2.089130e+05 10.54
8 SUSPECTED_FRAUD 1488 8.259350e+05 8.513671e+04 10.31
0 CANCELED 1367 7.443704e+05 7.534563e+04 10.12
4 PAYMENT_REVIEW 704 3.836537e+05 4.342879e+04 11.32

Market Risk Visualization

Show Code
fig, axes = plt.subplots(1, 2, figsize=(16, 6))
fig.patch.set_facecolor(BG)
fig.suptitle('Figure VII: Market Risk and Loss Exposure',
             color=NAVY, fontsize=13,
             fontfamily='serif', fontweight='bold', y=1.02)

# Plot 1: Loss value by market
ax1 = axes[0]
ax1.set_facecolor(WHITE)
loss_colors = [RED if lv < 0 else TEAL
               for lv in risk['LossValue']]
bars = ax1.barh(range(len(risk)), risk['LossValue'],
                color=RED, edgecolor='none', height=0.65)
ax1.set_yticks(range(len(risk)))
ax1.set_yticklabels(risk['Market'],
                    color=MUTED, fontsize=9)
ax1.set_xlabel('Total Loss Value (USD)', color=MUTED)
ax1.set_title('Total Loss Value by Market',
              color=NAVY, fontweight='bold',
              pad=12, fontfamily='serif')
ax1.tick_params(colors=MUTED)
for spine in ax1.spines.values():
    spine.set_color(BORDER)
ax1.grid(color=BORDER, linestyle='--',
         linewidth=0.5, axis='x')
ax1.invert_yaxis()
ax1.xaxis.set_major_formatter(
    mticker.FuncFormatter(lambda x, _: f'${x/1000:.0f}K')
)

# Plot 2: Loss order percentage by market
ax2 = axes[1]
ax2.set_facecolor(WHITE)
lp_colors = [RED if r == risk['LossOrderPct'].max()
             else '#3a5a7c'
             for r in risk['LossOrderPct']]
bars2 = ax2.barh(range(len(risk)),
                 risk['LossOrderPct'],
                 color=lp_colors, edgecolor='none',
                 height=0.65)
ax2.set_yticks(range(len(risk)))
ax2.set_yticklabels(risk['Market'],
                    color=MUTED, fontsize=9)
ax2.set_xlabel('Loss Order % of Total', color=MUTED)
ax2.set_title('Loss Order Rate by Market',
              color=NAVY, fontweight='bold',
              pad=12, fontfamily='serif')
ax2.tick_params(colors=MUTED)
for spine in ax2.spines.values():
    spine.set_color(BORDER)
ax2.grid(color=BORDER, linestyle='--',
         linewidth=0.5, axis='x')
ax2.invert_yaxis()

plt.tight_layout()
plt.show()

Market Risk Findings and Recommendations

Recommendations:

  • Conduct an immediate deep-dive into the highest-loss markets, determine whether losses are structural (pricing, costs) or operational (discounts, returns)
  • Set a maximum acceptable loss order rate per market, markets exceeding this threshold trigger a mandatory profitability review
  • Review CANCELLED and SUSPECTED_FRAUD order statuses for patterns, these may represent revenue recovery opportunities

Excel Workbook

Workbook Structure

The analysis above has been packaged into a professional Excel workbook with eight structured sheets for operational use across DataCo Global.

Dashboard Tab

Excel Dashboard DataCo Global
Sheet 1 Executive Dashboard · DataCo Global Supply Chain Analytics

Sales and Profit Analysis

Excel Sales and Profit Analysis
Sheet 2 Sales and Profit by Market and Region

Delivery Performance

Excel  Delivery Performance
Sheet 3 Delivery Performance and Late Delivery Risk

Product Analysis

Excel Product Analysis
Sheet 4 Product Category and Department Analysis

Discount and Customer Analysis

Excel Discount and Customer Analysis
Sheet 5 Discount Impact and Customer Intelligence

Conclusions

Summary of Findings

Theme Key Finding Priority
Sales by Market Revenue concentrated in identifiable markets Baseline
Delivery Performance Late delivery rate is operationally significant Critical
Product Profitability Some categories operating at negative margin High
Discount Impact High discounts correlate with lower profit High
Customer Intelligence Top customers represent concentrated revenue risk High
Shipping vs Profit Delivery delays show measurable profit impact Medium
Market Risk Loss-making orders concentrated in specific markets Critical

The Three Most Urgent Actions

1. Address Late Delivery Rate Late deliveries are not just a customer satisfaction issue, they are a profitability issue. The data shows a measurable relationship between delivery performance and order profit. Every late delivery that could have been prevented is profit that was surrendered.

2. Review Discount Policy Immediately High discount rates are destroying margin. The data is unambiguous, as discount rates increase, profit per order falls sharply and in some bands turns negative. DataCo Global is discounting itself into losses.

3. Investigate Loss-Making Markets The markets with the highest concentration of loss-making orders require immediate operational review. These are not random, loss concentration has a cause, and that cause is identifiable and fixable.

Data Quality Notes

  • Product Description was entirely missing, dropped
  • Order Zipcode was 86% missing dropped
  • Customer personal details were redacted, analysis uses Customer Id for anonymised profiling
  • Date parsing was applied to both order and shipping date columns

Final Note to Marcus Adeyemi

DataCo Global’s supply chain data tells a clear story. The revenue is there. The customers are there. The products are there. What is eroding the bottom line is a combination of late deliveries, discount abuse, and loss concentration in identifiable markets. None of these are uncontrollable forces. All three are operational decisions that can be changed. The intelligence is here. The decisions are yours.


Report prepared by: Freda Erinmwingbovo, Data Analyst
Prepared for: Marcus Adeyemi, CEO DataCo Global
Date: February 2026
Tools: Python · pandas · matplotlib · Excel · Quarto
Dataset: DataCo Supply Chain Dataset · 180,519 orders · 53 attributes

This report was produced to professional data science standards. All findings are reproducible from the code cells above. No results have been overstated or adjusted.