library('tidyverse')
library('tidymodels')
library('themis')
library('caret')
swire_cust_enriched <- readRDS("data/swire_cust_enriched.Rds")
cluster_assignments_all <- readRDS("data/cluster_assignments.Rds")
Delivery Standardization
Framework Exploration | IS 6813
Now let’s join those cluster assignments back to our original, enriched data set. This will allow us to explore the properties of each cluster.
swire_cust_clustered <-
swire_cust_enriched |>
mutate(
hclust = cluster_assignments_all[['hclust']],
kmeans = cluster_assignments_all[['kmeans']],
)
We have 3 clusters. Let’s see how each method did.
Let’s proceed with kmeans; its results are much more balanced than those from hclust.
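One way to check that balance is to count customers per cluster for each method and convert the counts to shares. The sketch below uses a small made-up data frame (`toy_clusters` and its values are hypothetical, not the Swire data); only the column names `hclust` and `kmeans` come from the code above.

```r
library(dplyr)
library(tidyr)

# Hypothetical toy assignments; the real ones live in swire_cust_clustered
toy_clusters <- data.frame(
  hclust = c("Cluster_1", "Cluster_1", "Cluster_1", "Cluster_1", "Cluster_1", "Cluster_2"),
  kmeans = c("Cluster_1", "Cluster_1", "Cluster_1", "Cluster_2", "Cluster_2", "Cluster_3")
)

# One row per (method, cluster) with its count and within-method share
cluster_sizes <- toy_clusters |>
  pivot_longer(c(hclust, kmeans), names_to = "method", values_to = "cluster") |>
  count(method, cluster) |>
  group_by(method) |>
  mutate(share = n / sum(n)) |>
  ungroup()

print(cluster_sizes)
```

A method whose largest share is close to 1 has put nearly everyone in a single cluster, which is the kind of imbalance we want to avoid.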
Cluster Exploration
Let’s start with kmeans. We’re going to look at the most important features that determine these clusters. Additionally, we’ll explore a handful of these in the data.
Here’s the function that will ingest our dataset and classify against a binary target (1 = cluster of interest, 0 = all other clusters).
We’ll perform cross validation, take the best model, fit on the entire data set, and take the top coefficients.
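For reference, here is a sketch of the penalized objective that the function below tunes, written in standard notation as we understand it from the glmnet documentation. The symbols are not defined elsewhere in this analysis: \lambda corresponds to `penalty` and \alpha to `mixture` in the model definition.

```latex
\min_{\beta_0,\,\beta}\;
-\frac{1}{N}\sum_{i=1}^{N}\Big[\, y_i\,(\beta_0 + x_i^{\top}\beta)
  - \log\!\big(1 + e^{\beta_0 + x_i^{\top}\beta}\big) \Big]
+ \lambda\Big[\tfrac{1-\alpha}{2}\,\lVert\beta\rVert_2^2 + \alpha\,\lVert\beta\rVert_1\Big]
```

With \alpha = 1 the penalty is the lasso, which can shrink coefficients exactly to zero; with \alpha = 0 it is ridge. That is why the function filters out terms whose estimate is 0.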
get_elasnet_top_features <- function(data) {
# Configure recipe
mod_rec <- recipe(target ~ ., data) |>
step_dummy(all_nominal_predictors()) |>
step_zv(all_predictors()) |>
step_normalize(all_numeric_predictors()) |>
step_downsample(target)
# Setup cross-validation folds
mod_cv <- rsample::vfold_cv(
data, v = 5, strata = target
)
# Configure tuning grid
mod_tune_grid <- grid_random(
penalty(),
mixture(),
size = 20
)
# Setup model definition
mod_def <- logistic_reg(
mixture = tune(),
penalty = tune()
) |>
set_engine("glmnet")
# Configure workflow
mod_wflw <-
workflow() |>
add_model(mod_def) |>
add_recipe(mod_rec)
# Run cross-validated tuning
set.seed(814)
mod_tune <-
mod_wflw |>
tune_grid(
resamples = mod_cv,
grid = mod_tune_grid,
metrics = metric_set(roc_auc)
)
# print(collect_metrics(mod_tune))
# Select & fit best model
best_mod <- mod_tune |> select_best(metric = "roc_auc")
final_wflw <- mod_wflw |> finalize_workflow(best_mod)
final_fit <- fit(final_wflw, data = data)
# Capture the top predictors by absolute value of coefficient
tidy(final_fit) |>
arrange(desc(abs(estimate))) |>
filter(
term != '(Intercept)' & estimate != 0.0
) |>
select(-penalty)
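}
This helper function will be used later:
get_cm_stats <- function(x, y) {
# Cross-tabulate x (rows) against y (columns) as 0/1 integers
tbl <- table(as.integer(x), as.integer(y))
print(tbl)
# caret reads table rows as predictions and columns as the reference,
# and treats the first level (0) as the positive class
print(caret::confusionMatrix(tbl))
}
Cluster 1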
Investigation
Let’s start with “Cluster 1”…
swire_cust_clustered |> filter(kmeans == "Cluster_1") |> nrow()
[1] 23264
summary(swire_cust_clustered[swire_cust_clustered$kmeans == "Cluster_1",]$ordered_total_2023)
Min. 1st Qu. Median Mean 3rd Qu. Max.
0.0 21.5 87.5 155.2 195.5 16318.5
This cluster is made up of about 23K customers (nearly 77% of the total). We see a fair bit of variance, but most customers still fall in the “marginally low sales” category. Even the 75th-percentile customer ordered fewer than 200 gallons plus cases in 2023.
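The 77% figure can be reproduced from the cluster counts. A quick sketch, using the Cluster 1 count above together with the Cluster 2 and Cluster 3 counts reported further down in this document (the variable names are ours):

```r
# Cluster sizes as reported by the nrow() calls in this document
cluster_n <- c(Cluster_1 = 23264, Cluster_2 = 7021, Cluster_3 = 37)

# Total customers and each cluster's share, in percent
total_customers <- sum(cluster_n)
cluster_share <- round(100 * cluster_n / total_customers, 2)

print(total_customers)
print(cluster_share)
```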
Let’s take a look at the features largely determining this group:
target_cluster_df <- swire_cust_clustered |>
mutate(target = factor(ifelse(kmeans == "Cluster_1", 1, 0))) |>
select(-c(customer_number, primary_group_number, kmeans, hclust))
C1_top10 <- get_elasnet_top_features(target_cluster_df)
C1_top10 %>% print(n = nrow(.))
# A tibble: 111 × 2
term estimate
<chr> <dbl>
1 order_transactions_2023 -5.83
2 order_transactions_2024 -5.49
3 delivery_transactions_2024 -5.28
4 delivery_transactions_2023 -4.84
5 delivered_gallons_cost_2023 -3.70
6 delivered_gallons_cost_2024 -3.42
7 frequent_order_type_SALES.REP -3.04
8 customer_tenure_yrs -2.03
9 cold_drink_channel_GOODS 2.03
10 delivered_cases_cost_2024 -2.01
11 local_market_partner 1.97
12 loaded_gallons_2024 -1.97
13 delivered_gallons_2024 -1.96
14 delivered_cases_cost_2023 -1.88
15 loaded_gallons_2023 -1.87
16 ordered_gallons_2023 -1.81
17 ordered_gallons_2024 -1.69
18 delivered_gallons_2023 -1.49
19 return_frequency_2023 -1.46
20 frequent_order_type_OTHER -1.36
21 cold_drink_channel_BULK.TRADE -1.34
22 neighbor_local_market_partners -1.25
23 return_frequency_2024 -1.15
24 year_2023 -1.09
25 ramp_up_mon -1.06
26 trade_channel_FAST.CASUAL.DINING -0.975
27 ordered_total_2023 -0.954
28 order_transaction_std_2024 -0.917
29 loaded_total_2023 -0.913
30 delivered_total_2023 -0.854
31 order_transaction_std_2023 -0.756
32 delivered_total_2024 -0.684
33 trade_channel_EDUCATION 0.645
34 loaded_total_2024 -0.600
35 ordered_cases_2023 -0.597
36 cold_drink_channel_WELLNESS -0.579
37 sub_trade_channel_COMPREHENSIVE.PROVIDER -0.563
38 loaded_cases_2023 -0.546
39 delivered_cases_2023 -0.537
40 lat -0.532
41 co2_customer -0.468
42 ordered_total_2024 -0.462
43 neighbor_avg_order_transactions_2023 0.429
44 trade_channel_PROFESSIONAL.SERVICES 0.416
45 sub_trade_channel_OTHER.ACADEMIC.INSTITUTION -0.408
46 trade_channel_VEHICLE.CARE 0.386
47 frequent_order_type_EDI -0.379
48 sub_trade_channel_OTHER.VEHICLE.CARE 0.368
49 trade_channel_SUPERSTORE -0.367
50 sub_trade_channel_OTHER.PROFESSIONAL.SERVICES 0.366
51 trade_channel_OUTDOOR.ACTIVITIES 0.354
52 trade_channel_MOBILE.RETAIL 0.348
53 sub_trade_channel_ONLINE.STORE -0.347
54 frequent_order_type_MYCOKE.LEGACY -0.347
55 trade_channel_GENERAL -0.332
56 state_Massachusetts 0.331
57 trade_channel_SPECIALIZED.GOODS 0.323
58 delivered_cases_2024 -0.323
59 trade_channel_COMPREHENSIVE.DINING -0.323
60 trade_channel_HEALTHCARE -0.322
61 neighbor_avg_ordered_total_2024 0.320
62 sub_trade_channel_OTHER.GOODS 0.315
63 neighbor_avg_dist_km -0.304
64 trade_channel_GENERAL.RETAILER 0.301
65 sub_trade_channel_NON.RESTAURANT.EDUCATION 0.287
66 trade_channel_LICENSED.HOSPITALITY 0.286
67 sub_trade_channel_FSR...MISC -0.276
68 sub_trade_channel_MOBILE.RETAIL 0.276
69 sub_trade_channel_OTHER.HEALTHCARE -0.274
70 trade_channel_OTHER.DINING...BEVERAGE 0.271
71 trade_channel_ACCOMMODATION 0.247
72 loaded_cases_2024 -0.241
73 sub_trade_channel_OTHER.LICENSED.HOSPITALITY 0.227
74 state_Kentucky -0.227
75 sub_trade_channel_OTHER.DINING 0.213
76 sub_trade_channel_RECREATION.FILM -0.203
77 sub_trade_channel_OTHER.ACCOMMODATION 0.202
78 trade_channel_GOURMET.FOOD.RETAILER 0.198
79 sub_trade_channel_RECREATION.ARENA -0.191
80 neighbor_avg_return_freq -0.176
81 sub_trade_channel_OTHER.GOURMET.FOOD 0.165
82 sub_trade_channel_OTHER.OUTDOOR.ACTIVITIES 0.149
83 sub_trade_channel_CHICKEN.FAST.FOOD -0.143
84 ordered_cases_2024 -0.142
85 state_Louisiana -0.124
86 cold_drink_channel_WORKPLACE 0.111
87 neighbor_primary_group_count -0.110
88 sub_trade_channel_OTHER.RECREATION 0.110
89 neighbor_avg_order_transaction_std_2023 -0.0977
90 lon -0.0885
91 sub_trade_channel_HOME...HARDWARE 0.0862
92 neighbor_avg_order_transaction_std_2024 -0.0763
93 sub_trade_channel_FAITH 0.0756
94 primary_group_customers_2023 -0.0720
95 neighbor_avg_order_transactions_2024 -0.0657
96 trade_channel_PUBLIC.SECTOR..NON.MILITARY. -0.0538
97 sub_trade_channel_MEXICAN.FAST.FOOD 0.0449
98 sub_trade_channel_OTHER.PUBLIC.SECTOR -0.0441
99 cold_drink_channel_DINING -0.0381
100 trade_channel_ACTIVITIES -0.0318
101 primary_group_customers_2024 -0.0263
102 sub_trade_channel_PIZZA.FAST.FOOD -0.0249
103 sub_trade_channel_OTHER.FAST.FOOD 0.0247
104 state_Maryland -0.0230
105 frequent_order_type_MYCOKE360 0.0130
106 sub_trade_channel_HIGH.SCHOOL 0.0105
107 sub_trade_channel_BURGER.FAST.FOOD -0.00952
108 trade_channel_DEFENSE -0.00613
109 sub_trade_channel_OTHER.MILITARY -0.00392
110 sub_trade_channel_RESIDENTIAL -0.00355
111 sub_trade_channel_MISC 0.00278
As expected, measures of volume are the predominant theme in defining this group.
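Because the recipe normalizes every predictor (dummy columns included), each coefficient is on a log-odds-per-standard-deviation scale. A rough way to read them, assuming the usual logistic regression interpretation, is to exponentiate. The sketch below does this for three coefficients copied from the table above. Keep in mind that elastic net shrinks coefficients, so these are indicative magnitudes rather than unbiased effect sizes.

```r
# Coefficients copied from the Cluster 1 output above
coefs <- c(
  order_transactions_2023 = -5.83,
  cold_drink_channel_GOODS = 2.03,
  local_market_partner = 1.97
)

# exp(coefficient) = multiplicative change in the odds of being in Cluster 1
# for a one-standard-deviation increase in the normalized predictor
odds_multiplier <- exp(coefs)
print(signif(odds_multiplier, 3))
```

Values far below 1 (as for `order_transactions_2023`) push a customer out of Cluster 1; values well above 1 pull them in.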
Definition
This is clearly the “WHITE TRUCK” group. This group has the following features:
- Local market partner
- They don’t order through sales reps
- They usually come from “GOODS” channels, and not “BULK TRADE”
Quick rules for Segmentation
Now the question is, how do we derive a fairly simple “rule of thumb” to classify these customers?
What if we explore these features a bit:
C1_expl <-
swire_cust_clustered |>
mutate(
cluster_1 = kmeans == "Cluster_1",
order_type_flag = frequent_order_type != 'SALES REP',
cold_drink_flag = case_when(
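# NOTE: because of the `|`, this condition is TRUE for every channel except 'BULK TRADE';
# the results below were produced with it as written
cold_drink_channel == 'GOODS' | cold_drink_channel != 'BULK TRADE' ~ 1,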
TRUE ~ 0
),
lmp_flag = local_market_partner,
)
Now let’s compare who was flagged by these properties and who is mapped to “Cluster 1”:
get_cm_stats(C1_expl$cluster_1, C1_expl$cold_drink_flag)
0 1
0 799 6259
1 521 22743
Confusion Matrix and Statistics
0 1
0 799 6259
1 521 22743
Accuracy : 0.7764
95% CI : (0.7717, 0.7811)
No Information Rate : 0.9565
P-Value [Acc > NIR] : 1
Kappa : 0.1267
Mcnemar's Test P-Value : <2e-16
Sensitivity : 0.60530
Specificity : 0.78419
Pos Pred Value : 0.11320
Neg Pred Value : 0.97760
Prevalence : 0.04353
Detection Rate : 0.02635
Detection Prevalence : 0.23277
Balanced Accuracy : 0.69475
'Positive' Class : 0
This looks good at first: accuracy is about 78%. But we’re getting too many false positives; 6,259 customers outside Cluster 1 are flagged as well.
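Since `get_cm_stats()` passes the cluster as the table rows, caret treats the flag as the reference and 0 as the positive class, so its Sensitivity and Specificity lines describe the flag rather than the cluster. A small sketch that recomputes the rates with cluster membership as the truth, using only the four counts from the table above (the variable names are ours):

```r
# Counts copied from the table above:
# rows = cluster_1 (0/1), columns = cold_drink_flag (0/1)
cm <- matrix(
  c(799, 6259,
    521, 22743),
  nrow = 2, byrow = TRUE,
  dimnames = list(cluster_1 = c("0", "1"), flag = c("0", "1"))
)

# Overall agreement between the flag and cluster membership
accuracy <- sum(diag(cm)) / sum(cm)
# Share of Cluster 1 customers that the flag captures
cluster_1_captured <- cm["1", "1"] / sum(cm["1", ])
# Share of customers outside Cluster 1 that the flag also picks up
others_flagged <- cm["0", "1"] / sum(cm["0", ])

print(round(c(
  accuracy = accuracy,
  cluster_1_captured = cluster_1_captured,
  others_flagged = others_flagged
), 3))
```

Read this way, the flag catches almost all of Cluster 1 but also flags most customers outside it, which is where the false positives come from.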
get_cm_stats(C1_expl$cluster_1, C1_expl$order_type_flag)
0 1
0 5523 1535
1 14406 8858
Confusion Matrix and Statistics
0 1
0 5523 1535
1 14406 8858
Accuracy : 0.4743
95% CI : (0.4686, 0.4799)
No Information Rate : 0.6572
P-Value [Acc > NIR] : 1
Kappa : 0.0999
Mcnemar's Test P-Value : <2e-16
Sensitivity : 0.2771
Specificity : 0.8523
Pos Pred Value : 0.7825
Neg Pred Value : 0.3808
Prevalence : 0.6572
Detection Rate : 0.1821
Detection Prevalence : 0.2328
Balanced Accuracy : 0.5647
'Positive' Class : 0
This one isn’t so good. There are a fair number of misclassifications (accuracy is only about 47%). This gives the impression that the feature is only helpful in conjunction with something else.
get_cm_stats(C1_expl$cluster_1, C1_expl$lmp_flag)
0 1
0 1808 5250
1 1293 21971
Confusion Matrix and Statistics
0 1
0 1808 5250
1 1293 21971
Accuracy : 0.7842
95% CI : (0.7795, 0.7888)
No Information Rate : 0.8977
P-Value [Acc > NIR] : 1
Kappa : 0.2493
Mcnemar's Test P-Value : <2e-16
Sensitivity : 0.58304
Specificity : 0.80713
Pos Pred Value : 0.25616
Neg Pred Value : 0.94442
Prevalence : 0.10227
Detection Rate : 0.05963
Detection Prevalence : 0.23277
Balanced Accuracy : 0.69509
'Positive' Class : 0
This one is fairly good, but still too many false positives.
Let’s see whether requiring at least 2 of those conditions is helpful!
C1_expl <- C1_expl |>
mutate(cond_2plus = (lmp_flag + order_type_flag + cold_drink_flag) > 1)
get_cm_stats(C1_expl$cluster_1, C1_expl$cond_2plus)
0 1
0 1803 5255
1 1006 22258
Confusion Matrix and Statistics
0 1
0 1803 5255
1 1006 22258
Accuracy : 0.7935
95% CI : (0.7889, 0.7981)
No Information Rate : 0.9074
P-Value [Acc > NIR] : 1
Kappa : 0.2685
Mcnemar's Test P-Value : <2e-16
Sensitivity : 0.64187
Specificity : 0.80900
Pos Pred Value : 0.25545
Neg Pred Value : 0.95676
Prevalence : 0.09264
Detection Rate : 0.05946
Detection Prevalence : 0.23277
Balanced Accuracy : 0.72543
'Positive' Class : 0
This does fairly well, but there’s something we’re not capturing! Let’s count the misclassified customers by cold drink channel:
C1_expl |> filter(
(cluster_1 == 0 & cond_2plus == TRUE) |
(cluster_1 == 1 & cond_2plus == FALSE)
) |> group_by(cold_drink_channel) |> count()
# A tibble: 9 × 2
# Groups: cold_drink_channel [9]
cold_drink_channel n
<fct> <int>
1 ACCOMMODATION 223
2 BULK TRADE 460
3 CONVENTIONAL 3
4 DINING 3876
5 EVENT 601
6 GOODS 487
7 PUBLIC SECTOR 284
8 WELLNESS 197
9 WORKPLACE 130
Cluster 3
Investigation
Let’s move on to “Cluster 3”…
swire_cust_clustered |> filter(kmeans == "Cluster_3") |> nrow()
[1] 37
summary(swire_cust_clustered[swire_cust_clustered$kmeans == "Cluster_3",]$ordered_total_2023)
Min. 1st Qu. Median Mean 3rd Qu. Max.
28565 39596 59692 93188 108038 376588
This cluster is made up of just 37 customers (well under 1% of the total). This group is characterized by a very high floor (the minimum is 28,565) and, relative to its center, a tighter spread than we see in the other two clusters.
Let’s take a look at the features largely determining this group:
target_cluster_df <- swire_cust_clustered |>
mutate(target = factor(ifelse(kmeans == "Cluster_3", 1, 0))) |>
select(-c(customer_number, primary_group_number, kmeans, hclust))
C3_top10 <- get_elasnet_top_features(target_cluster_df)
C3_top10 %>% print(n = nrow(.))
# A tibble: 94 × 2
term estimate
<chr> <dbl>
1 cold_drink_channel_DINING -0.131
2 trade_channel_FAST.CASUAL.DINING -0.0832
3 order_transactions_2023 0.0741
4 delivery_transactions_2023 0.0729
5 cold_drink_channel_GOODS -0.0716
6 primary_group_customers_2023 -0.0646
7 cold_drink_channel_BULK.TRADE 0.0622
8 order_transactions_2024 0.0614
9 delivery_transactions_2024 0.0610
10 primary_group_customers_2024 -0.0585
11 neighbor_local_market_partners 0.0559
12 cold_drink_channel_PUBLIC.SECTOR -0.0525
13 trade_channel_COMPREHENSIVE.DINING -0.0523
14 sub_trade_channel_FSR...MISC -0.0523
15 co2_customer -0.0518
16 local_market_partner -0.0489
17 trade_channel_OTHER.DINING...BEVERAGE -0.0462
18 sub_trade_channel_OTHER.DINING -0.0462
19 trade_channel_ACCOMMODATION -0.0461
20 sub_trade_channel_OTHER.ACCOMMODATION -0.0461
21 year_2023 0.0446
22 trade_channel_GENERAL 0.0414
23 sub_trade_channel_COMPREHENSIVE.PROVIDER 0.0400
24 trade_channel_GENERAL.RETAILER -0.0399
25 sub_trade_channel_OTHER.OUTDOOR.ACTIVITIES -0.0398
26 trade_channel_OUTDOOR.ACTIVITIES -0.0391
27 frequent_order_type_SALES.REP 0.0387
28 trade_channel_LICENSED.HOSPITALITY -0.0381
29 sub_trade_channel_OTHER.LICENSED.HOSPITALITY -0.0381
30 sub_trade_channel_OTHER.RECREATION -0.0380
31 sub_trade_channel_PIZZA.FAST.FOOD -0.0350
32 sub_trade_channel_OTHER.GENERAL.RETAIL -0.0330
33 sub_trade_channel_RECREATION.ARENA 0.0329
34 frequent_order_type_MYCOKE360 -0.0325
35 sub_trade_channel_OTHER.PROFESSIONAL.SERVICES -0.0303
36 trade_channel_PROFESSIONAL.SERVICES -0.0303
37 customer_tenure_yrs 0.0289
38 state_Maryland -0.0285
39 cold_drink_channel_WORKPLACE 0.0274
40 neighbor_avg_dist_km -0.0263
41 sub_trade_channel_OTHER.ACADEMIC.INSTITUTION -0.0256
42 order_transaction_std_2023 0.0243
43 sub_trade_channel_RECREATION.FILM -0.0223
44 trade_channel_SPECIALIZED.GOODS -0.0207
45 sub_trade_channel_OTHER.GOODS -0.0207
46 trade_channel_SUPERSTORE 0.0197
47 trade_channel_ACTIVITIES 0.0196
48 sub_trade_channel_ONLINE.STORE 0.0195
49 order_transaction_std_2024 0.0185
50 sub_trade_channel_MEXICAN.FAST.FOOD -0.0176
51 ramp_up_mon 0.0175
52 frequent_order_type_MYCOKE.LEGACY -0.0173
53 cold_drink_channel_WELLNESS -0.0170
54 trade_channel_HEALTHCARE -0.0168
55 sub_trade_channel_OTHER.HEALTHCARE -0.0168
56 sub_trade_channel_GAME.CENTER 0.0166
57 return_frequency_2023 0.0160
58 neighbor_avg_return_freq 0.0156
59 delivered_gallons_cost_2024 0.0147
60 delivered_gallons_cost_2023 0.0145
61 cold_drink_channel_EVENT 0.0144
62 trade_channel_BULK.TRADE 0.0134
63 sub_trade_channel_BULK.TRADE 0.0133
64 trade_channel_TRAVEL 0.0123
65 delivered_cases_cost_2023 0.0117
66 return_frequency_2024 0.0112
67 year_2024 0.00951
68 delivered_cases_cost_2024 0.00938
69 sub_trade_channel_BURGER.FAST.FOOD -0.00881
70 trade_channel_DEFENSE -0.00878
71 sub_trade_channel_OTHER.MILITARY -0.00878
72 sub_trade_channel_CRUISE 0.00833
73 ordered_gallons_2024 0.00829
74 loaded_gallons_2024 0.00820
75 ordered_gallons_2023 0.00819
76 delivered_gallons_2024 0.00817
77 delivered_gallons_2023 0.00816
78 loaded_gallons_2023 0.00816
79 ordered_total_2023 0.00788
80 loaded_total_2023 0.00780
81 delivered_total_2023 0.00772
82 ordered_cases_2023 0.00671
83 loaded_cases_2023 0.00667
84 delivered_cases_2023 0.00662
85 loaded_total_2024 0.00637
86 ordered_total_2024 0.00635
87 delivered_total_2024 0.00634
88 loaded_cases_2024 0.00537
89 delivered_cases_2024 0.00535
90 ordered_cases_2024 0.00535
91 trade_channel_RECREATION -0.00316
92 sub_trade_channel_FAITH -0.00259
93 sub_trade_channel_CHICKEN.FAST.FOOD -0.00173
94 sub_trade_channel_RECREATION.PARK 0.00153
Definition
We’d clearly define this group as the “RED TRUCK” group. However, 37 customers is far too few to say that only they should be supported under this business model. Who else, for example, boasts the potential for “RED TRUCK”?
Quick rules for Segmentation
So what are the simple “rules of thumb” to classify these customers?
Cluster 2
Investigation
We now circle back to “Cluster 2”.
swire_cust_clustered |> filter(kmeans == "Cluster_2") |> nrow()
[1] 7021
summary(swire_cust_clustered[swire_cust_clustered$kmeans == "Cluster_2",]$ordered_total_2023)
Min. 1st Qu. Median Mean 3rd Qu. Max.
0 480 815 1659 1526 60162
This cluster is made up of just over 7K customers (just over 23% of the total). Interestingly, this group doesn’t boast the same ceiling as Cluster 3, but its center point is much higher than Cluster 1’s (a median of 815 versus 87.5).
Let’s take a look at the features largely determining this group:
target_cluster_df <- swire_cust_clustered |>
mutate(target = factor(ifelse(kmeans == "Cluster_2", 1, 0))) |>
select(-c(customer_number, primary_group_number, kmeans, hclust))
C2_top10 <- get_elasnet_top_features(target_cluster_df)
→ A | warning: from glmnet C++ code (error code -39); Convergence for 39th lambda value not reached after maxit=100000 iterations; solutions for larger lambdas returned
→ B | warning: from glmnet C++ code (error code -43); Convergence for 43th lambda value not reached after maxit=100000 iterations; solutions for larger lambdas returned
→ C | warning: from glmnet C++ code (error code -42); Convergence for 42th lambda value not reached after maxit=100000 iterations; solutions for larger lambdas returned
There were issues with some computations A: x1 B: x1 C: x1
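These warnings mean glmnet hit its iteration cap (`maxit=100000`) for some of the smallest penalty values on a few resamples. One possible mitigation, sketched here as an assumption and not something this analysis ran, is to raise `maxit` through the engine arguments. The name `mod_def_more_iter` and the value `1e6` are illustrative only, and more iterations are not guaranteed to remove the warnings.

```r
library(tidymodels)

# Same model definition as in get_elasnet_top_features(), with one change:
# maxit is forwarded to glmnet::glmnet() as an engine argument.
# 1e6 is an illustrative value, ten times the cap shown in the warnings.
mod_def_more_iter <- logistic_reg(
  mixture = tune(),
  penalty = tune()
) |>
  set_engine("glmnet", maxit = 1e6)

print(mod_def_more_iter)
```

Swapping this definition into the workflow would leave the rest of the tuning code unchanged.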
C2_top10 %>% print(n = nrow(.))
# A tibble: 89 × 2
term estimate
<chr> <dbl>
1 delivery_transactions_2024 1.22
2 order_transactions_2024 1.18
3 delivered_gallons_cost_2023 1.13
4 order_transactions_2023 1.12
5 delivery_transactions_2023 1.12
6 delivered_gallons_cost_2024 1.01
7 cold_drink_channel_GOODS -0.410
8 order_transaction_std_2024 0.380
9 local_market_partner -0.375
10 cold_drink_channel_BULK.TRADE 0.307
11 order_transaction_std_2023 0.305
12 customer_tenure_yrs 0.289
13 ramp_up_mon 0.273
14 delivered_total_2023 -0.264
15 loaded_total_2023 -0.252
16 frequent_order_type_SALES.REP 0.249
17 ordered_gallons_2023 -0.248
18 delivered_gallons_2023 -0.240
19 ordered_total_2023 -0.234
20 loaded_gallons_2023 -0.223
21 return_frequency_2023 0.200
22 ordered_total_2024 -0.184
23 delivered_gallons_2024 -0.177
24 ordered_gallons_2024 -0.164
25 trade_channel_EDUCATION -0.160
26 loaded_gallons_2024 -0.159
27 return_frequency_2024 0.131
28 sub_trade_channel_PIZZA.FAST.FOOD 0.130
29 loaded_total_2024 -0.123
30 neighbor_local_market_partners 0.112
31 sub_trade_channel_OTHER.GENERAL.RETAIL -0.110
32 sub_trade_channel_COMPREHENSIVE.PROVIDER 0.105
33 trade_channel_LICENSED.HOSPITALITY -0.103
34 loaded_cases_2023 -0.102
35 sub_trade_channel_OTHER.LICENSED.HOSPITALITY -0.102
36 delivered_cases_2023 -0.0999
37 delivered_total_2024 -0.0927
38 sub_trade_channel_MEXICAN.FAST.FOOD 0.0873
39 frequent_order_type_MYCOKE360 -0.0820
40 cold_drink_channel_WELLNESS 0.0776
41 sub_trade_channel_NON.RESTAURANT.EDUCATION -0.0775
42 trade_channel_OTHER.DINING...BEVERAGE -0.0736
43 sub_trade_channel_OTHER.DINING -0.0732
44 trade_channel_SPECIALIZED.GOODS -0.0660
45 sub_trade_channel_OTHER.GOODS -0.0659
46 trade_channel_ACCOMMODATION -0.0653
47 sub_trade_channel_OTHER.ACADEMIC.INSTITUTION 0.0650
48 sub_trade_channel_OTHER.ACCOMMODATION -0.0648
49 ordered_cases_2023 -0.0644
50 ordered_cases_2024 -0.0630
51 sub_trade_channel_MISC -0.0583
52 sub_trade_channel_RECREATION.FILM 0.0534
53 trade_channel_TRAVEL 0.0517
54 trade_channel_MOBILE.RETAIL -0.0487
55 sub_trade_channel_MOBILE.RETAIL -0.0485
56 primary_group_customers_2023 0.0476
57 trade_channel_VEHICLE.CARE -0.0418
58 sub_trade_channel_OTHER.VEHICLE.CARE -0.0417
59 trade_channel_HEALTHCARE 0.0401
60 sub_trade_channel_OTHER.HEALTHCARE 0.0394
61 sub_trade_channel_FRATERNITY -0.0373
62 neighbor_avg_ordered_total_2024 -0.0353
63 trade_channel_BULK.TRADE 0.0328
64 trade_channel_PROFESSIONAL.SERVICES -0.0312
65 sub_trade_channel_OTHER.PROFESSIONAL.SERVICES -0.0306
66 trade_channel_GOURMET.FOOD.RETAILER -0.0304
67 sub_trade_channel_BULK.TRADE 0.0303
68 sub_trade_channel_OTHER.GOURMET.FOOD -0.0301
69 neighbor_primary_group_count 0.0269
70 sub_trade_channel_RECREATION.PARK 0.0247
71 sub_trade_channel_CHICKEN.FAST.FOOD 0.0209
72 sub_trade_channel_BURGER.FAST.FOOD 0.0201
73 cold_drink_channel_WORKPLACE -0.0184
74 lon -0.0182
75 trade_channel_RECREATION 0.0141
76 loaded_cases_2024 -0.0134
77 trade_channel_SUPERSTORE 0.0101
78 sub_trade_channel_ONLINE.STORE 0.00967
79 neighbor_avg_order_transactions_2024 -0.00861
80 trade_channel_COMPREHENSIVE.DINING 0.00729
81 sub_trade_channel_FSR...MISC 0.00676
82 trade_channel_GENERAL 0.00583
83 sub_trade_channel_OTHER.TRAVEL 0.00551
84 primary_group_customers_2024 0.00532
85 state_Massachusetts -0.00396
86 trade_channel_PUBLIC.SECTOR..NON.MILITARY. 0.00288
87 sub_trade_channel_OTHER.PUBLIC.SECTOR 0.00237
88 sub_trade_channel_BOOKS...OFFICE -0.00216
89 frequent_order_type_OTHER 0.000936
Definition
We’d clearly define this group as the “RED TRUCK” group. However, this is far too few to say that only they should be supported under this business model. Who else, for example, boasts the potential for “RED TRUCK”?
Quick rules for Segmentation
So what are the simple “rules of thumb” to classify these customers?