Delivery Standardization

Framework Exploration | IS 6813

Author

Adam Bushman (u6049169)

Published

March 30, 2025

library('tidyverse')
library('tidymodels')
library('themis')
library('caret')

swire_cust_enriched <- readRDS("data/swire_cust_enriched.Rds")
cluster_assignments_all <- readRDS("data/cluster_assignments.Rds")

Now let’s join those cluster assignments back to our original, enriched data set. This will allow us to explore the properties of each cluster.

swire_cust_clustered <- 
    swire_cust_enriched |> 
    mutate(
        hclust = cluster_assignments_all[['hclust']], 
        kmeans = cluster_assignments_all[['kmeans']], 
    )
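The `mutate()` call above attaches the assignments by position, so it is only correct if the saved vectors are in the same row order as the enriched data. A minimal sketch of a key-based alternative follows. It uses toy data and assumes the assignments were stored alongside a customer key, which the source does not confirm; `cust`, `assignments`, and `clustered` are hypothetical names.

```r
# Toy stand-in for the enriched customer data
cust <- data.frame(
    customer_number = c(101, 102, 103, 104),
    ordered_total_2023 = c(50, 900, 1200, 40000)
)

# ASSUMPTION: assignments carry a customer key (hypothetical layout)
assignments <- data.frame(
    customer_number = c(104, 101, 103, 102),   # deliberately shuffled
    kmeans = c("Cluster_3", "Cluster_1", "Cluster_2", "Cluster_2")
)

# Join on the key so row order no longer matters
clustered <- merge(cust, assignments, by = "customer_number")

# Every customer should have received exactly one assignment
stopifnot(nrow(clustered) == nrow(cust), !anyNA(clustered$kmeans))
print(clustered)
```

If the assignments really are plain vectors with no key, the positional approach is the only option, and the row order of both objects has to be trusted.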

We have 3 clusters. Let’s see how each method distributed customers across them.

Let’s proceed by exploring kmeans; its results are much more balanced across clusters than those of hclust.
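One way to check that balance is to tabulate the share of customers per cluster for each method. The sketch below uses made-up assignments (`toy` is a hypothetical stand-in for the `hclust` and `kmeans` columns), so the shares are illustrative only.

```r
# Toy assignments standing in for the hclust and kmeans columns
toy <- data.frame(
    hclust = c("Cluster_1", "Cluster_1", "Cluster_1", "Cluster_1", "Cluster_2", "Cluster_3"),
    kmeans = c("Cluster_1", "Cluster_1", "Cluster_1", "Cluster_2", "Cluster_2", "Cluster_3")
)

# Share of customers per cluster, one column per method
shares <- sapply(toy, function(col) prop.table(table(col)))
print(round(shares, 2))
```

A method whose largest share is close to 1 is putting almost everyone in a single cluster, which leaves little to explore.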

Cluster Exploration

Let’s start with kmeans. We’re going to look at the most important features that determine these clusters. Additionally, we’ll explore a handful of these in the data.

Here’s the function that will ingest our dataset and classify against a binary target (1 = cluster of interest, 0 = all other clusters).

We’ll perform cross validation, take the best model, fit on the entire data set, and take the top coefficients.

get_elasnet_top_features <- function(data) {
    # Configure recipe
    mod_rec <- recipe(target ~ ., data) |>
        step_dummy(all_nominal_predictors()) |>
        step_zv(all_predictors()) |>
        step_normalize(all_numeric_predictors()) |>
        step_downsample(target)

    # Setup cross-validation folds (seeded so the fold assignment is reproducible)
    set.seed(814)
    mod_cv <- rsample::vfold_cv(
        data, v = 5, strata = target
    )

    # Configure tuning grid
    mod_tune_grid <- grid_random(
        penalty(),
        mixture(),
        size = 20
    )

    # Setup model definition
    mod_def <- logistic_reg(
        mixture = tune(),
        penalty = tune()
    ) |>
        set_engine("glmnet")

    # Configure workflow
    mod_wflw <-
        workflow() |>
        add_model(mod_def) |>
        add_recipe(mod_rec)

    # Run cross-validated tuning
    set.seed(814)
    mod_tune <-
        mod_wflw |>
        tune_grid(
            resamples = mod_cv,
            grid = mod_tune_grid,
            metrics = metric_set(roc_auc)
        )

    # print(collect_metrics(mod_tune))

    # Select & fit best model
    best_mod <- mod_tune |> select_best(metric = "roc_auc")
    final_wflw <- mod_wflw |> finalize_workflow(best_mod)
    final_fit <- fit(final_wflw, data = data)

    # Capture the top predictors by absolute value of coefficient
    tidy(final_fit) |>
        arrange(desc(abs(estimate))) |>
        filter(
            term != '(Intercept)' & estimate != 0.0
        ) |>
        select(-penalty)
}

Here’s a helper function we’ll use later to print a confusion matrix and its summary statistics:

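get_cm_stats <- function(x, y) {
    # Cross-tabulate x (rows) against y (columns) as 0/1 integers
    tbl <- table(as.integer(x), as.integer(y))
    print(tbl)
    # Note: caret treats table rows as predictions and columns as the
    # reference, and uses the first level (0) as the 'Positive' class
    print(caret::confusionMatrix(tbl))
}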

Cluster 1

Investigation

Let’s start with “Cluster 1”…

swire_cust_clustered |> filter(kmeans == "Cluster_1") |> nrow()
[1] 23264
summary(swire_cust_clustered[swire_cust_clustered$kmeans == "Cluster_1",]$ordered_total_2023)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
    0.0    21.5    87.5   155.2   195.5 16318.5 

This cluster is made up of 23K customers (nearly 77% of the total). We see a fair bit of variance, but most customers still fall in the “marginally low sales” category. Even the 75th-percentile customer ordered fewer than 200 gallons plus cases in 2023.
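The same comparison can be made for all clusters in one pass rather than one `summary()` call per cluster. This is a sketch on made-up numbers (`toy` and `by_cluster` are hypothetical names), using base R’s default quantile definition.

```r
# Toy data: two clusters with four customers each
toy <- data.frame(
    kmeans = rep(c("Cluster_1", "Cluster_2"), each = 4),
    ordered_total_2023 = c(0, 20, 90, 200, 400, 800, 1500, 6000)
)

# One row of quartiles per cluster
by_cluster <- t(sapply(
    split(toy$ordered_total_2023, toy$kmeans),
    quantile, probs = c(0.25, 0.5, 0.75)
))
print(by_cluster)
```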

Let’s take a look at the features largely determining this group:

target_cluster_df <- swire_cust_clustered |>
    mutate(target = factor(ifelse(kmeans == "Cluster_1", 1, 0))) |>
    select(-c(customer_number, primary_group_number, kmeans, hclust))

C1_top10 <- get_elasnet_top_features(target_cluster_df)

C1_top10 %>% print(n = nrow(.))
# A tibble: 111 × 2
    term                                          estimate
    <chr>                                            <dbl>
  1 order_transactions_2023                       -5.83   
  2 order_transactions_2024                       -5.49   
  3 delivery_transactions_2024                    -5.28   
  4 delivery_transactions_2023                    -4.84   
  5 delivered_gallons_cost_2023                   -3.70   
  6 delivered_gallons_cost_2024                   -3.42   
  7 frequent_order_type_SALES.REP                 -3.04   
  8 customer_tenure_yrs                           -2.03   
  9 cold_drink_channel_GOODS                       2.03   
 10 delivered_cases_cost_2024                     -2.01   
 11 local_market_partner                           1.97   
 12 loaded_gallons_2024                           -1.97   
 13 delivered_gallons_2024                        -1.96   
 14 delivered_cases_cost_2023                     -1.88   
 15 loaded_gallons_2023                           -1.87   
 16 ordered_gallons_2023                          -1.81   
 17 ordered_gallons_2024                          -1.69   
 18 delivered_gallons_2023                        -1.49   
 19 return_frequency_2023                         -1.46   
 20 frequent_order_type_OTHER                     -1.36   
 21 cold_drink_channel_BULK.TRADE                 -1.34   
 22 neighbor_local_market_partners                -1.25   
 23 return_frequency_2024                         -1.15   
 24 year_2023                                     -1.09   
 25 ramp_up_mon                                   -1.06   
 26 trade_channel_FAST.CASUAL.DINING              -0.975  
 27 ordered_total_2023                            -0.954  
 28 order_transaction_std_2024                    -0.917  
 29 loaded_total_2023                             -0.913  
 30 delivered_total_2023                          -0.854  
 31 order_transaction_std_2023                    -0.756  
 32 delivered_total_2024                          -0.684  
 33 trade_channel_EDUCATION                        0.645  
 34 loaded_total_2024                             -0.600  
 35 ordered_cases_2023                            -0.597  
 36 cold_drink_channel_WELLNESS                   -0.579  
 37 sub_trade_channel_COMPREHENSIVE.PROVIDER      -0.563  
 38 loaded_cases_2023                             -0.546  
 39 delivered_cases_2023                          -0.537  
 40 lat                                           -0.532  
 41 co2_customer                                  -0.468  
 42 ordered_total_2024                            -0.462  
 43 neighbor_avg_order_transactions_2023           0.429  
 44 trade_channel_PROFESSIONAL.SERVICES            0.416  
 45 sub_trade_channel_OTHER.ACADEMIC.INSTITUTION  -0.408  
 46 trade_channel_VEHICLE.CARE                     0.386  
 47 frequent_order_type_EDI                       -0.379  
 48 sub_trade_channel_OTHER.VEHICLE.CARE           0.368  
 49 trade_channel_SUPERSTORE                      -0.367  
 50 sub_trade_channel_OTHER.PROFESSIONAL.SERVICES  0.366  
 51 trade_channel_OUTDOOR.ACTIVITIES               0.354  
 52 trade_channel_MOBILE.RETAIL                    0.348  
 53 sub_trade_channel_ONLINE.STORE                -0.347  
 54 frequent_order_type_MYCOKE.LEGACY             -0.347  
 55 trade_channel_GENERAL                         -0.332  
 56 state_Massachusetts                            0.331  
 57 trade_channel_SPECIALIZED.GOODS                0.323  
 58 delivered_cases_2024                          -0.323  
 59 trade_channel_COMPREHENSIVE.DINING            -0.323  
 60 trade_channel_HEALTHCARE                      -0.322  
 61 neighbor_avg_ordered_total_2024                0.320  
 62 sub_trade_channel_OTHER.GOODS                  0.315  
 63 neighbor_avg_dist_km                          -0.304  
 64 trade_channel_GENERAL.RETAILER                 0.301  
 65 sub_trade_channel_NON.RESTAURANT.EDUCATION     0.287  
 66 trade_channel_LICENSED.HOSPITALITY             0.286  
 67 sub_trade_channel_FSR...MISC                  -0.276  
 68 sub_trade_channel_MOBILE.RETAIL                0.276  
 69 sub_trade_channel_OTHER.HEALTHCARE            -0.274  
 70 trade_channel_OTHER.DINING...BEVERAGE          0.271  
 71 trade_channel_ACCOMMODATION                    0.247  
 72 loaded_cases_2024                             -0.241  
 73 sub_trade_channel_OTHER.LICENSED.HOSPITALITY   0.227  
 74 state_Kentucky                                -0.227  
 75 sub_trade_channel_OTHER.DINING                 0.213  
 76 sub_trade_channel_RECREATION.FILM             -0.203  
 77 sub_trade_channel_OTHER.ACCOMMODATION          0.202  
 78 trade_channel_GOURMET.FOOD.RETAILER            0.198  
 79 sub_trade_channel_RECREATION.ARENA            -0.191  
 80 neighbor_avg_return_freq                      -0.176  
 81 sub_trade_channel_OTHER.GOURMET.FOOD           0.165  
 82 sub_trade_channel_OTHER.OUTDOOR.ACTIVITIES     0.149  
 83 sub_trade_channel_CHICKEN.FAST.FOOD           -0.143  
 84 ordered_cases_2024                            -0.142  
 85 state_Louisiana                               -0.124  
 86 cold_drink_channel_WORKPLACE                   0.111  
 87 neighbor_primary_group_count                  -0.110  
 88 sub_trade_channel_OTHER.RECREATION             0.110  
 89 neighbor_avg_order_transaction_std_2023       -0.0977 
 90 lon                                           -0.0885 
 91 sub_trade_channel_HOME...HARDWARE              0.0862 
 92 neighbor_avg_order_transaction_std_2024       -0.0763 
 93 sub_trade_channel_FAITH                        0.0756 
 94 primary_group_customers_2023                  -0.0720 
 95 neighbor_avg_order_transactions_2024          -0.0657 
 96 trade_channel_PUBLIC.SECTOR..NON.MILITARY.    -0.0538 
 97 sub_trade_channel_MEXICAN.FAST.FOOD            0.0449 
 98 sub_trade_channel_OTHER.PUBLIC.SECTOR         -0.0441 
 99 cold_drink_channel_DINING                     -0.0381 
100 trade_channel_ACTIVITIES                      -0.0318 
101 primary_group_customers_2024                  -0.0263 
102 sub_trade_channel_PIZZA.FAST.FOOD             -0.0249 
103 sub_trade_channel_OTHER.FAST.FOOD              0.0247 
104 state_Maryland                                -0.0230 
105 frequent_order_type_MYCOKE360                  0.0130 
106 sub_trade_channel_HIGH.SCHOOL                  0.0105 
107 sub_trade_channel_BURGER.FAST.FOOD            -0.00952
108 trade_channel_DEFENSE                         -0.00613
109 sub_trade_channel_OTHER.MILITARY              -0.00392
110 sub_trade_channel_RESIDENTIAL                 -0.00355
111 sub_trade_channel_MISC                         0.00278

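As expected, measures of volume are the predominant theme in defining this group. The largest coefficients are negative, so higher transaction counts and volumes make Cluster 1 membership less likely.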
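With 111 non-zero terms, it can help to keep only the strongest few and record their direction. Because the recipe normalizes the predictors, the magnitudes are roughly comparable. The sketch below uses a hand-copied subset of the coefficients printed above; `coefs` and `top_n_terms` are hypothetical names, not part of the original analysis.

```r
# A few coefficients copied from the Cluster 1 table
coefs <- data.frame(
    term = c("order_transactions_2023", "frequent_order_type_SALES.REP",
             "cold_drink_channel_GOODS", "local_market_partner",
             "sub_trade_channel_MISC"),
    estimate = c(-5.83, -3.04, 2.03, 1.97, 0.00278)
)

top_n_terms <- function(df, n = 3) {
    # Rank by magnitude, keep the n largest, and label the sign
    ranked <- df[order(-abs(df$estimate)), ]
    ranked <- head(ranked, n)
    ranked$direction <- ifelse(ranked$estimate > 0, "more likely", "less likely")
    ranked
}

print(top_n_terms(coefs, n = 3))
```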

Definition

This is clearly the “WHITE TRUCK” group. This group has the following features:

  1. Local market partner
  2. They don’t order through sales reps
  3. They usually come from “GOODS” channels, and not “BULK TRADE”

Quick rules for Segmentation

Now the question is, how do we derive a fairly simple “rule of thumb” to classify these customers?

What if we explore these features a bit:

C1_expl <- 
    swire_cust_clustered |> 
    mutate(
        cluster_1 = kmeans == "Cluster_1", 
        # Flag customers whose most frequent order type is not SALES REP
        order_type_flag = frequent_order_type != 'SALES REP', 
        # Flag every channel except BULK TRADE (this already includes GOODS)
        cold_drink_flag = case_when(
            cold_drink_channel != 'BULK TRADE' ~ 1, 
            TRUE ~ 0
        ), 
        lmp_flag = local_market_partner
    )

Now let’s compare the customers flagged by each of these properties against those assigned to “Cluster 1”:

get_cm_stats(C1_expl$cluster_1, C1_expl$cold_drink_flag)
   
        0     1
  0   799  6259
  1   521 22743
Confusion Matrix and Statistics

   
        0     1
  0   799  6259
  1   521 22743
                                          
               Accuracy : 0.7764          
                 95% CI : (0.7717, 0.7811)
    No Information Rate : 0.9565          
    P-Value [Acc > NIR] : 1               
                                          
                  Kappa : 0.1267          
                                          
 Mcnemar's Test P-Value : <2e-16          
                                          
            Sensitivity : 0.60530         
            Specificity : 0.78419         
         Pos Pred Value : 0.11320         
         Neg Pred Value : 0.97760         
             Prevalence : 0.04353         
         Detection Rate : 0.02635         
   Detection Prevalence : 0.23277         
      Balanced Accuracy : 0.69475         
                                          
       'Positive' Class : 0               
                                          

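This does a good job of capturing Cluster 1 (22,743 of its 23,264 customers are flagged) and accuracy is about 78%, but we’re getting too many false positives: 6,259 customers outside Cluster 1 are flagged as well.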
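To make the printed statistics easier to read, here is a sketch that recomputes a few of them by hand from the counts in the table above. It follows caret’s convention of treating table rows as predictions, columns as the reference, and “0” as the positive class, then reads the same counts the other way round, with the flag as the prediction of Cluster 1 membership. The variable names are illustrative.

```r
# Counts copied from the cold_drink_flag table above:
# rows = cluster_1 (0/1), columns = cold_drink_flag (0/1)
tbl <- matrix(c(799, 521, 6259, 22743), nrow = 2,
              dimnames = list(cluster_1 = c("0", "1"), flag = c("0", "1")))

# caret's reading: rows are predictions, columns are reference, positive = "0"
accuracy    <- sum(diag(tbl)) / sum(tbl)
sensitivity <- tbl["0", "0"] / sum(tbl[, "0"])
specificity <- tbl["1", "1"] / sum(tbl[, "1"])

# Reading the flag as the prediction of Cluster 1 membership instead
recall_c1    <- tbl["1", "1"] / sum(tbl["1", ])   # share of Cluster 1 that is flagged
precision_c1 <- tbl["1", "1"] / sum(tbl[, "1"])   # share of flagged that is Cluster 1

print(round(c(accuracy = accuracy, sensitivity = sensitivity,
              specificity = specificity, recall_c1 = recall_c1,
              precision_c1 = precision_c1), 4))
```

Under the second reading, the flag’s recall for Cluster 1 equals the “Neg Pred Value” in the caret output, and its precision equals the “Specificity”.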

get_cm_stats(C1_expl$cluster_1, C1_expl$order_type_flag)
   
        0     1
  0  5523  1535
  1 14406  8858
Confusion Matrix and Statistics

   
        0     1
  0  5523  1535
  1 14406  8858
                                          
               Accuracy : 0.4743          
                 95% CI : (0.4686, 0.4799)
    No Information Rate : 0.6572          
    P-Value [Acc > NIR] : 1               
                                          
                  Kappa : 0.0999          
                                          
 Mcnemar's Test P-Value : <2e-16          
                                          
            Sensitivity : 0.2771          
            Specificity : 0.8523          
         Pos Pred Value : 0.7825          
         Neg Pred Value : 0.3808          
             Prevalence : 0.6572          
         Detection Rate : 0.1821          
   Detection Prevalence : 0.2328          
      Balanced Accuracy : 0.5647          
                                          
       'Positive' Class : 0               
                                          

This one isn’t so good. There are a fair number of misclassifications, which gives the impression that the feature is only helpful in conjunction with something else.

get_cm_stats(C1_expl$cluster_1, C1_expl$lmp_flag)
   
        0     1
  0  1808  5250
  1  1293 21971
Confusion Matrix and Statistics

   
        0     1
  0  1808  5250
  1  1293 21971
                                          
               Accuracy : 0.7842          
                 95% CI : (0.7795, 0.7888)
    No Information Rate : 0.8977          
    P-Value [Acc > NIR] : 1               
                                          
                  Kappa : 0.2493          
                                          
 Mcnemar's Test P-Value : <2e-16          
                                          
            Sensitivity : 0.58304         
            Specificity : 0.80713         
         Pos Pred Value : 0.25616         
         Neg Pred Value : 0.94442         
             Prevalence : 0.10227         
         Detection Rate : 0.05963         
   Detection Prevalence : 0.23277         
      Balanced Accuracy : 0.69509         
                                          
       'Positive' Class : 0               
                                          

This one is fairly good, but still too many false positives.

Let’s see whether requiring at least 2 of those conditions is helpful!

C1_expl <- C1_expl |> 
    mutate(cond_2plus = (lmp_flag + order_type_flag + cold_drink_flag) > 1)
get_cm_stats(C1_expl$cluster_1, C1_expl$cond_2plus)
   
        0     1
  0  1803  5255
  1  1006 22258
Confusion Matrix and Statistics

   
        0     1
  0  1803  5255
  1  1006 22258
                                          
               Accuracy : 0.7935          
                 95% CI : (0.7889, 0.7981)
    No Information Rate : 0.9074          
    P-Value [Acc > NIR] : 1               
                                          
                  Kappa : 0.2685          
                                          
 Mcnemar's Test P-Value : <2e-16          
                                          
            Sensitivity : 0.64187         
            Specificity : 0.80900         
         Pos Pred Value : 0.25545         
         Neg Pred Value : 0.95676         
             Prevalence : 0.09264         
         Detection Rate : 0.05946         
   Detection Prevalence : 0.23277         
      Balanced Accuracy : 0.72543         
                                          
       'Positive' Class : 0               
                                          

This does fairly well, but there’s something we’re not capturing! Let’s break down the misclassified customers by cold drink channel:

C1_expl |> filter(
    (cluster_1 == 0 & cond_2plus == TRUE) |
    (cluster_1 == 1 & cond_2plus == FALSE) 
) |> group_by(cold_drink_channel) |> count()
# A tibble: 9 × 2
# Groups:   cold_drink_channel [9]
  cold_drink_channel     n
  <fct>              <int>
1 ACCOMMODATION        223
2 BULK TRADE           460
3 CONVENTIONAL           3
4 DINING              3876
5 EVENT                601
6 GOODS                487
7 PUBLIC SECTOR        284
8 WELLNESS             197
9 WORKPLACE            130
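Raw counts partly reflect how many customers each channel has, so a misclassification rate per channel may be easier to compare. A minimal sketch on made-up rows follows; `toy`, `misclassified`, and `rates` are hypothetical names, and the values do not come from the real data.

```r
# Toy rows: actual Cluster 1 membership vs. the 2-of-3 rule
toy <- data.frame(
    cold_drink_channel = c("DINING", "DINING", "DINING", "DINING",
                           "GOODS", "GOODS", "EVENT", "EVENT"),
    cluster_1  = c(TRUE, FALSE, FALSE, TRUE, TRUE, TRUE, FALSE, TRUE),
    cond_2plus = c(TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, FALSE, FALSE)
)

# A row is misclassified when the rule disagrees with the cluster label
toy$misclassified <- toy$cluster_1 != toy$cond_2plus

# Share of each channel's customers that the rule gets wrong
rates <- tapply(toy$misclassified, toy$cold_drink_channel, mean)
print(sort(rates, decreasing = TRUE))
```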

Cluster 3

Investigation

Let’s move on to “Cluster 3”…

swire_cust_clustered |> filter(kmeans == "Cluster_3") |> nrow()
[1] 37
summary(swire_cust_clustered[swire_cust_clustered$kmeans == "Cluster_3",]$ordered_total_2023)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
  28565   39596   59692   93188  108038  376588 

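This cluster is made up of just 37 customers (well under 1% of the total). This group is characterized by a very high floor: even its smallest customer ordered more than 28,000 gallons plus cases in 2023, well above the maximum in Cluster 1. Relative to its median, the spread is also tighter than in the other clusters.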

Let’s take a look at the features largely determining this group:

target_cluster_df <- swire_cust_clustered |>
    mutate(target = factor(ifelse(kmeans == "Cluster_3", 1, 0))) |>
    select(-c(customer_number, primary_group_number, kmeans, hclust))

C3_top10 <- get_elasnet_top_features(target_cluster_df)

C3_top10 %>% print(n = nrow(.))
# A tibble: 94 × 2
   term                                          estimate
   <chr>                                            <dbl>
 1 cold_drink_channel_DINING                     -0.131  
 2 trade_channel_FAST.CASUAL.DINING              -0.0832 
 3 order_transactions_2023                        0.0741 
 4 delivery_transactions_2023                     0.0729 
 5 cold_drink_channel_GOODS                      -0.0716 
 6 primary_group_customers_2023                  -0.0646 
 7 cold_drink_channel_BULK.TRADE                  0.0622 
 8 order_transactions_2024                        0.0614 
 9 delivery_transactions_2024                     0.0610 
10 primary_group_customers_2024                  -0.0585 
11 neighbor_local_market_partners                 0.0559 
12 cold_drink_channel_PUBLIC.SECTOR              -0.0525 
13 trade_channel_COMPREHENSIVE.DINING            -0.0523 
14 sub_trade_channel_FSR...MISC                  -0.0523 
15 co2_customer                                  -0.0518 
16 local_market_partner                          -0.0489 
17 trade_channel_OTHER.DINING...BEVERAGE         -0.0462 
18 sub_trade_channel_OTHER.DINING                -0.0462 
19 trade_channel_ACCOMMODATION                   -0.0461 
20 sub_trade_channel_OTHER.ACCOMMODATION         -0.0461 
21 year_2023                                      0.0446 
22 trade_channel_GENERAL                          0.0414 
23 sub_trade_channel_COMPREHENSIVE.PROVIDER       0.0400 
24 trade_channel_GENERAL.RETAILER                -0.0399 
25 sub_trade_channel_OTHER.OUTDOOR.ACTIVITIES    -0.0398 
26 trade_channel_OUTDOOR.ACTIVITIES              -0.0391 
27 frequent_order_type_SALES.REP                  0.0387 
28 trade_channel_LICENSED.HOSPITALITY            -0.0381 
29 sub_trade_channel_OTHER.LICENSED.HOSPITALITY  -0.0381 
30 sub_trade_channel_OTHER.RECREATION            -0.0380 
31 sub_trade_channel_PIZZA.FAST.FOOD             -0.0350 
32 sub_trade_channel_OTHER.GENERAL.RETAIL        -0.0330 
33 sub_trade_channel_RECREATION.ARENA             0.0329 
34 frequent_order_type_MYCOKE360                 -0.0325 
35 sub_trade_channel_OTHER.PROFESSIONAL.SERVICES -0.0303 
36 trade_channel_PROFESSIONAL.SERVICES           -0.0303 
37 customer_tenure_yrs                            0.0289 
38 state_Maryland                                -0.0285 
39 cold_drink_channel_WORKPLACE                   0.0274 
40 neighbor_avg_dist_km                          -0.0263 
41 sub_trade_channel_OTHER.ACADEMIC.INSTITUTION  -0.0256 
42 order_transaction_std_2023                     0.0243 
43 sub_trade_channel_RECREATION.FILM             -0.0223 
44 trade_channel_SPECIALIZED.GOODS               -0.0207 
45 sub_trade_channel_OTHER.GOODS                 -0.0207 
46 trade_channel_SUPERSTORE                       0.0197 
47 trade_channel_ACTIVITIES                       0.0196 
48 sub_trade_channel_ONLINE.STORE                 0.0195 
49 order_transaction_std_2024                     0.0185 
50 sub_trade_channel_MEXICAN.FAST.FOOD           -0.0176 
51 ramp_up_mon                                    0.0175 
52 frequent_order_type_MYCOKE.LEGACY             -0.0173 
53 cold_drink_channel_WELLNESS                   -0.0170 
54 trade_channel_HEALTHCARE                      -0.0168 
55 sub_trade_channel_OTHER.HEALTHCARE            -0.0168 
56 sub_trade_channel_GAME.CENTER                  0.0166 
57 return_frequency_2023                          0.0160 
58 neighbor_avg_return_freq                       0.0156 
59 delivered_gallons_cost_2024                    0.0147 
60 delivered_gallons_cost_2023                    0.0145 
61 cold_drink_channel_EVENT                       0.0144 
62 trade_channel_BULK.TRADE                       0.0134 
63 sub_trade_channel_BULK.TRADE                   0.0133 
64 trade_channel_TRAVEL                           0.0123 
65 delivered_cases_cost_2023                      0.0117 
66 return_frequency_2024                          0.0112 
67 year_2024                                      0.00951
68 delivered_cases_cost_2024                      0.00938
69 sub_trade_channel_BURGER.FAST.FOOD            -0.00881
70 trade_channel_DEFENSE                         -0.00878
71 sub_trade_channel_OTHER.MILITARY              -0.00878
72 sub_trade_channel_CRUISE                       0.00833
73 ordered_gallons_2024                           0.00829
74 loaded_gallons_2024                            0.00820
75 ordered_gallons_2023                           0.00819
76 delivered_gallons_2024                         0.00817
77 delivered_gallons_2023                         0.00816
78 loaded_gallons_2023                            0.00816
79 ordered_total_2023                             0.00788
80 loaded_total_2023                              0.00780
81 delivered_total_2023                           0.00772
82 ordered_cases_2023                             0.00671
83 loaded_cases_2023                              0.00667
84 delivered_cases_2023                           0.00662
85 loaded_total_2024                              0.00637
86 ordered_total_2024                             0.00635
87 delivered_total_2024                           0.00634
88 loaded_cases_2024                              0.00537
89 delivered_cases_2024                           0.00535
90 ordered_cases_2024                             0.00535
91 trade_channel_RECREATION                      -0.00316
92 sub_trade_channel_FAITH                       -0.00259
93 sub_trade_channel_CHICKEN.FAST.FOOD           -0.00173
94 sub_trade_channel_RECREATION.PARK              0.00153

Definition

We’d clearly define this group as the obvious “RED TRUCK” group. However, 37 customers is far too few to say that they alone should be supported in this business model. Who else, for example, boasts the potential for “RED TRUCK”?

Quick rules for Segmentation

So what are the simple “rules of thumb” to classify these customers?

Cluster 2

Investigation

We now circle back to “Cluster 2”.

swire_cust_clustered |> filter(kmeans == "Cluster_2") |> nrow()
[1] 7021
summary(swire_cust_clustered[swire_cust_clustered$kmeans == "Cluster_2",]$ordered_total_2023)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
      0     480     815    1659    1526   60162 

This cluster is made up of just over 7K customers (just over 23% of the total). Interestingly, this group doesn’t boast the same ceiling as Cluster 3 but has a much higher center point than Cluster 1 (a median of 815 versus 87.5).

Let’s take a look at the features largely determining this group:

target_cluster_df <- swire_cust_clustered |>
    mutate(target = factor(ifelse(kmeans == "Cluster_2", 1, 0))) |>
    select(-c(customer_number, primary_group_number, kmeans, hclust))

C2_top10 <- get_elasnet_top_features(target_cluster_df)
→ A | warning: from glmnet C++ code (error code -39); Convergence for 39th lambda value not reached after maxit=100000 iterations; solutions for larger lambdas returned
→ B | warning: from glmnet C++ code (error code -43); Convergence for 43th lambda value not reached after maxit=100000 iterations; solutions for larger lambdas returned
→ C | warning: from glmnet C++ code (error code -42); Convergence for 42th lambda value not reached after maxit=100000 iterations; solutions for larger lambdas returned
There were issues with some computations   A: x1   B: x1   C: x1
C2_top10 %>% print(n = nrow(.))
# A tibble: 89 × 2
   term                                           estimate
   <chr>                                             <dbl>
 1 delivery_transactions_2024                     1.22    
 2 order_transactions_2024                        1.18    
 3 delivered_gallons_cost_2023                    1.13    
 4 order_transactions_2023                        1.12    
 5 delivery_transactions_2023                     1.12    
 6 delivered_gallons_cost_2024                    1.01    
 7 cold_drink_channel_GOODS                      -0.410   
 8 order_transaction_std_2024                     0.380   
 9 local_market_partner                          -0.375   
10 cold_drink_channel_BULK.TRADE                  0.307   
11 order_transaction_std_2023                     0.305   
12 customer_tenure_yrs                            0.289   
13 ramp_up_mon                                    0.273   
14 delivered_total_2023                          -0.264   
15 loaded_total_2023                             -0.252   
16 frequent_order_type_SALES.REP                  0.249   
17 ordered_gallons_2023                          -0.248   
18 delivered_gallons_2023                        -0.240   
19 ordered_total_2023                            -0.234   
20 loaded_gallons_2023                           -0.223   
21 return_frequency_2023                          0.200   
22 ordered_total_2024                            -0.184   
23 delivered_gallons_2024                        -0.177   
24 ordered_gallons_2024                          -0.164   
25 trade_channel_EDUCATION                       -0.160   
26 loaded_gallons_2024                           -0.159   
27 return_frequency_2024                          0.131   
28 sub_trade_channel_PIZZA.FAST.FOOD              0.130   
29 loaded_total_2024                             -0.123   
30 neighbor_local_market_partners                 0.112   
31 sub_trade_channel_OTHER.GENERAL.RETAIL        -0.110   
32 sub_trade_channel_COMPREHENSIVE.PROVIDER       0.105   
33 trade_channel_LICENSED.HOSPITALITY            -0.103   
34 loaded_cases_2023                             -0.102   
35 sub_trade_channel_OTHER.LICENSED.HOSPITALITY  -0.102   
36 delivered_cases_2023                          -0.0999  
37 delivered_total_2024                          -0.0927  
38 sub_trade_channel_MEXICAN.FAST.FOOD            0.0873  
39 frequent_order_type_MYCOKE360                 -0.0820  
40 cold_drink_channel_WELLNESS                    0.0776  
41 sub_trade_channel_NON.RESTAURANT.EDUCATION    -0.0775  
42 trade_channel_OTHER.DINING...BEVERAGE         -0.0736  
43 sub_trade_channel_OTHER.DINING                -0.0732  
44 trade_channel_SPECIALIZED.GOODS               -0.0660  
45 sub_trade_channel_OTHER.GOODS                 -0.0659  
46 trade_channel_ACCOMMODATION                   -0.0653  
47 sub_trade_channel_OTHER.ACADEMIC.INSTITUTION   0.0650  
48 sub_trade_channel_OTHER.ACCOMMODATION         -0.0648  
49 ordered_cases_2023                            -0.0644  
50 ordered_cases_2024                            -0.0630  
51 sub_trade_channel_MISC                        -0.0583  
52 sub_trade_channel_RECREATION.FILM              0.0534  
53 trade_channel_TRAVEL                           0.0517  
54 trade_channel_MOBILE.RETAIL                   -0.0487  
55 sub_trade_channel_MOBILE.RETAIL               -0.0485  
56 primary_group_customers_2023                   0.0476  
57 trade_channel_VEHICLE.CARE                    -0.0418  
58 sub_trade_channel_OTHER.VEHICLE.CARE          -0.0417  
59 trade_channel_HEALTHCARE                       0.0401  
60 sub_trade_channel_OTHER.HEALTHCARE             0.0394  
61 sub_trade_channel_FRATERNITY                  -0.0373  
62 neighbor_avg_ordered_total_2024               -0.0353  
63 trade_channel_BULK.TRADE                       0.0328  
64 trade_channel_PROFESSIONAL.SERVICES           -0.0312  
65 sub_trade_channel_OTHER.PROFESSIONAL.SERVICES -0.0306  
66 trade_channel_GOURMET.FOOD.RETAILER           -0.0304  
67 sub_trade_channel_BULK.TRADE                   0.0303  
68 sub_trade_channel_OTHER.GOURMET.FOOD          -0.0301  
69 neighbor_primary_group_count                   0.0269  
70 sub_trade_channel_RECREATION.PARK              0.0247  
71 sub_trade_channel_CHICKEN.FAST.FOOD            0.0209  
72 sub_trade_channel_BURGER.FAST.FOOD             0.0201  
73 cold_drink_channel_WORKPLACE                  -0.0184  
74 lon                                           -0.0182  
75 trade_channel_RECREATION                       0.0141  
76 loaded_cases_2024                             -0.0134  
77 trade_channel_SUPERSTORE                       0.0101  
78 sub_trade_channel_ONLINE.STORE                 0.00967 
79 neighbor_avg_order_transactions_2024          -0.00861 
80 trade_channel_COMPREHENSIVE.DINING             0.00729 
81 sub_trade_channel_FSR...MISC                   0.00676 
82 trade_channel_GENERAL                          0.00583 
83 sub_trade_channel_OTHER.TRAVEL                 0.00551 
84 primary_group_customers_2024                   0.00532 
85 state_Massachusetts                           -0.00396 
86 trade_channel_PUBLIC.SECTOR..NON.MILITARY.     0.00288 
87 sub_trade_channel_OTHER.PUBLIC.SECTOR          0.00237 
88 sub_trade_channel_BOOKS...OFFICE              -0.00216 
89 frequent_order_type_OTHER                      0.000936

Definition

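Unlike Cluster 1, membership in this group is associated with more order and delivery transactions and higher delivered gallons cost. These customers tend not to be local market partners, lean toward “BULK TRADE” rather than “GOODS” channels, and more often order through sales reps. That makes this group a natural place to look for additional “RED TRUCK” potential beyond the 37 customers in Cluster 3.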

Quick rules for Segmentation

So what are the simple “rules of thumb” to classify these customers?