Folded Concave Penalized (FCP) selection is a regression method designed to address some of the limitations of LASSO in predictive modeling. Both LASSO and FCP tend to perform best when the true model is sparse, that is, when many of the candidate predictors are irrelevant. Like LASSO, FCP shrinks coefficients relative to least squares estimates while simultaneously carrying out variable selection. The key distinction is that LASSO applies uniform shrinkage across all predictors, whereas FCP shrinks important predictors to a lesser degree, reducing potential bias in their parameter estimates. In this post, I compare the performance of FCP selection with LASSO and other approaches using a dataset available in Viya for Learners.
Review of LASSO and FCP selection
LASSO and Folded Concave Penalized (FCP) selection are both regularization methods that combine coefficient shrinkage with variable selection, making them well-suited for sparse data sets with many irrelevant predictors. LASSO applies uniform shrinkage across all predictors, which can introduce bias in the estimates of truly important variables. FCP, by contrast, uses a nonconvex penalty that shrinks unimportant predictors more aggressively while allowing important ones to retain larger coefficients, thereby reducing bias in their parameter estimates.
In practice, LASSO is valued for its simplicity and widespread use, while FCP potentially offers greater accuracy (in parameter estimates and predictions) by selectively shrinking predictors. For a broader review of FCP, LASSO, and Elastic Net selection (and guidance on when to use them) see my previous post: Folded concave penalized selection methods for linear regression…demystified!
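The contrast between the two penalties can be made concrete. LASSO's penalty grows linearly in the coefficient magnitude, so its shrinkage force (the penalty's derivative) is the constant \(\lambda\) for every coefficient, large or small. The SCAD penalty of Fan and Li, a standard folded concave penalty, is usually defined through its derivative:

\[
p'_\lambda(t) \;=\; \lambda\left[\, I(t \le \lambda) \;+\; \frac{(a\lambda - t)_+}{(a-1)\lambda}\, I(t > \lambda) \right], \qquad t \ge 0,\; a > 2,
\]

with \(a = 3.7\) a common default. The derivative equals \(\lambda\) for small coefficients (LASSO-like shrinkage), tapers linearly for moderate ones, and is exactly zero once \(t \ge a\lambda\). Large, important coefficients are therefore left essentially unshrunk, which is the source of the bias reduction described above.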
Analysis data: PVA_Donors
The PVA_Donors dataset is publicly available and comes from a charity’s effort to re‑engage lapsing donors. The continuous target variable, Target_D, represents the dollar amount of donations received in response to solicitations. A version of this dataset is accessible in SAS Viya for Learners. Many of the roughly 19,000 rows had missing values for donation amounts, so the final sample size was N=4,804, with N=3,608 for training and N=1,196 for validation.
For the analysis, I’ll use the SAS Viya procedure PROC REGSELECT, which supports multiple variable selection methods. The code below partitions the data into training and validation sets, specifies categorical predictors, and fits the model with different selection strategies:
proc regselect data=mylib.pva_final2;
   partition role=role (train="train" validate="valid");
   class URBANICITY SES HOME_OWNER DONOR_GENDER OVERLAY_SOURCE
         RECENCY_STATUS_96NK;
   model TARGET_D = MONTHS_SINCE_ORIGIN IN_HOUSE PUBLISHED_PHONE
         MOR_HIT_RATE MEDIAN_HOME_VALUE MEDIAN_HOUSEHOLD_INCOME PCT_OWNER_OCCUPIED
         PCT_MALE_MILITARY PCT_MALE_VETERANS PCT_VIETNAM_VETERANS PCT_WWII_VETERANS
         PEP_STAR RECENT_STAR_STATUS FREQUENCY_STATUS_97NK RECENT_RESPONSE_PROP
         RECENT_AVG_GIFT_AMT RECENT_CARD_RESPONSE_PROP RECENT_AVG_CARD_GIFT_AMT
         RECENT_RESPONSE_COUNT RECENT_CARD_RESPONSE_COUNT MONTHS_SINCE_LAST_PROM_RESP
         LIFETIME_CARD_PROM LIFETIME_PROM LIFETIME_GIFT_AMOUNT LIFETIME_GIFT_COUNT
         LIFETIME_AVG_GIFT_AMT LIFETIME_GIFT_RANGE LIFETIME_MAX_GIFT_AMT
         LIFETIME_MIN_GIFT_AMT CARD_PROM_12 NUMBER_PROM_12 MONTHS_SINCE_LAST_GIFT
         MONTHS_SINCE_FIRST_GIFT FILE_AVG_GIFT FILE_CARD_GIFT PER_CAPITA_INCOME
         IM_DONOR_AGE IM_INCOME_GROUP IM_WEALTH_RATING LAST_GIFT_AMT URBANICITY SES
         HOME_OWNER DONOR_GENDER OVERLAY_SOURCE RECENCY_STATUS_96NK / ss3 vif;
   /* selection method goes here */
run;
Selection methods tested
These comparisons were chosen to highlight the strengths and limitations of different selection strategies. The baseline model without selection provides a reference point. Stepwise selection represents a traditional approach that balances fit with parsimony. LASSO is widely used for its simplicity and efficiency, while SCAD (implemented with both NLP and MILP solvers) illustrates how folded concave penalties can reduce bias in important predictors. Together, these methods offer a spectrum of approaches, making it easier to see how FCP compares in practice.
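In PROC REGSELECT, each strategy is requested with a SELECTION statement in place of the placeholder comment in the code above. The statements below are a sketch: METHOD=STEPWISE and METHOD=LASSO follow the documented syntax, but the SCAD line and any solver suboptions are written from memory and should be verified against the PROC REGSELECT documentation for your Viya release.

```
/* baseline full model: simply omit the SELECTION statement */

selection method=stepwise;   /* traditional stepwise selection */

selection method=lasso;      /* LASSO */

/* FCP with the SCAD penalty; the exact option names for choosing
   the NLP vs. MILP solver are an assumption -- check the REGSELECT
   documentation for your release */
selection method=scad;
```

Because the PARTITION statement defines a validation role, the procedure reports validation fit statistics for whichever method is requested, which is what the comparisons below rely on.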
Comparisons
The table below reports a range of fit statistics for each selection method, with the best value in each row highlighted for easy comparison. These include error metrics, information criteria, and training versus validation ASE. Among them, the Average Squared Error (ASE) on the validation data gives the clearest picture of predictive performance on unseen cases, so the comparisons below focus on that statistic.
By this measure, FCP with the MILP solver achieved the lowest validation ASE (77.42), outperforming all other approaches. FCP with the NLP solver followed closely, while LASSO produced a validation ASE of 80.82, better than stepwise selection but slightly worse than the full model with no selection. The full model, however, retained many predictors with very high p‑values (above 0.45), suggesting limited statistical relevance despite its marginally lower ASE. Additional fit statistics such as AIC and SBC are included for completeness, but they are secondary: the validation ASE shows that folded concave penalization can provide a meaningful edge in predictive accuracy. PVA_Donors is exactly the kind of noisy, high‑dimensional data with many weak predictors where FCP is most effective, because it shrinks unimportant variables aggressively while preserving the signal from important ones, making it a powerful alternative to traditional selection methods.
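For reference, the validation ASE is simply the mean squared prediction error over the holdout cases:

\[
\mathrm{ASE}_{\text{valid}} \;=\; \frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2,
\]

where \(n\) is the number of validation observations (1,196 here), \(y_i\) is the observed TARGET_D, and \(\hat{y}_i\) is the model's prediction. Lower is better, and since TARGET_D is measured in dollars, the units are squared dollars.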
Links
Find more articles from SAS Global Enablement and Learning here.