Hello,
I have aggregate data (no person-level data; only counts and percentages) that includes categorical predictor variables with multiple levels (age, in the screenshot below, has 5 levels) across 3 levels of an outcome (n% across the top of the screenshot below). My mentor is asking me to calculate standard differences for each categorical predictor variable across the 3 levels of the outcome (realizing that we'll have to compare 2 vs 1 and 3 vs 1, instead of 1 vs 2 vs 3). I see that to use PROC PSMATCH, the predictor variables have to be binary (0/1), but I don't think I can do this with the aggregated data that I have.
Does anyone know how I can calculate standard differences for categorical predictor variables as shown below?
Thank you in advance for your help!
Please post a link to your definition of "standard difference".
My searches turn up too many radically different things containing that phrase to want to spend any time guessing which is applicable.
If you are unable to post data step code describing your data then please at least post simple text in a text box opened on the forum with the </> icon that appears above the message window.
And indicate which are "outcomes". Four identical column headings obfuscate which column is which and what each is for.
BTW there are procedures that do tests across multiple levels but the data needs to be in a reasonable form for specific tests.
I'm guessing that the OP knows about and has read Yang and Dalton ("A unified approach to measuring the effect size between two groups using SAS", SAS...), which shows how to calculate "standardized difference scores" for two groups. The definition is on pp 2-3. Their macro is available from the Cleveland Clinic at https://www.lerner.ccf.org/quantitative-health/documents/stddiff.sas
It sounds like the OP wants to compute a similar measure for more than two groups.
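For reference, here are the two-group definitions usually given for categorical variables. I am writing them from memory of the standard formulas, so please check them against pp 2-3 of the paper before relying on them. The symbols ($p_T$, $p_C$, $K$, $S$) are my own notation, not necessarily the paper's.

```latex
% Binary variable: p_T and p_C are the proportions with the characteristic
% in the two groups being compared.
d = \frac{p_T - p_C}{\sqrt{\dfrac{p_T(1-p_T) + p_C(1-p_C)}{2}}}

% Variable with K levels: drop one level and collect the remaining K-1
% proportions of each group into vectors T and C.
d = \sqrt{(T - C)^{\top} S^{-1} (T - C)},
\qquad
S_{kl} =
\begin{cases}
  \dfrac{p_{Tk}(1-p_{Tk}) + p_{Ck}(1-p_{Ck})}{2}, & k = l \\[2ex]
  -\dfrac{p_{Tk}\,p_{Tl} + p_{Ck}\,p_{Cl}}{2}, & k \neq l
\end{cases}
```

Both forms compare exactly two groups, so with three outcome levels they would have to be applied once per pair (for example 2 vs 1 and 3 vs 1).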
Hi, I have read that article and had been trying to use the macro without success until this morning. I think I got it to work with my data (count data, see below). I knew that I could only compare outcomes 2 vs 1 and 3 vs 1, but I was trying to figure out how to calculate a single standardized difference for all of age (5 levels). I realized this morning that I had to calculate an SMD for each level of age (coded as binary 0/1). I'm waiting on feedback from my mentor.
Here is the data:
Data age;
Input level age_grp severity count;
cards;
0 0 0 34967
0 1 0 109368
0 0 1 674
0 1 1 5992
0 0 2 133
0 1 2 4790
1 0 0 44727
1 1 0 99608
1 0 1 1293
1 1 1 5373
1 0 2 1342
1 1 2 3581
2 0 0 29442
2 1 0 114893
2 0 1 1074
2 1 1 5592
2 0 2 2035
2 1 2 2888
3 0 0 30257
3 1 0 114078
3 0 1 3077
3 1 1 3589
3 0 2 1314
3 1 2 3609
4 0 0 4942
4 1 0 139393
4 0 1 548
4 1 1 6118
4 0 2 99
4 1 2 4824
;
run;
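In case it helps to cross-check the macro output, here is a minimal sketch that computes the per-level standardized differences directly from these counts with the usual binary formula, d = (p1 - p0) / sqrt((p1(1-p1) + p0(1-p0)) / 2). It rests on assumptions the post does not state: that severity 0 is the reference outcome, and that the age_grp=0 rows are the members of each age level (I infer this only because those counts add up to the severity totals across the five levels). The data set and variable names props, wide, smd, smd_1v0 and smd_2v0 are made up for illustration.

```sas
/* Proportion of each severity group that falls in each age level.        */
/* ASSUMPTION: age_grp=0 marks membership in the level; swap to age_grp=1 */
/* if the indicator is coded the other way round.                         */
proc sql;
  create table props as
  select level, severity,
         sum(count * (age_grp = 0)) / sum(count) as p
  from age
  group by level, severity
  order by level, severity;
quit;

/* One row per age level, with columns p0, p1, p2 (one per severity). */
proc transpose data=props out=wide(drop=_name_) prefix=p;
  by level;
  id severity;
  var p;
run;

/* Standardized difference of each severity group vs. severity 0. */
data smd;
  set wide;
  smd_1v0 = (p1 - p0) / sqrt((p1*(1 - p1) + p0*(1 - p0)) / 2);
  smd_2v0 = (p2 - p0) / sqrt((p2*(1 - p2) + p0*(1 - p0)) / 2);
run;

proc print data=smd noobs;
run;
```

This yields one standardized difference per age level and comparison, not a single value for age as a whole.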