I have 4 variables that have the same values array. I need to count the sum of frequencies of certain values in all 4 variables when all variables are equal "a".
For example:
I have variables A, B, C, D and they can have a,b,c,d,e,f values. I need to get the sum of frequencies "b", "c", "d", "f" over the A, B, C, D when A="a" and B="a" and C="a" and D="a".
Can anybody help?
Thank you.
Arthur solution is the correct methodology.
Step 1 - Identify patients who have alcohol use (flag=1 for alcohol use, flag=0 for no alcohol use)
Step 2 - Transpose data (proc transpose)
Step 3 - Run a proc freq on the transpose data- diagnosis by flag ( should probably also run a chi-sq to check if its different between patients who don't have alcohol abuse).
Do you want to create a variable or just do an analysis? In either case, create a format and apply it to all of the variables of concern.
I just wan to do the analysis. What to you mean by creating a format and apply it to all vb?
Take a look at: Base SAS(R) 9.2 Procedures Guide
You would just use 'a'=1
other=0
in creating the format, then apply it like they do in the example.
Sorry, but this does not do what I need...I need the sum of frequencies separately for "b", ,"c"...
I mean I need:
A="b"+ B="b"+ C="b" + D="b"
and
A="c"+ B="c"+ C="c" + D="c"
and
A="d"+ B="d"+ C="d" + D="d"
separately,
and sorry about my first statement it is actually when any of the variables A or B or C or D is ="a" (not all at the same time like I said above)
thank you so much!
Would something like the following do what you want?:
data have (drop=all);
input (a b c d) ($);
all=catt(a,b,c,d);
count_a=count(all,'a','i');
count_b=count(all,'b','i');
count_c=count(all,'c','i');
count_d=count(all,'d','i');
cards;
a a b a
b b b b
c b c a
d a b b
;
Tried and it doesn't work.
Maybe if I explain what exactly I am trying to calculate: I have data that contains medical records on patients and they have main diagnostic + 3 other diagnostics. these are my A, B, C, D variables. they have values from ICD-10. I want to find out how many cases of each of the major condition from ICD-10 happens ( frequencies for the b,c,d,e,....up to 100 conditions) when people have as one of the diagnostic alcohol use. I mean, I need to calculate the frequencies for the comorbidities of alcohol use, by geographic region.
I used in the past something very basic but it is for one condition at the time and that will imply re-running the same code for each comorbidity of the each medical condition of interest (couple of hundred times).
Arthur solution is the correct methodology.
Step 1 - Identify patients who have alcohol use (flag=1 for alcohol use, flag=0 for no alcohol use)
Step 2 - Transpose data (proc transpose)
Step 3 - Run a proc freq on the transpose data- diagnosis by flag ( should probably also run a chi-sq to check if its different between patients who don't have alcohol abuse).
i think will work this way...just need to do a little data prep b4.
thank you!
Here is one approach:
/* Just making some test data */ data have ; input (a b c d) ($); cards; a a a a a b c d b b b b c b c a c c c c d b b b c c c c ; /* Create an informat for the patterns you want to count */ proc format; invalue $allonechar 'aaaa'="a" 'bbbb'="b" 'cccc'="c" 'dddd'="d" 'eeee'="e" 'ffff'="f" other="*" ; run; /* Use the informat in PROC SQL to get your counts in a single pass */ proc sql; select INPUT(CATT(a,b,c,d),$allonechar.) "All Characters" as AllChar , Count(*) as Count from have group by AllChar having AllChar <> "*" order by AllChar ; quit;
Output:
A | 1 |
B | 1 |
C | 2 |
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.