I have 4 variables that have the same values array. I need to count the sum of frequencies of certain values in all 4 variables when all variables are equal "a".
For example:
I have variables A, B, C, D and they can have a,b,c,d,e,f values. I need to get the sum of frequencies "b", "c", "d", "f" over the A, B, C, D when A="a" and B="a" and C="a" and D="a".
Can anybody help?
Thank you.
Arthur solution is the correct methodology.
Step 1 - Identify patients who have alcohol use (flag=1 for alcohol use, flag=0 for no alcohol use)
Step 2 - Transpose data (proc transpose)
Step 3 - Run a proc freq on the transpose data- diagnosis by flag ( should probably also run a chi-sq to check if its different between patients who don't have alcohol abuse).
Do you want to create a variable or just do an analysis? In either case, create a format and apply it to all of the variables of concern.
I just wan to do the analysis. What to you mean by creating a format and apply it to all vb?
Take a look at: Base SAS(R) 9.2 Procedures Guide
You would just use 'a'=1
other=0
in creating the format, then apply it like they do in the example.
Sorry, but this does not do what I need...I need the sum of frequencies separately for "b", ,"c"...
I mean I need:
A="b"+ B="b"+ C="b" + D="b"
and
A="c"+ B="c"+ C="c" + D="c"
and
A="d"+ B="d"+ C="d" + D="d"
separately,
and sorry about my first statement it is actually when any of the variables A or B or C or D is ="a" (not all at the same time like I said above)
thank you so much!
Would something like the following do what you want?:
data have (drop=all);
input (a b c d) ($);
all=catt(a,b,c,d);
count_a=count(all,'a','i');
count_b=count(all,'b','i');
count_c=count(all,'c','i');
count_d=count(all,'d','i');
cards;
a a b a
b b b b
c b c a
d a b b
;
Tried and it doesn't work.
Maybe if I explain what exactly I am trying to calculate: I have data that contains medical records on patients and they have main diagnostic + 3 other diagnostics. these are my A, B, C, D variables. they have values from ICD-10. I want to find out how many cases of each of the major condition from ICD-10 happens ( frequencies for the b,c,d,e,....up to 100 conditions) when people have as one of the diagnostic alcohol use. I mean, I need to calculate the frequencies for the comorbidities of alcohol use, by geographic region.
I used in the past something very basic but it is for one condition at the time and that will imply re-running the same code for each comorbidity of the each medical condition of interest (couple of hundred times).
Arthur solution is the correct methodology.
Step 1 - Identify patients who have alcohol use (flag=1 for alcohol use, flag=0 for no alcohol use)
Step 2 - Transpose data (proc transpose)
Step 3 - Run a proc freq on the transpose data- diagnosis by flag ( should probably also run a chi-sq to check if its different between patients who don't have alcohol abuse).
i think will work this way...just need to do a little data prep b4.
thank you!
Here is one approach:
/* Just making some test data */ data have ; input (a b c d) ($); cards; a a a a a b c d b b b b c b c a c c c c d b b b c c c c ; /* Create an informat for the patterns you want to count */ proc format; invalue $allonechar 'aaaa'="a" 'bbbb'="b" 'cccc'="c" 'dddd'="d" 'eeee'="e" 'ffff'="f" other="*" ; run; /* Use the informat in PROC SQL to get your counts in a single pass */ proc sql; select INPUT(CATT(a,b,c,d),$allonechar.) "All Characters" as AllChar , Count(*) as Count from have group by AllChar having AllChar <> "*" order by AllChar ; quit;
Output:
A | 1 |
B | 1 |
C | 2 |
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.