BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
DMP
Calcite | Level 5 DMP
Calcite | Level 5

I have 4 variables that have the same values array. I need to count the sum of frequencies of certain values in all 4 variables when all variables are equal "a".

For example:

I have variables A, B, C, D and they can have  a,b,c,d,e,f  values. I need to get the sum of frequencies  "b", "c", "d", "f" over the A, B, C, D when A="a" and B="a" and C="a" and D="a".

Can anybody help?

Thank you.

1 ACCEPTED SOLUTION

Accepted Solutions
Reeza
Super User

Arthur solution is the correct methodology.

Step 1 - Identify patients who have alcohol use (flag=1 for alcohol use, flag=0 for no alcohol use)

Step 2 - Transpose data (proc transpose)

Step 3 - Run a proc freq on the transpose data- diagnosis by flag ( should probably also run a chi-sq to check if its different between patients who don't have alcohol abuse).

View solution in original post

10 REPLIES 10
art297
Opal | Level 21

Do you want to create a variable or just do an analysis?  In either case, create a format and apply it to all of the variables of concern.

DMP
Calcite | Level 5 DMP
Calcite | Level 5


I just wan to do the analysis. What to you mean by creating a format and apply it to all vb?

art297
Opal | Level 21

Take a look at: Base SAS(R) 9.2 Procedures Guide

You would just use 'a'=1

                           other=0

in creating the format, then apply it like they do in the example.

DMP
Calcite | Level 5 DMP
Calcite | Level 5

Sorry, but this does not do what I need...I need the sum of frequencies separately for "b", ,"c"...

I mean I need:

A="b"+ B="b"+ C="b" + D="b"

and

A="c"+ B="c"+ C="c" + D="c"

and

A="d"+ B="d"+ C="d" + D="d"

separately,

and sorry about my first statement it is actually when any of the variables A or B or C or D is ="a" (not all at the same time like I said above)

thank you so much!

art297
Opal | Level 21

Would something like the following do what you want?:

data have (drop=all);

  input (a b c d) ($);

  all=catt(a,b,c,d);

  count_a=count(all,'a','i');

  count_b=count(all,'b','i');

  count_c=count(all,'c','i');

  count_d=count(all,'d','i');

  cards;

a a b a

b b b b

c b c a

d a b b

;

DMP
Calcite | Level 5 DMP
Calcite | Level 5

Tried and it doesn't work.

Maybe if I explain what exactly I am trying to calculate: I have data that contains medical records on patients and they have main diagnostic + 3 other diagnostics. these are my A, B, C, D variables. they have values from ICD-10. I want to find out how many cases of each of the major condition from ICD-10 happens ( frequencies for the b,c,d,e,....up to 100 conditions) when people have as one of the diagnostic alcohol use. I mean, I need to calculate the frequencies for the comorbidities of alcohol use, by geographic region.

I used in the past something very basic but it is for one condition at the time and that will imply re-running the same code for each comorbidity of the each medical condition of interest (couple of hundred times).


art297
Opal | Level 21

: Then couldn't you use PROC TRANSPOSE to make the file tall (i.e., one record for each of the four codes, with all of the codes in the same column) and then just do a simple frequency distribution?

Reeza
Super User

Arthur solution is the correct methodology.

Step 1 - Identify patients who have alcohol use (flag=1 for alcohol use, flag=0 for no alcohol use)

Step 2 - Transpose data (proc transpose)

Step 3 - Run a proc freq on the transpose data- diagnosis by flag ( should probably also run a chi-sq to check if its different between patients who don't have alcohol abuse).

DMP
Calcite | Level 5 DMP
Calcite | Level 5

i think will work this way...just need to do a little data prep b4.

thank you!

SASJedi
SAS Super FREQ

Here is one approach:

/* Just making some test data */
data have ;
  input (a b c d) ($);
  cards;
a a a a
a b c d
b b b b
c b c a
c c c c
d b b b
c c c c
;
/* Create an informat for the patterns you want to count */
proc format;
  invalue $allonechar 
        'aaaa'="a"
        'bbbb'="b"
        'cccc'="c"
        'dddd'="d"
        'eeee'="e"
        'ffff'="f"
        other="*" 
  ;
run;
/* Use the informat in PROC SQL to get your counts in a single pass */
proc sql;
  select INPUT(CATT(a,b,c,d),$allonechar.) "All Characters" as AllChar
       , Count(*) as Count
    from have
    group by AllChar
    having AllChar <> "*"
    order by AllChar
    ;
quit;

Output:

The SAS System
A1
B1
C2

Check out my Jedi SAS Tricks for SAS Users

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 10 replies
  • 2052 views
  • 6 likes
  • 4 in conversation