DATA Step, Macro, Functions and more

Sumarizing Frequencies

Accepted Solution Solved
Reply
Occasional Contributor DMP
Occasional Contributor
Posts: 13
Accepted Solution

Sumarizing Frequencies

I have 4 variables that have the same values array. I need to count the sum of frequencies of certain values in all 4 variables when all variables are equal "a".

For example:

I have variables A, B, C, D and they can have  a,b,c,d,e,f  values. I need to get the sum of frequencies  "b", "c", "d", "f" over the A, B, C, D when A="a" and B="a" and C="a" and D="a".

Can anybody help?

Thank you.


Accepted Solutions
Solution
‎02-26-2013 12:05 PM
Super User
Posts: 19,772

Re: Sumarizing Frequencies

Arthur solution is the correct methodology.

Step 1 - Identify patients who have alcohol use (flag=1 for alcohol use, flag=0 for no alcohol use)

Step 2 - Transpose data (proc transpose)

Step 3 - Run a proc freq on the transpose data- diagnosis by flag ( should probably also run a chi-sq to check if its different between patients who don't have alcohol abuse).

View solution in original post


All Replies
PROC Star
Posts: 7,468

Re: Sumarizing Frequencies

Do you want to create a variable or just do an analysis?  In either case, create a format and apply it to all of the variables of concern.

Occasional Contributor DMP
Occasional Contributor
Posts: 13

Re: Sumarizing Frequencies


I just wan to do the analysis. What to you mean by creating a format and apply it to all vb?

PROC Star
Posts: 7,468

Re: Sumarizing Frequencies

Take a look at: Base SAS(R) 9.2 Procedures Guide

You would just use 'a'=1

                           other=0

in creating the format, then apply it like they do in the example.

Occasional Contributor DMP
Occasional Contributor
Posts: 13

Re: Sumarizing Frequencies

Sorry, but this does not do what I need...I need the sum of frequencies separately for "b", ,"c"...

I mean I need:

A="b"+ B="b"+ C="b" + D="b"

and

A="c"+ B="c"+ C="c" + D="c"

and

A="d"+ B="d"+ C="d" + D="d"

separately,

and sorry about my first statement it is actually when any of the variables A or B or C or D is ="a" (not all at the same time like I said above)

thank you so much!

PROC Star
Posts: 7,468

Re: Sumarizing Frequencies

Would something like the following do what you want?:

data have (drop=all);

  input (a b c d) ($);

  all=catt(a,b,c,d);

  count_a=count(all,'a','i');

  count_b=count(all,'b','i');

  count_c=count(all,'c','i');

  count_d=count(all,'d','i');

  cards;

a a b a

b b b b

c b c a

d a b b

;

Occasional Contributor DMP
Occasional Contributor
Posts: 13

Re: Sumarizing Frequencies

Tried and it doesn't work.

Maybe if I explain what exactly I am trying to calculate: I have data that contains medical records on patients and they have main diagnostic + 3 other diagnostics. these are my A, B, C, D variables. they have values from ICD-10. I want to find out how many cases of each of the major condition from ICD-10 happens ( frequencies for the b,c,d,e,....up to 100 conditions) when people have as one of the diagnostic alcohol use. I mean, I need to calculate the frequencies for the comorbidities of alcohol use, by geographic region.

I used in the past something very basic but it is for one condition at the time and that will imply re-running the same code for each comorbidity of the each medical condition of interest (couple of hundred times).


PROC Star
Posts: 7,468

Re: Sumarizing Frequencies

: Then couldn't you use PROC TRANSPOSE to make the file tall (i.e., one record for each of the four codes, with all of the codes in the same column) and then just do a simple frequency distribution?

Solution
‎02-26-2013 12:05 PM
Super User
Posts: 19,772

Re: Sumarizing Frequencies

Arthur solution is the correct methodology.

Step 1 - Identify patients who have alcohol use (flag=1 for alcohol use, flag=0 for no alcohol use)

Step 2 - Transpose data (proc transpose)

Step 3 - Run a proc freq on the transpose data- diagnosis by flag ( should probably also run a chi-sq to check if its different between patients who don't have alcohol abuse).

Occasional Contributor DMP
Occasional Contributor
Posts: 13

Re: Sumarizing Frequencies

i think will work this way...just need to do a little data prep b4.

thank you!

SAS Employee
Posts: 104

Re: Sumarizing Frequencies

Here is one approach:

/* Just making some test data */
data have ;
  input (a b c d) ($);
  cards;
a a a a
a b c d
b b b b
c b c a
c c c c
d b b b
c c c c
;
/* Create an informat for the patterns you want to count */
proc format;
  invalue $allonechar 
        'aaaa'="a"
        'bbbb'="b"
        'cccc'="c"
        'dddd'="d"
        'eeee'="e"
        'ffff'="f"
        other="*" 
  ;
run;
/* Use the informat in PROC SQL to get your counts in a single pass */
proc sql;
  select INPUT(CATT(a,b,c,d),$allonechar.) "All Characters" as AllChar
       , Count(*) as Count
    from have
    group by AllChar
    having AllChar <> "*"
    order by AllChar
    ;
quit;

Output:

The SAS System
A1
B1
C2

🔒 This topic is solved and locked.

Need further help from the community? Please ask a new question.

Discussion stats
  • 10 replies
  • 292 views
  • 6 likes
  • 4 in conversation