How to conditionally count observations of variable into several different new variables with sql

Accepted Solution Solved
Reply
Occasional Contributor
Posts: 17
Accepted Solution

How to conditionally count observations of variable into several different new variables with sql

Dear all,

 

In my dataset, I have a varialbe with the observation values between 0 and 1, I want to count how many observations of this variable is blow 0.2, how many between 0.2 and 0.5, how many over 0.5. Can I have this done in one sql procedure? If yes, How?

 

Thanks


Accepted Solutions
Solution
‎08-18-2016 02:55 AM
Super User
Posts: 18,997

Re: How to conditionally count observations of variable into several different new variables with

Select

sum(var<0.2) as lt_2,

sum(var>=0.2 and var <=0.5) as bet_2_5,

etc..

from have;

 

Yes you can. Sum the condition, if the condition is true it resolves to 1, 0 otherwise. 

 

View solution in original post


All Replies
Senior User
Posts: 1

Re: How to conditionally count observations of variable into several different new variables with

You could use

1.)the retain function in the data step and count the values using a series of if conditions or

2.)In proc sql ,probably use the Union option  between three individual queries to obtain  the counts with the help of group by statments

Solution
‎08-18-2016 02:55 AM
Super User
Posts: 18,997

Re: How to conditionally count observations of variable into several different new variables with

Select

sum(var<0.2) as lt_2,

sum(var>=0.2 and var <=0.5) as bet_2_5,

etc..

from have;

 

Yes you can. Sum the condition, if the condition is true it resolves to 1, 0 otherwise. 

 

Occasional Contributor
Posts: 17

Re: How to conditionally count observations of variable into several different new variables with

Thank you very much Reeza! It works!
Super User
Posts: 7,378

Re: How to conditionally count observations of variable into several different new variables with

 

data want (keep=count_a count_b count_c);
set have end=done;
retain count_a count_a count_c 0;
if var < 0.2 then count_a + 1;
else if var <= 0.5 then count_b + 1;
else count_c + 1;
if done then output;
run;

Another method is to create a format, so the formatted values can be used in a simple freq:

proc format library=work;
value testfmt
  low-<0.2 = 1
  0.2-0.5 = 2
  0.5<-high = 3
;
run;

data have;
input var;
format var testfmt.;
cards;
.1
.3
.6
.4
.5
.2
.7
;
run;

proc freq data=have order=internal;
tables var / nopercent nocum;
run;
---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 4 replies
  • 398 views
  • 5 likes
  • 4 in conversation