DATA Step, Macro, Functions and more

How to - Get Count of More than One Condition

Accepted Solution Solved
Reply
Super Contributor
Posts: 381
Accepted Solution

How to - Get Count of More than One Condition

Hello,

 

I have a sample data set as below. I want to perform more than one condition in one column. I think I need to use PROC SQL but I just can't handle it. My desired output is at the end of this message.

 

Can someone help me, please?

 

Data Have;
Length Date 8 Turnover 8 Customer_ID 8;
Infile Datalines Missover;
Input Date Turnover Customer_ID;
Format Date Date9.;
Datalines;
20393 0 001
20423 100 002
20454 1000 003
20485 100 004
20514 100 005
20545 10 006
20575 0 007
20636 1000 008
20667 100 009
20698 100 010
20728 10 011
20759 100 012
20789 1000 013
20820 1000 014
20851 0 015
20879 10 016
20910 100 017
20940 1000 018
20971 100 019
21001 10 020
;
Run;

Out.png

Thank you


Accepted Solutions
Solution
‎10-18-2016 03:20 AM
Super User
Super User
Posts: 7,432

Re: How to - Get Count of More than One Condition

Well, this works.  Basically createa category, then count by that:

proc sql;
  create table WANT as 
  select  CAT,
          count(distinct CUSTOMER_ID) as RESULT
  from    (select case  when TURNOVER = 0 then "Equal 0 ->"
                        when 0 < TURNOVER <=10 then "0> And <=10"
                        when 10 < TURNOVER <=100 then "10> And <=100"
                        when 100 < TURNOVER <=1000 then "100> And <=1000"
                        else "" end as CAT,
                   CUSTOMER_ID
           from    HAVE)
  group by CAT;
quit;

View solution in original post


All Replies
Super User
Posts: 17,963

Re: How to - Get Count of More than One Condition

Use a PROC FORMAT + FREQ. 

 

The first example in this paper is similar to your example. 

 

http://www2.sas.com/proceedings/sugi30/001-30.pdf

Solution
‎10-18-2016 03:20 AM
Super User
Super User
Posts: 7,432

Re: How to - Get Count of More than One Condition

Well, this works.  Basically createa category, then count by that:

proc sql;
  create table WANT as 
  select  CAT,
          count(distinct CUSTOMER_ID) as RESULT
  from    (select case  when TURNOVER = 0 then "Equal 0 ->"
                        when 0 < TURNOVER <=10 then "0> And <=10"
                        when 10 < TURNOVER <=100 then "10> And <=100"
                        when 100 < TURNOVER <=1000 then "100> And <=1000"
                        else "" end as CAT,
                   CUSTOMER_ID
           from    HAVE)
  group by CAT;
quit;
Super User
Posts: 6,982

Re: How to - Get Count of More than One Condition

Format and proc freq. You only need to dapt the proc format to make changes:

proc format;
value turnover
  0 = 'Equal 0'
  0 <- 10 = '0> and <=10'
  10 <- 100 = '10> and <=100'
  100 <- 1000 = '100> and <=1000'
;
run;

data have;
length date 8 turnover 8 customer_ID 8;
infile datalines missover;
input date turnover customer_ID;
format
  date date9.
  turnover turnover.
;
datalines;
;
run;

proc freq data=have;
tables turnover /out=want (drop=percent) nopercent nocum;
run;
---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
Super Contributor
Posts: 381

Re: How to - Get Count of More than One Condition

Thank you all of them,

 

I know I got my response but what if I want to add cumulative and percentage columns by using PROC SQL, how can I do this?

 

Here is my desired output,

 

Out2.png

 

Thank you

Super User
Posts: 17,963

Re: How to - Get Count of More than One Condition

Use PROC FREQ instead, it's part of the default output. Doing running totals is possible but a pain in SQL.
Super Contributor
Posts: 381

Re: How to - Get Count of More than One Condition

Yes, thank you very much but I just wonder how can I do it by using PROC SQL Smiley Happy

Super User
Super User
Posts: 7,432

Re: How to - Get Count of More than One Condition

Well, percentage is pretty easy as that is a calculation.  Cumulative however is not easy, in fact it is very difficult.  The reason is that SQL does not relate code to observations in a particular order.  So for percentage:

proc sql;
  create table WANT as 
  select  CAT,
          count(distinct CUSTOMER_ID) as RESULT,
(CALCULATED RESULT / (select count(disinct CUSTOMER_ID from HAVE)) * 100 as PCENT from (select case when TURNOVER = 0 then "Equal 0 ->" when 0 < TURNOVER <=10 then "0> And <=10" when 10 < TURNOVER <=100 then "10> And <=100" when 100 < TURNOVER <=1000 then "100> And <=1000" else "" end as CAT, CUSTOMER_ID from HAVE) group by CAT; quit;

And then for cumulative, I would just add a datastep after that, sorted the way you want, and retain the cumulative as you go.  Doing it in teh SQL, the simplest way would just be to select each of the four categories, then union all together - i.e. the above code would be multipled by four rather than having the cat and grouping by that. 

Super Contributor
Posts: 381

Re: How to - Get Count of More than One Condition

Thank you very much.

 

On the other hand, for Proc tabulate procedure which statement should I add for cumulative calculation column?

 

For example, following procedure->

 

PROC TABULATE DATA=HAVE;	
CLASS Turnover /ORDER=UNFORMATTED MISSING;
TABLE /* Row Dimension */Turnover,
/* Column Dimension */N ColPctN ;
RUN;

Thank you

Super User
Posts: 10,552

Re: How to - Get Count of More than One Condition

Proc tabulate does not do a running cumulative. It will do a TOTAL.

Super Contributor
Posts: 381

Re: How to - Get Count of More than One Condition

Okay, so here is my final question, what if I want to create following table which procedure should I use it? 

 

Des.png

 

Thank you

Super User
Posts: 17,963

Re: How to - Get Count of More than One Condition

PROC FREQ is 3 lines of code...

 

Your table is the definition of the output.

proc freq data=have noprint;
tables turnover /out=want outpct outcum ;
run;

proc print data=want label noobs;
run;

 

Super User
Posts: 6,982

Re: How to - Get Count of More than One Condition


turcay wrote:

Okay, so here is my final question, what if I want to create following table which procedure should I use it? 

 

Des.png

 

Thank you


Just omit the nopercent and nocum option in the tables statement, and remove the drop=percent dataset option. You then get the default output of proc freq.

 

PS you really need to develop a knack for reading the SAS documentation. Google "sas proc freq tables", and you get the correct page; all the options are thoroughly described there.

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 12 replies
  • 459 views
  • 4 likes
  • 5 in conversation