SAS Programming

kevsma · Posted 07-08-2021 06:10 PM

Hello folks, I need to compare multiple list of IDs by dates and record distinct count of IDs in proc sql union function. Below are my codes:

proc sql;
select count(distinct ID) as cnt
from(
select ID
from test(where=(DATE="2019/10"))
UNION 
select ID
from test(where=(DATE="2019/11"))
UNION
select ID
from test(where=(DATE="2019/12"))
UNION
select ID
from test(where=(DATE="2020/01"))
.../*with more dates*/
);
quit;

Basically, I only need to count distinct IDs every time I "union" another date if that makes sense. Currently, my code only spits out the total distinct count for all the dates I am combining, for instance, after running the code above, SAS will have a single number in the output. But I want to have the distinct count each time I am adding in a date, for example, starting with the distinct count of IDs when date = 2019/09, I will have another count when date=2019/10 was added in, another count when date=2019/11 added in... which goes on until my last date value=2020/05. Is there a solution for my question? Thanks so much!

kevsma · Posted 07-09-2021 02:25 PM

I fixed the issue, there should have a parenthesis after clear. Now the following codes seem to work perfectly:

proc sort data=want1; by VAR1 DATE; run;
data TEST; 
	if _N_=1 then do;
	dcl hash H(dataset:'want1(obs=0)');
	H.definekey('ID');
	H.definedone();
    end;
set want1;
  by VAR1 DATE;
  if first.VAR1 then H.clear();
  if first.DATE then sum=0;
  if H.check() then do;
  	sum + 1;
	H.add();
  end;
  if last.DATE then output;
  keep VAR1 DATE sum;
run;

View solution in original post

ballardw · Posted 07-08-2021 06:26 PM

I don't think it is quite clear how you expect the result to look.

Can you show us an example of how the result looks? Best is a small example of the starting data and then the result.

kevsma · Posted 07-09-2021 12:26 PM

Hi thank you and this's a great question, and I don't have a very clear sense as what SAS's output would look like. I am expecting to have SAS record the distinct count of IDs in a time-series fashion. One key information I forgot to mention is that the IDs by dates are accumulative, meaning at the starting point when date=2019/10, the IDs will all be included when the date moves onto the next month 2019/11. For example, if there are 20 distinct IDs when date=2019/10, SAS can create a variable that captures this number when date=2019/10; and when date=2019/11, there are 15 distinct IDs, but 13 of the 15 IDs are not distinct from the IDs when date=2019/10, only 2 IDs are newly added, so I want SAS to capture 17 as the total distinct count of IDs when date=2019/11. Hoping i am explaining this clearly...

ChrisNZ · Posted 07-08-2021 08:48 PM

Like this?

select DATE, count(distinct ID) as CNT
from test where DATE in("2019/10","2019/11","2019/12")
group by DATE

High-Performance SAS Coding - Third Edition

kevsma · Posted 07-09-2021 12:30 PM

Thanks for your reply. Your method will give me the distinct count of IDs by date but this is not what I want exactly. What I want is the distinct count of IDs accumulatively if that makes sense. So whatever the IDs have been counted in a previous dates will be counted ONLY once and only the newly added IDs by dates will be added to the count. This requires a comparison of IDs between two adjacent dates I believe. Let me know whether I explain this clearly.

kevsma · Posted 07-09-2021 01:13 PM

I found an old discussion thread @ChrisNZ you was involved which answers my question perfectly!!!

https://communities.sas.com/t5/SAS-Programming/SAS-cumulative-count-by-unique-ID-and-date/td-p/65304...

if you don't mind, may I ask how to set up a looping macro based on following your code? Say if I have to run your following code based on another variable (VAR1) in the HAVE dataset. Basically I need to run your following code multiple times when VAR1=1,2,3,4,5...

Is there a way to do that?

data WANT;
  if _N_=1 then do;
   dcl hash H(dataset:'HAVE(obs=0)');
   H.definekey('ID');
   H.definedone();
  end;
  set HAVE;
  by DATE;    
  if first.DATE then SUM=0;
  if H.check() then do;
    SUM+1;
    H.add();
  end;
  if last.DATE then output;
  keep DATE SUM;
run;

PGStats · Posted 07-09-2021 02:04 PM

This, maybe?

data WANT;
  if _N_=1 then do;
   dcl hash H(dataset:'HAVE(obs=0)');
   H.definekey('ID');
   H.definedone();
  end;
  set HAVE;
  by VAR1 DATE;    
  if first.VAR1 then H.clear;
  if first.DATE then SUM=0;
  if H.check() then do;
    SUM+1;
    H.add();
  end;
  if last.DATE then output;
  keep VAR1 DATE SUM;
run;

PG

kevsma · Posted 07-09-2021 02:19 PM

Thank you for the reply! I ran the code but SAS spit out error messages, any ideas?

669    if first.VAR1 then H.clear;
                          --------
                          559    22
                                 76
ERROR 559-185: Invalid object attribute reference H.clear.

ERROR 22-322: Syntax error, expecting one of the following: (, +, =.

ERROR 76-322: Syntax error, statement will be ignored.

670    if first.DATE then sum=0;
671    if H.check() then do;
ERROR: DATA STEP Component Object failure.  Aborted during the COMPILATION phase.

kevsma · Posted 07-09-2021 02:25 PM

I fixed the issue, there should have a parenthesis after clear. Now the following codes seem to work perfectly:

proc sort data=want1; by VAR1 DATE; run;
data TEST; 
	if _N_=1 then do;
	dcl hash H(dataset:'want1(obs=0)');
	H.definekey('ID');
	H.definedone();
    end;
set want1;
  by VAR1 DATE;
  if first.VAR1 then H.clear();
  if first.DATE then sum=0;
  if H.check() then do;
  	sum + 1;
	H.add();
  end;
  if last.DATE then output;
  keep VAR1 DATE sum;
run;

SAS Programming

Record distinct count in proc sql union

Re: Record distinct count in proc sql union

Re: Record distinct count in proc sql union

Re: Record distinct count in proc sql union

Re: Record distinct count in proc sql union

Re: Record distinct count in proc sql union

Re: Record distinct count in proc sql union

Re: Record distinct count in proc sql union

Re: Record distinct count in proc sql union

Re: Record distinct count in proc sql union

Follow Us

What is...

SAS Programming

Our biggest data and AI event of the year.

SAS Training: Just a Click Away

Follow Us

What is...