BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
kevsma
Quartz | Level 8

Hello folks, I need to compare multiple list of IDs by dates and record distinct count of IDs in proc sql union function. Below are my codes:

proc sql;
select count(distinct ID) as cnt
from(
select ID
from test(where=(DATE="2019/10"))
UNION 
select ID
from test(where=(DATE="2019/11"))
UNION
select ID
from test(where=(DATE="2019/12"))
UNION
select ID
from test(where=(DATE="2020/01"))
.../*with more dates*/
);
quit;

Basically, I only need to count distinct IDs every time I "union" another date if that makes sense. Currently, my code only spits out the total distinct count for all the dates I am combining, for instance, after running the code above, SAS will have a single number in the output. But I want to have the distinct count each time I am adding in a date, for example, starting with the distinct count of IDs when date = 2019/09, I will have another count when date=2019/10 was added in, another count when date=2019/11 added in... which goes on until my last date value=2020/05.  Is there a solution for my question? Thanks so much!

1 ACCEPTED SOLUTION

Accepted Solutions
kevsma
Quartz | Level 8

I fixed the issue, there should have a parenthesis after clear. Now the following codes seem to work perfectly:

proc sort data=want1; by VAR1 DATE; run;
data TEST; 
	if _N_=1 then do;
	dcl hash H(dataset:'want1(obs=0)');
	H.definekey('ID');
	H.definedone();
    end;
set want1;
  by VAR1 DATE;
  if first.VAR1 then H.clear();
  if first.DATE then sum=0;
  if H.check() then do;
  	sum + 1;
	H.add();
  end;
  if last.DATE then output;
  keep VAR1 DATE sum;
run;

View solution in original post

8 REPLIES 8
ballardw
Super User

I don't think it is quite clear how you expect the result to look.

Can you show us an example of how the result looks? Best is a small example of the starting data and then the result.

kevsma
Quartz | Level 8

Hi thank you and this's a great question, and I don't have a very clear sense as what SAS's output would look like. I am expecting to have SAS record the distinct count of IDs in a time-series fashion. One key information I forgot to mention is that the IDs by dates are accumulative, meaning at the starting point when date=2019/10, the IDs will all be included when the date moves onto the next month 2019/11. For example, if there are 20 distinct IDs when date=2019/10, SAS can create a variable that captures this number when date=2019/10; and when date=2019/11, there are 15 distinct IDs, but 13 of the 15 IDs are not distinct from the IDs when date=2019/10, only 2 IDs are newly added, so I want SAS to capture 17 as the total distinct count of IDs when date=2019/11. Hoping i am explaining this clearly... 

ChrisNZ
Tourmaline | Level 20

Like this?

select DATE, count(distinct ID) as CNT
from test where DATE in("2019/10","2019/11","2019/12")
group by DATE

 

kevsma
Quartz | Level 8

Thanks for your reply. Your method will give me the distinct count of IDs by date but this is not what I want exactly. What I want is the distinct count of IDs accumulatively if that makes sense. So whatever the IDs have been counted in a previous dates will be counted ONLY once and only the newly added IDs by dates will be added to the count. This requires a comparison of IDs between two adjacent dates I believe. Let me know whether I explain this clearly. 

kevsma
Quartz | Level 8

I found an old discussion thread @ChrisNZ you was involved which answers my question perfectly!!! 

https://communities.sas.com/t5/SAS-Programming/SAS-cumulative-count-by-unique-ID-and-date/td-p/65304... 

if you don't mind, may I ask how to set up a looping macro based on following your code? Say if I have to run your following code based on another variable (VAR1) in the HAVE dataset. Basically I need to run your following code multiple times when VAR1=1,2,3,4,5...

Is there a way to do that?

data WANT;
  if _N_=1 then do;
   dcl hash H(dataset:'HAVE(obs=0)');
   H.definekey('ID');
   H.definedone();
  end;
  set HAVE;
  by DATE;    
  if first.DATE then SUM=0;
  if H.check() then do;
    SUM+1;
    H.add();
  end;
  if last.DATE then output;
  keep DATE SUM;
run;
PGStats
Opal | Level 21

This, maybe?

 

data WANT;
  if _N_=1 then do;
   dcl hash H(dataset:'HAVE(obs=0)');
   H.definekey('ID');
   H.definedone();
  end;
  set HAVE;
  by VAR1 DATE;    
  if first.VAR1 then H.clear;
  if first.DATE then SUM=0;
  if H.check() then do;
    SUM+1;
    H.add();
  end;
  if last.DATE then output;
  keep VAR1 DATE SUM;
run;
PG
kevsma
Quartz | Level 8

Thank you for the reply! I ran the code but SAS spit out error messages, any ideas?

669    if first.VAR1 then H.clear;
                          --------
                          559    22
                                 76
ERROR 559-185: Invalid object attribute reference H.clear.

ERROR 22-322: Syntax error, expecting one of the following: (, +, =.

ERROR 76-322: Syntax error, statement will be ignored.

670    if first.DATE then sum=0;
671    if H.check() then do;
ERROR: DATA STEP Component Object failure.  Aborted during the COMPILATION phase.

 

kevsma
Quartz | Level 8

I fixed the issue, there should have a parenthesis after clear. Now the following codes seem to work perfectly:

proc sort data=want1; by VAR1 DATE; run;
data TEST; 
	if _N_=1 then do;
	dcl hash H(dataset:'want1(obs=0)');
	H.definekey('ID');
	H.definedone();
    end;
set want1;
  by VAR1 DATE;
  if first.VAR1 then H.clear();
  if first.DATE then sum=0;
  if H.check() then do;
  	sum + 1;
	H.add();
  end;
  if last.DATE then output;
  keep VAR1 DATE sum;
run;

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 8 replies
  • 968 views
  • 0 likes
  • 4 in conversation