Solved: Running count by group through time

Navyblue222 · Posted 05-13-2022 02:58 PM

Hello. I have a dataset looks like the one provided below and need to count (not distinct) values of id2 by id1 through time (runningcount is the variable that I want). I have read the existing discussion on this topic (https://communities.sas.com/t5/SAS-Programming/Running-count-distinct-by-group-through-time/td-p/438...) and the 2 solutions works well for the original question. However, I need to restart the count when id2 changes (while keeping time in descending order). Data is sorted by id1 and time. Thank you in advance. @Ksharp

id1 id2 time runningcount
1 A 20131128 1
1 A 20140530 2
1 A 20150217 3
1 B 20150410 1
1 B 20160612 2
1 B 20180101 3
1 C 20190101 1
2 A 19900515 1
2 A 19910522 2
2 B 19920101 1
2 B 19930321 2

mkeintz · Posted 05-13-2022 03:05 PM

You essentially want a consecutive sequence number, reset to 1 each time a new ID2 is encountered.

If the data are sorted by ID1/ID2, then

data want;
  set have;
  by id1 id2;
  if first.id2 then running_count=1;
  else running_count+1;
run;

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

View solution in original post

mkeintz · Posted 05-13-2022 03:05 PM

You essentially want a consecutive sequence number, reset to 1 each time a new ID2 is encountered.

If the data are sorted by ID1/ID2, then

data want;
  set have;
  by id1 id2;
  if first.id2 then running_count=1;
  else running_count+1;
run;

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

Navyblue222 · Posted 05-13-2022 04:09 PM

Thank you very much @mkeintz for the quick help! Yes your solutions works well. I was stuck on the issue that I had to sort the dataset by id1 and id2 before counting, which would mess up the time sequence. It is easily solved by sorting the dataset by id1, id2 & time.

Running count by group through time

Re: Running count by group through time

Re: Running count by group through time

Re: Running count by group through time

Catch up on SAS Innovate 2026

Running count by group through time

Re: Running count by group through time

Re: Running count by group through time

Re: Running count by group through time

Catch up on SAS Innovate 2026

SAS Training: Just a Click Away