BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
Navyblue222
Calcite | Level 5

Hello. I have a dataset looks like the one provided below and need to count (not distinct) values of id2  by id1 through time (runningcount is the variable that I want). I have read the existing discussion on this topic (https://communities.sas.com/t5/SAS-Programming/Running-count-distinct-by-group-through-time/td-p/438...) and the 2 solutions works well for the original question. However, I need to restart the count when id2 changes (while keeping time in descending order). Data is sorted by id1 and time. Thank you in advance.  @Ksharp 

 

id1 id2 time runningcount
1 A 20131128 1
1 A 20140530 2
1 A 20150217 3
1 B 20150410 1
1 B 20160612 2
1 B 20180101 3
1 C 20190101 1
2 A 19900515 1
2 A 19910522 2
2 B 19920101 1
2 B 19930321 2

1 ACCEPTED SOLUTION

Accepted Solutions
mkeintz
PROC Star

You essentially want a consecutive sequence number, reset to 1 each time a new ID2 is encountered.

 

If the data are sorted by ID1/ID2, then 

data want;
  set have;
  by id1 id2;
  if first.id2 then running_count=1;
  else running_count+1;
run;

 

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

View solution in original post

2 REPLIES 2
mkeintz
PROC Star

You essentially want a consecutive sequence number, reset to 1 each time a new ID2 is encountered.

 

If the data are sorted by ID1/ID2, then 

data want;
  set have;
  by id1 id2;
  if first.id2 then running_count=1;
  else running_count+1;
run;

 

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
Navyblue222
Calcite | Level 5

Thank you very much @mkeintz for the quick help! Yes your solutions works well. I was stuck on the issue that I had to sort the dataset by id1 and id2 before counting, which would mess up the time sequence. It is easily solved by sorting the dataset by id1, id2 & time. 

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 2 replies
  • 2712 views
  • 1 like
  • 2 in conversation