lezgin
Obsidian | Level 7

Hi, I have the dataset below. I need a running count of the distinct values of id2 within each id1 over time; the output I want is the runningcount1 column. I also need the same count restricted to the last 360 days: for each row, look back 360 days and count the distinct values of id2 for that id1. If fewer than 360 days of history exist, the count should cover however many days are available. The output I want is the runningcount2 column. The data is sorted by id1 and time. Thank you very much.

 

id1 id2 time   runningcount1 runningcount2
1    A  20131128   1 1
1    B  20140214   2 2
1    C  20140530   3 3
1    C  20140622   3 3
1    D  20140831   4 4
1    D  20141220   4 3
1    A  20150217   4 3
1    A  20150302   4 3
1    C  20150410   4 3
1    D  20150425   4 3
1    E  20150609   5 4
1    F  20151025   6 5
1    C  20160612   6 2
1    D  20180101   6 1
2    A  19900515   1 1
2    B  19900813   2 2
2    E  19910522   3 2
2    A  19910524   3 3
2    F  19910919   4 3
2    G  19920101   5 4
2    A  19930321   5 1

Accepted Solutions
s_lassen
Meteorite | Level 14

You can do it with a hash table, e.g.:

data have;
  input id1 @6 id2 $ @9 time yymmdd8.;
  format time date9.;
cards;
1    A  20131128
1    B  20140214
1    C  20140530
1    C  20140622
1    D  20140831
1    D  20141220
1    A  20150217
1    A  20150302
1    C  20150410
1    D  20150425
1    E  20150609
1    F  20151025
1    C  20160612
1    D  20180101
2    A  19900515
2    B  19900813
2    E  19910522
2    A  19910524
2    F  19910919
2    G  19920101
2    A  19930321
;run;



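data want;
  set have;
  by id1;
  if first.id1 then do;
    if _N_=1 then do;
      /* create the hash table once: key id2, data = date id2 was last seen */
      declare hash h();
      rc=h.definedata('lasttime');
      rc=h.definekey('id2');
      h.definedone();
      declare hiter iter('h');
      end;
    else
      h.clear();  /* new id1 group: start with an empty table */
    end;
  rc=h.find();    /* rc=0 if this id2 has already been seen for this id1 */
  lasttime=time;
  if rc then
    h.add();
  else
    h.replace();
  runningcount1=h.num_items;  /* distinct id2 values so far */
  rc=iter.first();
  /* window start: one calendar year back; use time-360 for exactly 360 days */
  starttime=intnx('year',time,-1,'same');
  runningcount2=0;
  /* count the id2 values last seen inside the window */
  do until(iter.next());
    if lasttime>=starttime then
      runningcount2=runningcount2+1;
    end;
  keep id1 id2 time runningcount1 runningcount2;
run;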
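
Note that the step above looks back one calendar year (365 or 366 days) and not exactly 360 days. On the sample data both windows should give the same counts, because no two dates within an id1 group are 360 to 366 days apart, but they can differ on other data. A minimal sketch of the same hash approach with an exact 360-day window might look like the following. The output name want360 is hypothetical, and the single replace() call is a simplification that assumes replace() adds the key when it is not yet in the table:

```sas
/* Sketch only: want360 is a hypothetical name; have is the dataset created above. */
data want360;
  set have;
  by id1;
  if _N_=1 then do;
    declare hash h();
    h.definekey('id2');
    h.definedata('lasttime');
    h.definedone();
    declare hiter iter('h');
  end;
  if first.id1 then h.clear();   /* start each id1 group with an empty table */
  lasttime=time;
  h.replace();                   /* adds id2 if new, otherwise updates its lasttime */
  runningcount1=h.num_items;     /* distinct id2 values so far within id1 */
  starttime=time-360;            /* SAS dates are day counts, so this is 360 days back */
  runningcount2=0;
  rc=iter.first();
  do while(rc=0);                /* walk every id2 stored for this id1 */
    if lasttime>=starttime then runningcount2+1;
    rc=iter.next();
  end;
  keep id1 id2 time runningcount1 runningcount2;
run;
```

Because the iterator overwrites lasttime with each stored value, lasttime is reset from time on every row and is left out of the KEEP list.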


4 REPLIES
andreas_lds
Jade | Level 19

Please post the contents of the Excel file as a data step using a DATALINES statement, so that we don't have to guess what the data looked like when you imported the file.

lezgin
Obsidian | Level 7

Thank you very much s_lassen. This was very helpful.

Ksharp
Super User
data have;
  input id1 @6 id2 $ @9 time yymmdd8.;
  format time date9.;
cards;
1    A  20131128
1    B  20140214
1    C  20140530
1    C  20140622
1    D  20140831
1    D  20141220
1    A  20150217
1    A  20150302
1    C  20150410
1    D  20150425
1    E  20150609
1    F  20151025
1    C  20160612
1    D  20180101
2    A  19900515
2    B  19900813
2    E  19910522
2    A  19910524
2    F  19910919
2    G  19920101
2    A  19930321
;run;
proc sql;
  select *,
         (select count(distinct id2) from have
            where id1=a.id1 and time le a.time) as count1,
         (select count(distinct id2) from have
            where id1=a.id1 and time between a.time-360 and a.time) as count2
    from have as a;
quit;
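
To check the SQL counts against the hash-table output, one option is to store them in a table and run PROC COMPARE. This is only a sketch: the table name want_sql is hypothetical, the columns are renamed to runningcount1 and runningcount2 so they line up with the want dataset from the accepted solution, and the ORDER BY assumes no two rows share the same id1 and time (true for the sample data).

```sas
/* Sketch only: want_sql is a hypothetical table name. */
proc sql;
  create table want_sql as
  select a.*,
         (select count(distinct b.id2) from have as b
            where b.id1=a.id1 and b.time le a.time) as runningcount1,
         (select count(distinct b.id2) from have as b
            where b.id1=a.id1 and b.time between a.time-360 and a.time) as runningcount2
    from have as a
    order by a.id1, a.time;
quit;

/* want is the output of the hash data step in the accepted solution */
proc compare base=want compare=want_sql;
run;
```

If the two approaches agree, PROC COMPARE should report no unequal values. Keep in mind that the correlated subqueries rescan have for every row, so on large data the hash approach is likely to be faster.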
