BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Antoine44
Fluorite | Level 6

 

Hi everyone, I've got another question.

 

Here's what I have 

 

   ID                     date                event-date

10158            07FEB2014        22SEP2014

10158            10FEB2014        22SEP2014

10158            17MAY2016       12FEB2016 
10158            18MAY2016       12FEB2016

20154            10APR2017        25SEP2017

20154            11APR2017        25SEP2017
20154            12APR2017        25SEP2017

 

And here's what I want.

 

   ID                    date                  evtdate           count

10158            07FEB2014        22SEP2014           1

10158            10FEB2014        22SEP2014           2

10158            17MAY2016       12FEB2016            1
10158            18MAY2016       12FEB2016            2

20154            10APR2017        25SEP2017           1

20154            11APR2017        25SEP2017           2
20154            12APR2017        25SEP2017           3

 

Basically I'd like to count the number of dates for each event date by ID. And if the number of dates for each event date is less than 120 days then I want to delete the ID. 

 

Thanks!

                     

1 ACCEPTED SOLUTION

Accepted Solutions
PeterClemmensen
Tourmaline | Level 20

This method does not require your data to be sorted in any way

 

data have;
input ID (date eventdate)(:date9.);
format date eventdate date9.;
datalines;
10158 07FEB2014 22SEP2014
10158 10FEB2014 22SEP2014
10158 17MAY2016 12FEB2016 
10158 18MAY2016 12FEB2016
20154 10APR2017 25SEP2017
20154 11APR2017 25SEP2017
20154 12APR2017 25SEP2017
;

data want;
  if _n_=1 then do;
    dcl hash h();
    h.defineKey ("ID", "eventdate");
    h.defineData ("count");
    h.defineDone ();
  end;
 
  set have;

  if h.find() ne 0 then count = 1;
  else                  count + 1;

  h.replace();
run;

 

Result:

 

ID      date        eventdate   count
10158   07FEB2014   22SEP2014   1
10158   10FEB2014   22SEP2014   2
10158   17MAY2016   12FEB2016   1
10158   18MAY2016   12FEB2016   2
20154   10APR2017   25SEP2017   1
20154   11APR2017   25SEP2017   2
20154   12APR2017   25SEP2017   3

 

View solution in original post

11 REPLIES 11
PeterClemmensen
Tourmaline | Level 20

This method does not require your data to be sorted in any way

 

data have;
input ID (date eventdate)(:date9.);
format date eventdate date9.;
datalines;
10158 07FEB2014 22SEP2014
10158 10FEB2014 22SEP2014
10158 17MAY2016 12FEB2016 
10158 18MAY2016 12FEB2016
20154 10APR2017 25SEP2017
20154 11APR2017 25SEP2017
20154 12APR2017 25SEP2017
;

data want;
  if _n_=1 then do;
    dcl hash h();
    h.defineKey ("ID", "eventdate");
    h.defineData ("count");
    h.defineDone ();
  end;
 
  set have;

  if h.find() ne 0 then count = 1;
  else                  count + 1;

  h.replace();
run;

 

Result:

 

ID      date        eventdate   count
10158   07FEB2014   22SEP2014   1
10158   10FEB2014   22SEP2014   2
10158   17MAY2016   12FEB2016   1
10158   18MAY2016   12FEB2016   2
20154   10APR2017   25SEP2017   1
20154   11APR2017   25SEP2017   2
20154   12APR2017   25SEP2017   3

 

Antoine44
Fluorite | Level 6

Yep that works. Thanks a lot, I really appreciate it!

 

What do you think I should do if I want to get rid of the ID that have less than 120 for a given event date? 

PeterClemmensen
Tourmaline | Level 20

No problem. You want to exclude an ID if it for any date has less than 120 observations, correct? So if it has 1 obs for some date and 121 obs for another date, you do not want to include the ID at all, correct?

Antoine44
Fluorite | Level 6

Alright so I want to keep the ID if it has at least 120 observation for a given event date. 

 

So if one ID has 1 obs for one date but 121 obs for another date I only want to delete it where it has 1 obs but keep it where it has 121 (or more).

 

Is it clearer?

 

 

PeterClemmensen
Tourmaline | Level 20

Much clearer. Also, you still want the count variable in there, correct?

Antoine44
Fluorite | Level 6

Yes that's right 🙂

PeterClemmensen
Tourmaline | Level 20

You could just run this after the code I posted

 

proc sql;
    create table want as
    select * from want
    group by ID, eventdate
    having max(count) ge 121
    order by ID, eventdate, count;
quit;
Antoine44
Fluorite | Level 6

Excellent that works 🙂 

 

Thanks so much for you help!

Kurt_Bremser
Super User

@Antoine44 wrote:

Yep that works. Thanks a lot, I really appreciate it!

 

What do you think I should do if I want to get rid of the ID that have less than 120 for a given event date? 


Add this to @PeterClemmensen's code:

proc sql;
create table check as
select eventdate, max(count) as maxcount
from want
group by eventdate
having maxcount >= 120;
quit;

data real_want;
if _n_ = 1
then do;
  declare hash h(dataset:"check (keep=eventdate)");
  h.definekey("eventdate");
  h.definedone();
end;
set want;
if h.find() = 0;
run;
novinosrin
Tourmaline | Level 20

Your data looks sorted?

 

data have;
input ID (date eventdate)(:date9.);
format date eventdate date9.;
datalines;
10158 07FEB2014 22SEP2014
10158 10FEB2014 22SEP2014
10158 17MAY2016 12FEB2016 
10158 18MAY2016 12FEB2016
20154 10APR2017 25SEP2017
20154 11APR2017 25SEP2017
20154 12APR2017 25SEP2017
;

data want;
set have;
by id eventdate;
if first.eventdate then count=1;
else count+1;
run;

 

 

Antoine44
Fluorite | Level 6

yep my data is sorted so that works too 🙂

 

Thanks!

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

Mastering the WHERE Clause in PROC SQL

SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 11 replies
  • 1342 views
  • 2 likes
  • 4 in conversation