Hi everyone, I've got another question.
Here's what I have
ID date event-date
10158 07FEB2014 22SEP2014
10158 10FEB2014 22SEP2014
10158 17MAY2016 12FEB2016
10158 18MAY2016 12FEB2016
20154 10APR2017 25SEP2017
20154 11APR2017 25SEP2017
20154 12APR2017 25SEP2017
And here's what I want.
ID date evtdate count
10158 07FEB2014 22SEP2014 1
10158 10FEB2014 22SEP2014 2
10158 17MAY2016 12FEB2016 1
10158 18MAY2016 12FEB2016 2
20154 10APR2017 25SEP2017 1
20154 11APR2017 25SEP2017 2
20154 12APR2017 25SEP2017 3
Basically I'd like to count the number of dates for each event date by ID. And if the number of dates for each event date is less than 120 days then I want to delete the ID.
Thanks!
This method does not require your data to be sorted in any way
data have;
input ID (date eventdate)(:date9.);
format date eventdate date9.;
datalines;
10158 07FEB2014 22SEP2014
10158 10FEB2014 22SEP2014
10158 17MAY2016 12FEB2016
10158 18MAY2016 12FEB2016
20154 10APR2017 25SEP2017
20154 11APR2017 25SEP2017
20154 12APR2017 25SEP2017
;
data want;
if _n_=1 then do;
dcl hash h();
h.defineKey ("ID", "eventdate");
h.defineData ("count");
h.defineDone ();
end;
set have;
if h.find() ne 0 then count = 1;
else count + 1;
h.replace();
run;
Result:
ID date eventdate count 10158 07FEB2014 22SEP2014 1 10158 10FEB2014 22SEP2014 2 10158 17MAY2016 12FEB2016 1 10158 18MAY2016 12FEB2016 2 20154 10APR2017 25SEP2017 1 20154 11APR2017 25SEP2017 2 20154 12APR2017 25SEP2017 3
This method does not require your data to be sorted in any way
data have;
input ID (date eventdate)(:date9.);
format date eventdate date9.;
datalines;
10158 07FEB2014 22SEP2014
10158 10FEB2014 22SEP2014
10158 17MAY2016 12FEB2016
10158 18MAY2016 12FEB2016
20154 10APR2017 25SEP2017
20154 11APR2017 25SEP2017
20154 12APR2017 25SEP2017
;
data want;
if _n_=1 then do;
dcl hash h();
h.defineKey ("ID", "eventdate");
h.defineData ("count");
h.defineDone ();
end;
set have;
if h.find() ne 0 then count = 1;
else count + 1;
h.replace();
run;
Result:
ID date eventdate count 10158 07FEB2014 22SEP2014 1 10158 10FEB2014 22SEP2014 2 10158 17MAY2016 12FEB2016 1 10158 18MAY2016 12FEB2016 2 20154 10APR2017 25SEP2017 1 20154 11APR2017 25SEP2017 2 20154 12APR2017 25SEP2017 3
Yep that works. Thanks a lot, I really appreciate it!
What do you think I should do if I want to get rid of the ID that have less than 120 for a given event date?
No problem. You want to exclude an ID if it for any date has less than 120 observations, correct? So if it has 1 obs for some date and 121 obs for another date, you do not want to include the ID at all, correct?
Alright so I want to keep the ID if it has at least 120 observation for a given event date.
So if one ID has 1 obs for one date but 121 obs for another date I only want to delete it where it has 1 obs but keep it where it has 121 (or more).
Is it clearer?
Much clearer. Also, you still want the count variable in there, correct?
Yes that's right 🙂
You could just run this after the code I posted
proc sql;
create table want as
select * from want
group by ID, eventdate
having max(count) ge 121
order by ID, eventdate, count;
quit;
Excellent that works 🙂
Thanks so much for you help!
@Antoine44 wrote:
Yep that works. Thanks a lot, I really appreciate it!
What do you think I should do if I want to get rid of the ID that have less than 120 for a given event date?
Add this to @PeterClemmensen's code:
proc sql;
create table check as
select eventdate, max(count) as maxcount
from want
group by eventdate
having maxcount >= 120;
quit;
data real_want;
if _n_ = 1
then do;
declare hash h(dataset:"check (keep=eventdate)");
h.definekey("eventdate");
h.definedone();
end;
set want;
if h.find() = 0;
run;
Your data looks sorted?
data have;
input ID (date eventdate)(:date9.);
format date eventdate date9.;
datalines;
10158 07FEB2014 22SEP2014
10158 10FEB2014 22SEP2014
10158 17MAY2016 12FEB2016
10158 18MAY2016 12FEB2016
20154 10APR2017 25SEP2017
20154 11APR2017 25SEP2017
20154 12APR2017 25SEP2017
;
data want;
set have;
by id eventdate;
if first.eventdate then count=1;
else count+1;
run;
yep my data is sorted so that works too 🙂
Thanks!
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.
Find more tutorials on the SAS Users YouTube channel.