Hello. I am currently trying to figure out how to delete cases which have events outside of a given year
Each caseID has several events, with timestamps. I want to create a new table with only the cases which is within a year.
proc sql;
create table test2.withinyear AS
select *
from test2.alldata
where event_time_timestamp between "01JAN18:00:00:00"dt and "31DEC18:00:00:00"dt
group by case_purchasing_document
;
quit;
This does not have the effect I want, as SAS rather sorts case_purchasing_document, meaning events outside the timehorizon is deleted, which is very worrisome. Does anyone have an idea to how I can create a query in which I only select cases within 2018? Or cases starting and ending in the same year (first and last timestamp in the same year)
Thanks!
Do you want to keep observations where either of those conditions are true?
Yes, I want to keep the observations where the "between" condition is true
@Fantastisk wrote:
Hello. I am currently trying to figure out how to delete cases which have events outside of a given year
Each caseID has several events, with timestamps. I want to create a new table with only the cases which is within a year.
proc sql; create table test2.withinyear AS select * from test2.alldata where event_time_timestamp between "01JAN18:00:00:00"dt and "31DEC18:00:00:00"dt group by case_purchasing_document ; quit;
This does not have the effect I want, as SAS rather sorts case_purchasing_document, meaning events outside the timehorizon is deleted, which is very worrisome. Does anyone have an idea to how I can create a query in which I only select cases within 2018? Or cases starting and ending in the same year (first and last timestamp in the same year)
Thanks!
SAS did not "delete" anything, the WHERE clause specifies which records to KEEP. If you want to keep the other records you need to be a bit more specific about what you want.
Instead of "31DEC18:00:00:00"dt you may have intended "31DEC18:23:59:59"
^ stops at midnight of 30Dec
Or
where year(datepart(event_time_timestamp))= 2018
It sounds like you need to specify some input data and then show what the result might be as your description and "concern" appear to be in conflict.
You are right. It copied the events within 2018, but left out the events after and before. The other records are still stored in the other table.
each of the caseIDs have events, every event have a timestamp. Some events might start in 2017, and end in 2018, some start and end in 2018. I want to pick out only those starting and ending in same year, 2018, and store them in a new table
EDIT: Running the following code did only result in not selecting the events outside of 2018, the case is kept with less activities than before, when I want that case not to be kept as it has an event outside of 2018.
where year(datepart(event_time_timestamp))= 2018
this did run, but did not select any data.
I get the error
Solution:
Proc sql;
create table test2.withinyear AS
select *
from test2.allyears
group by case_concept_name
having sum(event_time_timestamp < "01JAN18:00:00:00"dt or event_time_timestamp > "31DEC18:23:59:59"dt) = 0
;
Looks messy, but it works. All cases outside of 2018 is now not copied over to withinyear.
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.
Find more tutorials on the SAS Users YouTube channel.