DATA Step, Macro, Functions and more

Remove Row by User ID

Accepted Solution Solved
Reply
Frequent Contributor
Posts: 76
Accepted Solution

Remove Row by User ID

[ Edited ]

Hello all,

 

I have a dataset with multple rows per a user id and I would like to delete rows that are less than the date that a specific conditionstart for each id.

This is what it looks like before:

ID Start_of_Event Collection_Date
1 0 15-Jan
1 1 16-Jan
1 0 17-Jan
2 1 20-Feb
2 0 28-Feb
3 0 30-Oct
3 1 31-Oct
3 0 1-Nov
4 1 15-Jul
4 0 16-Jul
4 0 17-Jul
4 0 18-Jul

 

This is what it looks like after:

ID Start_of_Event Collection_Date
1 1 16-Jan
1 0 17-Jan
2 1 20-Feb
2 0 28-Feb
3 1 31-Oct
3 0 1-Nov
4 1 15-Jul
4 0 16-Jul
4 0 17-Jul
4 0 18-Jul

 

 

I have tried using logic from this, this, and this, but I have not had much luck. I was trying the lag function, but from what I have read it doesn't work well in conditionals or by id. I have also tried using first.id, but not all ids have dates that are less than the start date.

 

Any help would be much appreciated.

 

EDIT: To expand

 

To expand: I would like like to delete observations that are less than the collection date of the condition (Start_of_Event = 1). For example: ID = 1 has a single observation that less than the (15-Jan < 16-Jan), only one observation will be removed. ID = 2 does not have any events prior, so no rows will be removed for that ID. I have sorted the data by id and collection date.

 


Accepted Solutions
Solution
‎03-16-2017 03:41 PM
Frequent Contributor
Posts: 75

Re: Remove Row by User ID

Posted in reply to daszlosek

I sthis what you need? I get the results you have listed:

data have;
input ID Start_of_Event Collection_Date  date9.;
format Collection_Date date9.;
cards;
1 0 15Jan2017
1 1 16Jan2017
1 0 17Jan2017 
2 1 20Feb2017 
2 0 28Feb2017 
3 0 30Oct2017 
3 1 31Oct2017 
3 0 1Nov2017 
4 1 15Jul2017 
4 0 16Jul2017 
4 0 17Jul2017 
4 0 18Jul2017 
;
run;

proc sql;
create table want as
select a.*
from have a, have b
where a.id=b.id
and a.Collection_Date>=b.Collection_Date
and b.Start_of_Event=1;
quit;

View solution in original post


All Replies
Regular Contributor
Posts: 240

Re: Remove Row by User ID

Posted in reply to daszlosek

Hello,

 

data want;
set have;
by ID;
retain start;
if first.ID then start=0;

if start=0 then start=Start_of_Event;

if start;
run;
Super User
Posts: 11,343

Re: Remove Row by User ID

Posted in reply to daszlosek

If your requirement is more complex then "delet if the first id and the start_event is 0" the please expand on you condtion.

If not then perhaps

data want;
   set have;
   by id;
   if first.id and start_event=0 then delete;
run;

assuming the data is sorted by id. If not you would NOTSORTED after ID in the by statement.

 

Solution
‎03-16-2017 03:41 PM
Frequent Contributor
Posts: 75

Re: Remove Row by User ID

Posted in reply to daszlosek

I sthis what you need? I get the results you have listed:

data have;
input ID Start_of_Event Collection_Date  date9.;
format Collection_Date date9.;
cards;
1 0 15Jan2017
1 1 16Jan2017
1 0 17Jan2017 
2 1 20Feb2017 
2 0 28Feb2017 
3 0 30Oct2017 
3 1 31Oct2017 
3 0 1Nov2017 
4 1 15Jul2017 
4 0 16Jul2017 
4 0 17Jul2017 
4 0 18Jul2017 
;
run;

proc sql;
create table want as
select a.*
from have a, have b
where a.id=b.id
and a.Collection_Date>=b.Collection_Date
and b.Start_of_Event=1;
quit;

Super User
Posts: 10,046

Re: Remove Row by User ID

Posted in reply to daszlosek

data have;
input ID Start_of_Event Collection_Date  date9.;
format Collection_Date date9.;
cards;
1 0 15Jan2017
1 1 16Jan2017
1 0 17Jan2017 
2 1 20Feb2017 
2 0 28Feb2017 
3 0 30Oct2017 
3 1 31Oct2017 
3 0 1Nov2017 
4 1 15Jul2017 
4 0 16Jul2017 
4 0 17Jul2017 
4 0 18Jul2017 
;
run;
data want;
 set have;
 by ID;
 retain found;
 if first.id then found=0;
 if start_of_event=1 then found=1;
 if found;
run;


☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 4 replies
  • 160 views
  • 2 likes
  • 5 in conversation