Carmine_Rossi
Calcite | Level 5

Good day SAS users,

 

I have a question concerning data manipulation in a longitudinal analysis where I wanted to drop all observations after the 1st occurrence of an event. Suppose I had the following data set and I wanted to drop the observations that I have listed. They all occur after the first instance of an event. How would this be done?

 

Id  Time  Event  Drop
1   1     0
1   2     0
1   3     1
1   4     0      <- Drop
2   1     0
2   2     1
2   3     1      <- Drop
2   4     1      <- Drop
3   1     0
3   2     1
3   3     0      <- Drop
3   4     1      <- Drop

 

Have a nice day,

-Carmine

Accepted Solution
mkeintz
PROC Star

This can easily be done by retaining a flag, _KEEPFLAG (1 = keep; greater than 1 = don't keep).

 

 

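data have;
  input id time event;
datalines;
1 1 0
1 2 0
1 3 1
1 4 0
2 1 0
2 2 1
2 3 1
2 4 1
3 1 0
3 2 1
3 3 0
3 4 1
;
run;

/* Keep records up to and including the first event within each ID */
data want (drop=_:);
  set have;
  by id;                          /* HAVE must be sorted by ID */
  if first.id then _keepflag=1;   /* reset at the start of each ID */
  if _keepflag=1;                 /* subsetting IF: continue only while no earlier event */
  _keepflag+event;                /* sum statement: exceeds 1 once an event has occurred */
run;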

 

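Notes:

  1. _keepflag is initialized to 1 at the start of each BY group.
  2. The statement _keepflag+event is a sum statement, which increments _keepflag whenever a non-zero event is encountered. Because it is a sum statement, _keepflag is also retained from record to record.
  3. The subsetting IF comes before the sum statement, so the record that carries the first event is still output; only the records after it see _keepflag greater than 1 and are dropped.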
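
To illustrate note 2, the sum statement can be spelled out with an explicit RETAIN. This is only a sketch of equivalent logic, assuming the HAVE data set above and an event variable coded 0/1; the output name want_explicit is hypothetical and not part of the original answer.

```sas
data want_explicit (drop=_keepflag);   /* want_explicit is a hypothetical name */
  set have;
  by id;
  retain _keepflag;                    /* explicit RETAIN replaces the implicit one */
  if first.id then _keepflag=1;        /* reset at the start of each ID */
  if _keepflag ne 1 then delete;       /* same effect as the subsetting IF */
  _keepflag=sum(_keepflag, event);     /* SUM() ignores a missing event, as the sum statement does */
run;
```

The sum statement is the more compact form; the explicit version just makes the retain and the missing-value handling visible.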

5 Replies
Reeza
Super User

1. Create a flag if the event has occurred. 

2. Use RETAIN to keep the flag

3. If the flag is set delete the record. 

4. Reset the flag at the first record for each ID

 

The steps end up being coded out of order because the record that contains the event must not be deleted: the delete check has to run before the flag is set for the current record.

 

data want;
  set have;
  by ID;
  retain flag; *Step 2;
  if first.ID then flag=0; *Step 4;
  if flag=1 then delete; *Step 3;
  if event=1 then flag=1; *Step 1;
run;


Carmine_Rossi
Calcite | Level 5

Good day Reeza,

 

Thank you for your help. The code provided seems to stop at the first event in the whole data set and then drops everything after it, including all records for the other patients. That is what I want to do, but separately within each study_ID. I did include the BY statement from your code; even with it, the result is as described above.

 

Thanks again,

Carmine

Reeza
Super User

With your sample data and my exact code from the previous post, it works exactly as you specified. See the tested version below.

Please post your code and log if you're having issues, or explain in detail how the result doesn't meet your requirements. It's possible I overlooked something.

In the future, please post your data as text at minimum, or preferably as a data step. I took the time to write code to generate your data this time but won't again.

 

data have;
    do ID=1 to 3;
        do time=1 to 4;
            event=0;
            if (id=1 and time=3) or (id=2 and time in (2, 3, 4))
                or (id=3 and time in (2, 4)) then event=1;
            output;
        end;
    end;
run;

proc sort data=have;
    by id time;
run;

data want;
    set have;
    by ID;
    retain flag;                 *Step 2;
    if first.ID then flag=0;     *Step 4;
    if flag=1 then delete;       *Step 3;
    if event=1 then flag=1;      *Step 1;
run;

proc print data=have;
run;

proc print data=want;
run;
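
One way to check the result beyond reading the PROC PRINT output is a per-ID summary. This is a sketch, assuming the WANT data set produced above with the variables id, time, and event; the column names n_rows, n_events, and last_time are made up for illustration.

```sas
proc sql;
  select id,
         count(*)   as n_rows,     /* records kept for this ID */
         sum(event) as n_events,   /* should never exceed 1 */
         max(time)  as last_time   /* time of the last record kept */
  from want
  group by id;
quit;
```

With the sample data, each ID should show exactly one event, with 3, 2, and 2 rows kept for IDs 1, 2, and 3. If a real data set shows later IDs missing entirely, the flag is probably not being reset, which points at the BY variable or the sort order.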

Carmine_Rossi
Calcite | Level 5

Thank you very much mkeintz! This is what I was looking for. I appreciate the time you and others have contributed in helping to answer this question!

 

All the best,

-Carmine
