BookmarkSubscribeRSS Feed
abhinayingole
Obsidian | Level 7
 

Hi ,

 

I have to create join variable in below dataset (as shown in the dataset), for each USUBJID and AE. 

To check the set of observation with overlapping AEs.

 

For eg. if you see 4th observation start date is 9APR2020, which did not fall under any of the previous AE start and end date, that is why join variable value is 2 (2nd set of observation with continuous AE) 

 

How to do it programmatically.

 

I have also paste the datalines for the test dataset.

 

 

abhinayingole_0-1702360673298.png

 

data test;
informat astdt date9. aendt date9.;
input @1 usubjid $3.
@5 astdt date9.
@15 aendt date9.
@25 ae $15.;

format astdt aendt date9.;

datalines;
301 21FEB2020 06MAR2020 Anemia
301 06MAR2020 13MAR2020 Anemia
301 06MAR2020 13MAR2020 Anemia
301 09APR2020 21MAY2020 Anemia
301 09APR2020 03JUN2020 Anemia
301 21MAY2020 01JUL2020 Anemia
301 03JUN2020 01JUL2020 Anemia
301 01JUL2020 28JUL2020 Anemia
301 01JUL2020 18NOV2020 Anemia
301 28JUL2020 26AUG2020 Anemia
301 26AUG2020 21OCT2020 Anemia
301 18NOV2020 15DEC2020 Anemia
301 18NOV2020 15DEC2020 Anemia
301 15DEC2020 .         Anemia
run;

 

data test;
informat astdt date9. aendt date9.;
input @1 usubjid $3.
@5 astdt date9.
@15 aendt date9.
@25 ae $15.;

format astdt aendt date9.;

datalines;
301 21FEB2020 06MAR2020 Anemia
301 06MAR2020 13MAR2020 Anemia
301 06MAR2020 13MAR2020 Anemia
301 09APR2020 21MAY2020 Anemia
301 09APR2020 03JUN2020 Anemia
301 21MAY2020 01JUL2020 Anemia
301 03JUN2020 01JUL2020 Anemia
301 01JUL2020 28JUL2020 Anemia
301 01JUL2020 18NOV2020 Anemia
301 28JUL2020 26AUG2020 Anemia
301 26AUG2020 21OCT2020 Anemia
301 18NOV2020 15DEC2020 Anemia
301 18NOV2020 15DEC2020 Anemia
301 15DEC2020 . Anemia
run;

12 REPLIES 12
Kurt_Bremser
Super User
data want;
set test;
by usubjid ae;
if first.ae then join = 1;
if not first.ae and astdt gt lag(aendt) then join + 1;
run;

Untested, posted from my tablet.

Note that the use of a SUM Statement causes an implied RETAIN.

abhinayingole
Obsidian | Level 7

 

Thanks for the reply, but lag function will not work here as observation no. 12 still consider as overlapping AE , as observation no. 9 is 18NOV2020

 

abhinayingole_0-1702381883739.png

 

Kurt_Bremser
Super User

So we need a different approach, by remembering a maximum end date.

data want;
set test;
by usubjid ae;
retain end;
if first.ae
then do;
  join = 1;
  end = aendt;
end;
else do;
  if astdt gt end 
  then do;
    join + 1;
    end = aendt;
  end;
  else end = max(end,aendt);
end;
drop end;
run;
abhinayingole
Obsidian | Level 7

 

With this derivation, join value is 3 for observation no. 4 , we want it as 2, as the 2nd set of AE overlapping starts from observation no. 4.

 

abhinayingole_0-1702383434224.png

 

JosvanderVelden
SAS Super FREQ
Can you post the full code you executed to get the result in the image from your last post?
abhinayingole
Obsidian | Level 7

Hi Josvander,

 

  I don't have code for this, however I wanted JOIN variable in below manner.

  I hope you wanted code for below dataset , right ? 

  Please correct me if I mistaken.

 

abhinayingole_0-1702393613043.png

 

JosvanderVelden
SAS Super FREQ
I mean the code that when executed ends up with join=3 for observation 4. When I run the code proposed by Kurt in his second post I get join=2 for observation 4.
Kurt_Bremser
Super User

@abhinayingole wrote:

 

With this derivation, join value is 3 for observation no. 4 , we want it as 2, as the 2nd set of AE overlapping starts from observation no. 4.

 

abhinayingole_0-1702383434224.png

 


This cannot be the result of my code. In my code, join is only incremented, never decremented, so it can't get a 2 after a 3.

Please post the complete log from the code that gave you this result.

abhinayingole
Obsidian | Level 7

Hi Kurt,

 

Yes , you are right. apology.

there must be some mistake at my end, but it is working fine now.

 

 

abhinayingole
Obsidian | Level 7

 

Hi Kurt,

 

Thanks for the solution. it worked.

However, I have one more scenario.

 

with your derivation, we got JOIN value for 7th observation.

we need the JOIN value as 1 for this observation. as we don't know the end date of pervious observation (or missing end date for any of the previous records), so we consider this as cutaneous overlap AE.

 

The last observation JOIN value 2 is correct, though it has missing end date, but no other previous record has missing end dates.

 

do you have any solution for this.

I am pasting datalines for this test dataset below for refrence.

 

abhinayingole_0-1702482923462.png

 

data test1;
informat astdt date9. aendt date9.;
input @1 usubjid $3.
@5 astdt date9.
@15 aendt date9.
@25 ae $15.;

format astdt aendt date9.;

datalines;
101	10JUN2016 06JUL2016 vomit
101	06JUL2016 08JUL2016 vomit
101	06JUL2016 08JUL2016 vomit
101	08JUL2016 22JUL2016 vomit
101	22JUL2016 28JUL2016 vomit
101	28JUL2016 .	        vomit
101	16SEP2016 .	        vomit
202	27DEC2022 17JAN2023 vomit
202	03JAN2023 17JAN2023 vomit
202	10JAN2023 24JAN2023 vomit
202	17JAN2023 28FEB2023 vomit
202	31JAN2023 14FEB2023 vomit
202	31JAN2023 14FEB2023 vomit
202	21FEB2023 28FEB2023 vomit
202	28FEB2023 21MAR2023 vomit
202	11APR2023 .	        vomit
;
run;

 

ballardw
Super User

If you are going to post a data step that reads inline data (datalines) with fixed column input you MUST paste the code into either a code or text box opened with the "running man" or </> icons above the main message window.

The forum software will reformat text pasted into the main message window.

So your data step has invalid data for the last observation because spaces have been removed and the format for reading the Aendt variable then reads part of Anemia in the columns for the date.

 

 

abhinayingole
Obsidian | Level 7

Hi , as requested , I am pasting code at </> , please let me know if it is still invalid

 
data test;
informat astdt date9. aendt date9.;
input @1 usubjid $3.
@5 astdt date9.
@15 aendt date9.
@25 ae $15.;

format astdt aendt date9.;

datalines;
301 21FEB2020 06MAR2020 Anemia
301 06MAR2020 13MAR2020 Anemia
301 06MAR2020 13MAR2020 Anemia
301 09APR2020 21MAY2020 Anemia
301 09APR2020 03JUN2020 Anemia
301 21MAY2020 01JUL2020 Anemia
301 03JUN2020 01JUL2020 Anemia
301 01JUL2020 28JUL2020 Anemia
301 01JUL2020 18NOV2020 Anemia
301 28JUL2020 26AUG2020 Anemia
301 26AUG2020 21OCT2020 Anemia
301 18NOV2020 15DEC2020 Anemia
301 18NOV2020 15DEC2020 Anemia
301 15DEC2020 .         Anemia
run;

 

SAS Innovate 2025: Call for Content

Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!

Submit your idea!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 12 replies
  • 2118 views
  • 0 likes
  • 4 in conversation