I have multiple observations per person. What I'm trying to do is if COMPDT = '.' for all rows for that same ID, then I just want to keep one of the obs(they're duplicates in this case). However, if COMPFL='Y' then I want ESDT=COMPDT and ETDT=ETDT where
DSSEQ =1 for that id. How can I do this?
Current data
ID DSSEQ ESDT ETDT COMPFL COMPDT
001 1 01Jan2016 01Jan2016 N .
001 2 01Jan2016 01Jan2016 N .
002 4 07JUL2014 07JUL2014 N .
002 2 13NOV2012 13NOV2012 Y 13NOV2012
002 3 06DEC2018 06DEC2018 N .
002 1 14FEB2020 14FEB2020 N .
003 2 25OCT2018 25OCT2018 N .
003 1 25OCT2018 25OCT2018 N .
Wanted data
ID DSSEQ ESDT ETDT
001 1 01Jan2016 01Jan2016
002 2 13NOV2012 14FEB2020
003 1 25OCT2018 25OCT2018
Assuming that every ID has exactly one DSSEQ=1 record, and that any ID which has a non-missing COMPDT has only one such record, then a self-MERGE with a selective KEEP parameter will work just fine:
data want (drop=compfl compdt);
merge have (where=(dsseq=1))
have (keep=id dsseq compdt where=(compdt^=.));
by id;
if compdt^=. then esdt=compdt;
run;
This also assume data are sorted by ID.
Assuming that every ID has exactly one DSSEQ=1 record, and that any ID which has a non-missing COMPDT has only one such record, then a self-MERGE with a selective KEEP parameter will work just fine:
data want (drop=compfl compdt);
merge have (where=(dsseq=1))
have (keep=id dsseq compdt where=(compdt^=.));
by id;
if compdt^=. then esdt=compdt;
run;
This also assume data are sorted by ID.
April 27 – 30 | Gaylord Texan | Grapevine, Texas
Walk in ready to learn. Walk out ready to deliver. This is the data and AI conference you can't afford to miss.
Register now and lock in 2025 pricing—just $495!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.