Current table
patientid | Event type | date | timepoint1 | timepoint2 | timepoint3 | timepoint4 |
1 | Baseline | 1/28/2020 | 1200 | 600 | 1200 | |
1 | Daily | 1/28/2020 | 1211 | 1211 | ||
1 | Daily | 1/29/2020 | 1221 | 1000 | 1221 | |
2 | Baseline | 11/21/2019 | 1011 | 1359 | 1200 | 1200 |
2 | Daily | 11/21/2019 | 1011 | 1359 | 1200 | 1200 |
2 | Daily | 11/21/2019 | 1643 | 1645 | 1645 | 1644 |
2 | Daily | 11/21/2019 | 2225 | 2225 | 2225 | 2225 |
3 | Baseline | 11/25/2019 | 1440 | 1440 | 1440 | 1440 |
3 | Daily | 11/25/2019 | 1520 | 1520 | 1520 | 1520 |
4 | Baseline | 11/11/2019 | 1259 | 1300 | 1301 | 1300 |
4 | Daily | 11/11/2019 | 1259 | 1300 | 1301 | 1300 |
4 | Daily | 11/11/2019 | 1845 | 1847 | 1845 | 1845 |
within each same patient group, i would like to create a flag if date equals to next date within each patient AND all time points match, then baseline = 1 or 0
patientid | Event type | date | timepoint1 | timepoint2 | timepoint3 | timepoint4 | baseline |
1 | Baseline | 1/28/2020 | 1200 | 600 | 1200 | 1 | |
1 | Daily | 1/28/2020 | 1211 | 1211 | 0 | ||
1 | Daily | 1/29/2020 | 1221 | 1000 | 1221 | 0 | |
2 | Baseline | 11/21/2019 | 1011 | 1359 | 1200 | 1200 | 1 |
2 | Daily | 11/21/2019 | 1011 | 1359 | 1200 | 1200 | 1 |
2 | Daily | 11/21/2019 | 1643 | 1645 | 1645 | 1644 | 0 |
2 | Daily | 11/21/2019 | 2225 | 2225 | 2225 | 2225 | 0 |
3 | Baseline | 11/25/2019 | 1440 | 1440 | 1440 | 1440 | 1 |
3 | Daily | 11/25/2019 | 1520 | 1520 | 1520 | 1520 | 0 |
4 | Baseline | 11/11/2019 | 1259 | 1300 | 1301 | 1300 | 1 |
4 | Daily | 11/11/2019 | 1259 | 1300 | 1301 | 1300 | 1 |
4 | Daily | 11/11/2019 | 1845 | 1847 | 1845 | 1845 | 0 |
I tried to divide daily and baseline tables into TWO different tables
and left join by date and then timepoints. i am really stuck
Maybe I'm mis-understanding but your written logic doesn't line up with what you've flagged as baseline=1.
If you are flagging for removal, does this remove the records you want out?
data have;
input patientid Event_type :$15. date :mmddyy10. timepoint1 timepoint2
timepoint3 timepoint4;
format date mmddyy10.;
datalines;
1 Baseline 01/28/2020 1200 600 1200 .
1 Daily 01/28/2020 1211 . 1211 .
1 Daily 01/29/2020 1221 1000 1221 .
2 Baseline 11/21/2019 1011 1359 1200 1200
2 Daily 11/21/2019 1011 1359 1200 1200
2 Daily 11/21/2019 1643 1645 1645 1644
2 Daily 11/21/2019 2225 2225 2225 2225
3 Baseline 11/25/2019 1440 1440 1440 1440
3 Daily 11/25/2019 1520 1520 1520 1520
4 Baseline 11/11/2019 1259 1300 1301 1300
4 Daily 11/11/2019 1259 1300 1301 1300
4 Daily 11/11/2019 1845 1847 1845 1845
;
run;
proc sort data=have nodupkey out=want dupout=checkdups;
by patientid date timepoint:;
run;
Duplicates Removed:
patientid | Event_type | date | timepoint1 | timepoint2 | timepoint3 | timepoint4 |
2 | Daily | 11/21/2019 | 1011 | 1359 | 1200 | 1200 |
4 | Daily | 11/11/2019 | 1259 | 1300 | 1301 | 1300 |
Want:
patientid | Event_type | date | timepoint1 | timepoint2 | timepoint3 | timepoint4 |
1 | Baseline | 01/28/2020 | 1200 | 600 | 1200 | . |
1 | Daily | 01/28/2020 | 1211 | . | 1211 | . |
1 | Daily | 01/29/2020 | 1221 | 1000 | 1221 | . |
2 | Baseline | 11/21/2019 | 1011 | 1359 | 1200 | 1200 |
2 | Daily | 11/21/2019 | 1643 | 1645 | 1645 | 1644 |
2 | Daily | 11/21/2019 | 2225 | 2225 | 2225 | 2225 |
3 | Baseline | 11/25/2019 | 1440 | 1440 | 1440 | 1440 |
3 | Daily | 11/25/2019 | 1520 | 1520 | 1520 | 1520 |
4 | Baseline | 11/11/2019 | 1259 | 1300 | 1301 | 1300 |
4 | Daily | 11/11/2019 | 1845 | 1847 | 1845 | 1845 |
Only one row matches your criteria
patientid | Event type | date | timepoint1 | timepoint2 | timepoint3 | timepoint4 |
1 | Baseline | 1/28/2020 | 1200 | 600 | 1200 | |
1 | Daily | 1/28/2020 | 1211 | 1211 | ||
1 | Daily | 1/29/2020 | 1221 | 1000 | 1221 | |
2 | Baseline | 11/21/2019 | 1011 | 1359 | 1200 | 1200 |
2 | Daily | 11/21/2019 | 1011 | 1359 | 1200 | 1200 |
2 | Daily | 11/21/2019 | 1643 | 1645 | 1645 | 1644 |
2 | Daily | 11/21/2019 | 2225 | 2225 | 2225 | 2225 |
3 | Baseline | 11/25/2019 | 1440 | 1440 | 1440 | 1440 |
3 | Daily | 11/25/2019 | 1520 | 1520 | 1520 | 1520 |
4 | Baseline | 11/11/2019 | 1259 | 1300 | 1301 | 1300 |
4 | Daily | 11/11/2019 | 1259 | 1300 | 1301 | 1300 |
4 | Daily | 11/11/2019 | 1845 | 1847 | 1845 | 1845 |
within each same patient group, i would like to create a flag if date equals to next date within each patient AND all time points match
Sorry I meant as in if group is Baseline or Daily (consequential) for the same patient AND all time points and dates match then baseline = 1
> I meant as in if group is Baseline or Daily (consequential) for the same patient AND all time points and dates match then baseline = 1
This still doesn't make match your result.
It appears to me that:
The program below is untested (I'm not bothering to put your data into a DATA step):
data want;
set have;
by patientid;
array tp timepoint: ; /* Editted */
baseline=1; /* Default to 1 */
do over tp; /* If any timepoint difference, convert to 0 */
if lag(tp) ^= tp then baseline=0;
end;
if first.patientid=1 then baseline=1;
run;
If, however, the only daily records actually eligible for baseline=1 are the 2nd records of each patientid, whose timepoints match the baseline timepoints, then this more compact program will do:
data want;
set have ;
by patientid timepoint1 timepoint2 timepoint3 notsorted;
baseline=0; /* Editted additional statement */
if first.patientid=1 then baseline=1;
if lag(first.patientid)=1 and first.patientid=0 and first.timepoint3=0 then baseline=1;
run;
Notes:
You need explain more details about your question.
data have;
infile cards expandtabs truncover;
input patientid Eventtype $ date : $20. timepoint1 timepoint2 timepoint3 timepoint4;
cards;
1 Baseline 1/28/2020 1200 600 1200
1 Daily 1/28/2020 1211 1211
1 Daily 1/29/2020 1221 1000 1221
2 Baseline 11/21/2019 1011 1359 1200 1200
2 Daily 11/21/2019 1011 1359 1200 1200
2 Daily 11/21/2019 1643 1645 1645 1644
2 Daily 11/21/2019 2225 2225 2225 2225
3 Baseline 11/25/2019 1440 1440 1440 1440
3 Daily 11/25/2019 1520 1520 1520 1520
4 Baseline 11/11/2019 1259 1300 1301 1300
4 Daily 11/11/2019 1259 1300 1301 1300
4 Daily 11/11/2019 1845 1847 1845 1845
;
data want;
set have;
by patientid;
retain _d _t1-_t4 .;
baseline=0;
if first.patientid then do;
call missing(of _:);
if Eventtype='Baseline' then do;
baseline=1;
_d=date;_t1=timepoint1;_t2=timepoint2;
_t3=timepoint3;_t4=timepoint4;
end;
end;
if patientid=lag(patientid) and _d=date and
_t1=timepoint1 and _t2=timepoint2 and
_t3=timepoint3 and _t4=timepoint4 then baseline=1;
drop _:;
run;
proc print;run;
Maybe I'm mis-understanding but your written logic doesn't line up with what you've flagged as baseline=1.
If you are flagging for removal, does this remove the records you want out?
data have;
input patientid Event_type :$15. date :mmddyy10. timepoint1 timepoint2
timepoint3 timepoint4;
format date mmddyy10.;
datalines;
1 Baseline 01/28/2020 1200 600 1200 .
1 Daily 01/28/2020 1211 . 1211 .
1 Daily 01/29/2020 1221 1000 1221 .
2 Baseline 11/21/2019 1011 1359 1200 1200
2 Daily 11/21/2019 1011 1359 1200 1200
2 Daily 11/21/2019 1643 1645 1645 1644
2 Daily 11/21/2019 2225 2225 2225 2225
3 Baseline 11/25/2019 1440 1440 1440 1440
3 Daily 11/25/2019 1520 1520 1520 1520
4 Baseline 11/11/2019 1259 1300 1301 1300
4 Daily 11/11/2019 1259 1300 1301 1300
4 Daily 11/11/2019 1845 1847 1845 1845
;
run;
proc sort data=have nodupkey out=want dupout=checkdups;
by patientid date timepoint:;
run;
Duplicates Removed:
patientid | Event_type | date | timepoint1 | timepoint2 | timepoint3 | timepoint4 |
2 | Daily | 11/21/2019 | 1011 | 1359 | 1200 | 1200 |
4 | Daily | 11/11/2019 | 1259 | 1300 | 1301 | 1300 |
Want:
patientid | Event_type | date | timepoint1 | timepoint2 | timepoint3 | timepoint4 |
1 | Baseline | 01/28/2020 | 1200 | 600 | 1200 | . |
1 | Daily | 01/28/2020 | 1211 | . | 1211 | . |
1 | Daily | 01/29/2020 | 1221 | 1000 | 1221 | . |
2 | Baseline | 11/21/2019 | 1011 | 1359 | 1200 | 1200 |
2 | Daily | 11/21/2019 | 1643 | 1645 | 1645 | 1644 |
2 | Daily | 11/21/2019 | 2225 | 2225 | 2225 | 2225 |
3 | Baseline | 11/25/2019 | 1440 | 1440 | 1440 | 1440 |
3 | Daily | 11/25/2019 | 1520 | 1520 | 1520 | 1520 |
4 | Baseline | 11/11/2019 | 1259 | 1300 | 1301 | 1300 |
4 | Daily | 11/11/2019 | 1845 | 1847 | 1845 | 1845 |
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.