- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Current table
patientid | Event type | date | timepoint1 | timepoint2 | timepoint3 | timepoint4 |
1 | Baseline | 1/28/2020 | 1200 | 600 | 1200 | |
1 | Daily | 1/28/2020 | 1211 | 1211 | ||
1 | Daily | 1/29/2020 | 1221 | 1000 | 1221 | |
2 | Baseline | 11/21/2019 | 1011 | 1359 | 1200 | 1200 |
2 | Daily | 11/21/2019 | 1011 | 1359 | 1200 | 1200 |
2 | Daily | 11/21/2019 | 1643 | 1645 | 1645 | 1644 |
2 | Daily | 11/21/2019 | 2225 | 2225 | 2225 | 2225 |
3 | Baseline | 11/25/2019 | 1440 | 1440 | 1440 | 1440 |
3 | Daily | 11/25/2019 | 1520 | 1520 | 1520 | 1520 |
4 | Baseline | 11/11/2019 | 1259 | 1300 | 1301 | 1300 |
4 | Daily | 11/11/2019 | 1259 | 1300 | 1301 | 1300 |
4 | Daily | 11/11/2019 | 1845 | 1847 | 1845 | 1845 |
within each same patient group, i would like to create a flag if date equals to next date within each patient AND all time points match, then baseline = 1 or 0
patientid | Event type | date | timepoint1 | timepoint2 | timepoint3 | timepoint4 | baseline |
1 | Baseline | 1/28/2020 | 1200 | 600 | 1200 | 1 | |
1 | Daily | 1/28/2020 | 1211 | 1211 | 0 | ||
1 | Daily | 1/29/2020 | 1221 | 1000 | 1221 | 0 | |
2 | Baseline | 11/21/2019 | 1011 | 1359 | 1200 | 1200 | 1 |
2 | Daily | 11/21/2019 | 1011 | 1359 | 1200 | 1200 | 1 |
2 | Daily | 11/21/2019 | 1643 | 1645 | 1645 | 1644 | 0 |
2 | Daily | 11/21/2019 | 2225 | 2225 | 2225 | 2225 | 0 |
3 | Baseline | 11/25/2019 | 1440 | 1440 | 1440 | 1440 | 1 |
3 | Daily | 11/25/2019 | 1520 | 1520 | 1520 | 1520 | 0 |
4 | Baseline | 11/11/2019 | 1259 | 1300 | 1301 | 1300 | 1 |
4 | Daily | 11/11/2019 | 1259 | 1300 | 1301 | 1300 | 1 |
4 | Daily | 11/11/2019 | 1845 | 1847 | 1845 | 1845 | 0 |
I tried to divide daily and baseline tables into TWO different tables
and left join by date and then timepoints. i am really stuck
Accepted Solutions
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Maybe I'm mis-understanding but your written logic doesn't line up with what you've flagged as baseline=1.
If you are flagging for removal, does this remove the records you want out?
data have;
input patientid Event_type :$15. date :mmddyy10. timepoint1 timepoint2
timepoint3 timepoint4;
format date mmddyy10.;
datalines;
1 Baseline 01/28/2020 1200 600 1200 .
1 Daily 01/28/2020 1211 . 1211 .
1 Daily 01/29/2020 1221 1000 1221 .
2 Baseline 11/21/2019 1011 1359 1200 1200
2 Daily 11/21/2019 1011 1359 1200 1200
2 Daily 11/21/2019 1643 1645 1645 1644
2 Daily 11/21/2019 2225 2225 2225 2225
3 Baseline 11/25/2019 1440 1440 1440 1440
3 Daily 11/25/2019 1520 1520 1520 1520
4 Baseline 11/11/2019 1259 1300 1301 1300
4 Daily 11/11/2019 1259 1300 1301 1300
4 Daily 11/11/2019 1845 1847 1845 1845
;
run;
proc sort data=have nodupkey out=want dupout=checkdups;
by patientid date timepoint:;
run;
Duplicates Removed:
patientid | Event_type | date | timepoint1 | timepoint2 | timepoint3 | timepoint4 |
2 | Daily | 11/21/2019 | 1011 | 1359 | 1200 | 1200 |
4 | Daily | 11/11/2019 | 1259 | 1300 | 1301 | 1300 |
Want:
patientid | Event_type | date | timepoint1 | timepoint2 | timepoint3 | timepoint4 |
1 | Baseline | 01/28/2020 | 1200 | 600 | 1200 | . |
1 | Daily | 01/28/2020 | 1211 | . | 1211 | . |
1 | Daily | 01/29/2020 | 1221 | 1000 | 1221 | . |
2 | Baseline | 11/21/2019 | 1011 | 1359 | 1200 | 1200 |
2 | Daily | 11/21/2019 | 1643 | 1645 | 1645 | 1644 |
2 | Daily | 11/21/2019 | 2225 | 2225 | 2225 | 2225 |
3 | Baseline | 11/25/2019 | 1440 | 1440 | 1440 | 1440 |
3 | Daily | 11/25/2019 | 1520 | 1520 | 1520 | 1520 |
4 | Baseline | 11/11/2019 | 1259 | 1300 | 1301 | 1300 |
4 | Daily | 11/11/2019 | 1845 | 1847 | 1845 | 1845 |
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Only one row matches your criteria
patientid | Event type | date | timepoint1 | timepoint2 | timepoint3 | timepoint4 |
1 | Baseline | 1/28/2020 | 1200 | 600 | 1200 | |
1 | Daily | 1/28/2020 | 1211 | 1211 | ||
1 | Daily | 1/29/2020 | 1221 | 1000 | 1221 | |
2 | Baseline | 11/21/2019 | 1011 | 1359 | 1200 | 1200 |
2 | Daily | 11/21/2019 | 1011 | 1359 | 1200 | 1200 |
2 | Daily | 11/21/2019 | 1643 | 1645 | 1645 | 1644 |
2 | Daily | 11/21/2019 | 2225 | 2225 | 2225 | 2225 |
3 | Baseline | 11/25/2019 | 1440 | 1440 | 1440 | 1440 |
3 | Daily | 11/25/2019 | 1520 | 1520 | 1520 | 1520 |
4 | Baseline | 11/11/2019 | 1259 | 1300 | 1301 | 1300 |
4 | Daily | 11/11/2019 | 1259 | 1300 | 1301 | 1300 |
4 | Daily | 11/11/2019 | 1845 | 1847 | 1845 | 1845 |
within each same patient group, i would like to create a flag if date equals to next date within each patient AND all time points match
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Sorry I meant as in if group is Baseline or Daily (consequential) for the same patient AND all time points and dates match then baseline = 1
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
> I meant as in if group is Baseline or Daily (consequential) for the same patient AND all time points and dates match then baseline = 1
This still doesn't make match your result.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
It appears to me that:
- Every ID begins with a baseline record, each ID has only one baseline record, and each baseline record has (unsurprisingly) the baseline flag set to one. The programs below assumes so.
- But for DAILY records, baseline=1 ONLY if all the timepoint values match the preceding record. In your example the only daily records that fill this condition are the records immediate following the baseline record. Is that true for your data? Or can there be any other daily record that gets baseline=1 because its timepoint values match the preceding daily record? If so, this program will flag such baseline records.
The program below is untested (I'm not bothering to put your data into a DATA step):
data want;
set have;
by patientid;
array tp timepoint: ; /* Editted */
baseline=1; /* Default to 1 */
do over tp; /* If any timepoint difference, convert to 0 */
if lag(tp) ^= tp then baseline=0;
end;
if first.patientid=1 then baseline=1;
run;
If, however, the only daily records actually eligible for baseline=1 are the 2nd records of each patientid, whose timepoints match the baseline timepoints, then this more compact program will do:
data want;
set have ;
by patientid timepoint1 timepoint2 timepoint3 notsorted;
baseline=0; /* Editted additional statement */
if first.patientid=1 then baseline=1;
if lag(first.patientid)=1 and first.patientid=0 and first.timepoint3=0 then baseline=1;
run;
Notes:
- The "notsorted" keyword at the end of the BY statement tells sas that the data are grouped, but not necessarily in ascending order for all the by-variables.
- The "if first.patientid=1" test for whether the record-in-hand is the start of a patientid (which this program assumes is a baseline type of record).
- The "if lag(first.patientid)=1 and first.patientid=0" condition is true only for the 2nd record of each patientid (Note the "and first.patientid=0" part is required in case the preceding record was the only record for the preceding patientid).
and
the "and first.timepoint3=0" condition is true only when all the by-variables in the current record match their values in the following record.
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set
Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets
--------------------------
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
You need explain more details about your question.data have;
infile cards expandtabs truncover;
input patientid Eventtype $ date : $20. timepoint1 timepoint2 timepoint3 timepoint4;
cards;
1 Baseline 1/28/2020 1200 600 1200
1 Daily 1/28/2020 1211 1211
1 Daily 1/29/2020 1221 1000 1221
2 Baseline 11/21/2019 1011 1359 1200 1200
2 Daily 11/21/2019 1011 1359 1200 1200
2 Daily 11/21/2019 1643 1645 1645 1644
2 Daily 11/21/2019 2225 2225 2225 2225
3 Baseline 11/25/2019 1440 1440 1440 1440
3 Daily 11/25/2019 1520 1520 1520 1520
4 Baseline 11/11/2019 1259 1300 1301 1300
4 Daily 11/11/2019 1259 1300 1301 1300
4 Daily 11/11/2019 1845 1847 1845 1845
;
data want;
set have;
by patientid;
retain _d _t1-_t4 .;
baseline=0;
if first.patientid then do;
call missing(of _:);
if Eventtype='Baseline' then do;
baseline=1;
_d=date;_t1=timepoint1;_t2=timepoint2;
_t3=timepoint3;_t4=timepoint4;
end;
end;
if patientid=lag(patientid) and _d=date and
_t1=timepoint1 and _t2=timepoint2 and
_t3=timepoint3 and _t4=timepoint4 then baseline=1;
drop _:;
run;
proc print;run;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Maybe I'm mis-understanding but your written logic doesn't line up with what you've flagged as baseline=1.
If you are flagging for removal, does this remove the records you want out?
data have;
input patientid Event_type :$15. date :mmddyy10. timepoint1 timepoint2
timepoint3 timepoint4;
format date mmddyy10.;
datalines;
1 Baseline 01/28/2020 1200 600 1200 .
1 Daily 01/28/2020 1211 . 1211 .
1 Daily 01/29/2020 1221 1000 1221 .
2 Baseline 11/21/2019 1011 1359 1200 1200
2 Daily 11/21/2019 1011 1359 1200 1200
2 Daily 11/21/2019 1643 1645 1645 1644
2 Daily 11/21/2019 2225 2225 2225 2225
3 Baseline 11/25/2019 1440 1440 1440 1440
3 Daily 11/25/2019 1520 1520 1520 1520
4 Baseline 11/11/2019 1259 1300 1301 1300
4 Daily 11/11/2019 1259 1300 1301 1300
4 Daily 11/11/2019 1845 1847 1845 1845
;
run;
proc sort data=have nodupkey out=want dupout=checkdups;
by patientid date timepoint:;
run;
Duplicates Removed:
patientid | Event_type | date | timepoint1 | timepoint2 | timepoint3 | timepoint4 |
2 | Daily | 11/21/2019 | 1011 | 1359 | 1200 | 1200 |
4 | Daily | 11/11/2019 | 1259 | 1300 | 1301 | 1300 |
Want:
patientid | Event_type | date | timepoint1 | timepoint2 | timepoint3 | timepoint4 |
1 | Baseline | 01/28/2020 | 1200 | 600 | 1200 | . |
1 | Daily | 01/28/2020 | 1211 | . | 1211 | . |
1 | Daily | 01/29/2020 | 1221 | 1000 | 1221 | . |
2 | Baseline | 11/21/2019 | 1011 | 1359 | 1200 | 1200 |
2 | Daily | 11/21/2019 | 1643 | 1645 | 1645 | 1644 |
2 | Daily | 11/21/2019 | 2225 | 2225 | 2225 | 2225 |
3 | Baseline | 11/25/2019 | 1440 | 1440 | 1440 | 1440 |
3 | Daily | 11/25/2019 | 1520 | 1520 | 1520 | 1520 |
4 | Baseline | 11/11/2019 | 1259 | 1300 | 1301 | 1300 |
4 | Daily | 11/11/2019 | 1845 | 1847 | 1845 | 1845 |