Dear,
I have the following sample data from which i need to calculate best response and date of confirmation(DOC). Please help. Thank you
I calculate best response and DOC by following
1. If the response is CR or PR at visits, then I calculate number of days between visits(use "date" variable values). The number of days ge 28 than best response is earlier visit response and earlier visit date as DOC
eg: for id=1 number of days between visit1 and visit2 = 60(greater than 28) so best response="CR" and DOC=2017-01-01
I need to calculate between all visits
for id=1 number of days between visit1 and visit3 = 90(greater than 28) so best response="CR" and date of confirmation=2017-01-01
for id=1 number of days between visit2 and visit3 = 30(greater than 28) so best response="CR" and date of confirmation=2017-03-01
at the end of datastep for id=1:
id bestresponse DOC
1 CR 2017-01-01
1 CR 2017-01-01
1 CR 2017-03-01
for id=2, number of days between visit1 and visit2 = 10(less than 28) so final response="SD" and date of confirmation=blank
for id=2, number of days between visit1 and visit3 = 40(greater than 28) so final response="CR" and date of confirmation=2017-05-20. So on for next visits
id finalresponse DOC
2 SD
2 CR 2017-05-20
2 CR 2017-05-20
2 CR 2017-05-20
2 CR 2017-05-30
2 CR 2017-05-30
2 CR 2017-07-01
for id=5, after "CR" there is NE , so i need to take difference between "randomdate" and "date"(not between two dates). if (2017-01-01-2016-10-30) greater than 42 than final response=SD, else NE. DOC is blank
id final response DOC
5 SD
for id=7, 1.first visit=CR and second visit=NE so i take difference between date and radte.if greater than 42 than final response=SD, else NE. DOC is blank. For this subject i need to ignore second visit and third vist as second visit is NE.
2. (first visit -third visit)the number of "NE" between two CR is one. if there is one NE then i need to take difference between first CR and second CR. if gt than 28 than CR, else SD
id finalresponse DOC
7 SD
7 CR 2017-01-01
for id =8, 1.first visit=CR and second visit=NE so i take difference between date and radte.if greater than 42 than final response=SD, else NE. DOC is blank. For this subject i need to ignore second visit and third vist as second visit and third are NE.
2. (first visit -third visit)the number of "NE" between two CR response is two. if there is more than one NE then the final reponse= SD. I should not take difference between first and last CR
id final response DOC
8 SD
for ID=9, 1.1.first visit=PR and second visit=NE so i take difference between date and radte.if greater than 42 than final response=SD, else NE.
2. here more than one NE between two PR. When the response is PR, i take difference between first PR and last PR. If gt 28 han final reponse=PR. else SD
id finalresponse DOC
9 SD
9 PR 2017-01-01
For id =10
id finalresponse DOC
10 PR 2017-01-01 (first and second visit)
10 PR 2017-01-01 (first and third visit)
10 PR 2017-01-01 (first and fourth visit)
10 PR 2017-01-01 (first and fifth visit)
10 PR 2017-03-01 (second and third visit)
10 PR 2017-03-01 (second and fourth visit)
10 PR 2017-03-01 (second and fifth visit)
10 PR 2017-04-01 (third and fourth visit)
10 PR 2017-04-01 (third and fifth visit)
10 CR 2017-05-20 (third and fifth visit)
The intermediate output dataset i calculated is below. The finalresponse values are formated as CR=1, PR=2, SD=3 and PD=5,NE=4. Then I sorted the dataset by id finalresponse doc and took first,date to get output I need. My code working fine but lengthy and not getting the output i need when the new data is added.
Intermediate dataset
id fianlresponse DOC
1 CR 2017-01-01
1 CR 2017-01-01
1 CR 2017-03-01
2 SD
2 CR 2017-05-20
2 CR 2017-05-20
2 CR 2017-05-20
2 CR 2017-05-30
2 CR 2017-05-30
2 CR 2017-07-01
5 SD
7 SD
7 CR 2017-01-01
8 SD
9 SD
9 PR 2017-01-01
10 PR 2017-01-01
10 PR 2017-01-01
10 PR 2017-01-01
10 PR 2017-01-01
10 PR 2017-03-01
10 PR 2017-03-01
10 PR 2017-03-01
10 PR 2017-04-01
10 PR 2017-04-01
10 CR 2017-05-20
output i needed:
id final response doc
1 CR 2017-01-01
2 CR 2017-05-20
5 SD
7 CR 2017-01-01
8 SD
9 PR 2017-01-01
10 CR 2017-05-20
proc format;
value $aval
'CR' = 1
'PR' = 2
'SD' = 3
'PD' = 5
'NE' = 4
;
data one;
input id date$ 4-13 +1 response$ +1 randomdate $18-27;
datalines;
01 2017-01-01 CR 2016-10-30
01 2017-03-01 CR 2016-10-30
01 2017-04-01 CR 2016-10-30
02 2017-05-20 CR 2016-10-30
02 2017-05-30 CR 2016-10-30
02 2017-07-01 CR 2016-10-30
02 2017-08-30 CR 2016-10-30
05 2017-01-01 CR 2016-10-30
05 2017-01-30 NE 2016-10-30
07 2017-01-01 CR 2016-10-30
07 2017-01-30 NE 2016-10-30
07 2017-02-21 CR 2016-10-30
08 2017-01-01 CR 2016-10-30
08 2017-01-30 NE 2016-10-30
08 2017-02-30 NE 2016-10-30
08 2017-04-01 CR 2016-10-30
09 2017-01-01 PR 2016-10-30
09 2017-01-30 NE 2016-10-30
09 2017-02-30 NE 2016-10-30
09 2017-04-01 PR 2016-10-30
10 2017-01-01 PR 2016-10-30
10 2017-03-01 PR 2016-10-30
10 2017-04-01 PR 2016-10-30
10 2017-05-20 CR 2016-10-30
10 2017-06-20 CR 2016-10-30
;
This is a Q&A board, I do not see a question here. Seems like you have a reasonable spec there how are you going with it? Or were you after a contractor to do some work for you?
Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.