Dear,
I have the following sample data from which i need to calculate best response and date of confirmation(DOC). Please help. Thank you
I calculate best response and DOC by following
1. If the response is CR or PR at visits, then I calculate number of days between visits(use "date" variable values). The number of days ge 28 than best response is earlier visit response and earlier visit date as DOC
eg: for id=1 number of days between visit1 and visit2 = 60(greater than 28) so best response="CR" and DOC=2017-01-01
I need to calculate between all visits
for id=1 number of days between visit1 and visit3 = 90(greater than 28) so best response="CR" and date of confirmation=2017-01-01
for id=1 number of days between visit2 and visit3 = 30(greater than 28) so best response="CR" and date of confirmation=2017-03-01
at the end of datastep for id=1:
id bestresponse DOC
1 CR 2017-01-01
1 CR 2017-01-01
1 CR 2017-03-01
for id=2, number of days between visit1 and visit2 = 10(less than 28) so final response="SD" and date of confirmation=blank
for id=2, number of days between visit1 and visit3 = 40(greater than 28) so final response="CR" and date of confirmation=2017-05-20. So on for next visits
id finalresponse DOC
2 SD
2 CR 2017-05-20
2 CR 2017-05-20
2 CR 2017-05-20
2 CR 2017-05-30
2 CR 2017-05-30
2 CR 2017-07-01
for id=5, after "CR" there is NE , so i need to take difference between "randomdate" and "date"(not between two dates). if (2017-01-01-2016-10-30) greater than 42 than final response=SD, else NE. DOC is blank
id final response DOC
5 SD
for id=7, 1.first visit=CR and second visit=NE so i take difference between date and radte.if greater than 42 than final response=SD, else NE. DOC is blank. For this subject i need to ignore second visit and third vist as second visit is NE.
2. (first visit -third visit)the number of "NE" between two CR is one. if there is one NE then i need to take difference between first CR and second CR. if gt than 28 than CR, else SD
id finalresponse DOC
7 SD
7 CR 2017-01-01
for id =8, 1.first visit=CR and second visit=NE so i take difference between date and radte.if greater than 42 than final response=SD, else NE. DOC is blank. For this subject i need to ignore second visit and third vist as second visit and third are NE.
2. (first visit -third visit)the number of "NE" between two CR response is two. if there is more than one NE then the final reponse= SD. I should not take difference between first and last CR
id final response DOC
8 SD
for ID=9, 1.1.first visit=PR and second visit=NE so i take difference between date and radte.if greater than 42 than final response=SD, else NE.
2. here more than one NE between two PR. When the response is PR, i take difference between first PR and last PR. If gt 28 han final reponse=PR. else SD
id finalresponse DOC
9 SD
9 PR 2017-01-01
For id =10
id finalresponse DOC
10 PR 2017-01-01 (first and second visit)
10 PR 2017-01-01 (first and third visit)
10 PR 2017-01-01 (first and fourth visit)
10 PR 2017-01-01 (first and fifth visit)
10 PR 2017-03-01 (second and third visit)
10 PR 2017-03-01 (second and fourth visit)
10 PR 2017-03-01 (second and fifth visit)
10 PR 2017-04-01 (third and fourth visit)
10 PR 2017-04-01 (third and fifth visit)
10 CR 2017-05-20 (third and fifth visit)
The intermediate output dataset i calculated is below. The finalresponse values are formated as CR=1, PR=2, SD=3 and PD=5,NE=4. Then I sorted the dataset by id finalresponse doc and took first,date to get output I need. My code working fine but lengthy and not getting the output i need when the new data is added.
Intermediate dataset
id fianlresponse DOC
1 CR 2017-01-01
1 CR 2017-01-01
1 CR 2017-03-01
2 SD
2 CR 2017-05-20
2 CR 2017-05-20
2 CR 2017-05-20
2 CR 2017-05-30
2 CR 2017-05-30
2 CR 2017-07-01
5 SD
7 SD
7 CR 2017-01-01
8 SD
9 SD
9 PR 2017-01-01
10 PR 2017-01-01
10 PR 2017-01-01
10 PR 2017-01-01
10 PR 2017-01-01
10 PR 2017-03-01
10 PR 2017-03-01
10 PR 2017-03-01
10 PR 2017-04-01
10 PR 2017-04-01
10 CR 2017-05-20
output i needed:
id final response doc
1 CR 2017-01-01
2 CR 2017-05-20
5 SD
7 CR 2017-01-01
8 SD
9 PR 2017-01-01
10 CR 2017-05-20
proc format;
value $aval
'CR' = 1
'PR' = 2
'SD' = 3
'PD' = 5
'NE' = 4
;
data one;
input id date$ 4-13 +1 response$ +1 randomdate $18-27;
datalines;
01 2017-01-01 CR 2016-10-30
01 2017-03-01 CR 2016-10-30
01 2017-04-01 CR 2016-10-30
02 2017-05-20 CR 2016-10-30
02 2017-05-30 CR 2016-10-30
02 2017-07-01 CR 2016-10-30
02 2017-08-30 CR 2016-10-30
05 2017-01-01 CR 2016-10-30
05 2017-01-30 NE 2016-10-30
07 2017-01-01 CR 2016-10-30
07 2017-01-30 NE 2016-10-30
07 2017-02-21 CR 2016-10-30
08 2017-01-01 CR 2016-10-30
08 2017-01-30 NE 2016-10-30
08 2017-02-30 NE 2016-10-30
08 2017-04-01 CR 2016-10-30
09 2017-01-01 PR 2016-10-30
09 2017-01-30 NE 2016-10-30
09 2017-02-30 NE 2016-10-30
09 2017-04-01 PR 2016-10-30
10 2017-01-01 PR 2016-10-30
10 2017-03-01 PR 2016-10-30
10 2017-04-01 PR 2016-10-30
10 2017-05-20 CR 2016-10-30
10 2017-06-20 CR 2016-10-30
;
This is a Q&A board, I do not see a question here. Seems like you have a reasonable spec there how are you going with it? Or were you after a contractor to do some work for you?
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.