Dear,
i need help in my code. I trying to create find best response in the data.
I have to follow:1. best time response before orres='pd' or up to visit='cy6' and follow order cr>pr>sd>pd>ne
output needed;
id param aval visit date
1 bor pr cy6 2016-11-10
2 bor sd cy4 2016-08-02
3 bor cr cy4 2016-08-02
4 bor cr cy4 2016-08-02
5 bor pr cy5 2016-11-10
data one;
input id test $ orres $ visit $ date $10.;
datalines;
1 ovrp sd cy5 2016-08-02
1 ovrp pr cy6 2016-11-10
1 ovrp pd cy7 2016-12-10
2 ovrp sd cy4 2016-08-02
2 ovrp pd cy5 2016-11-10
2 ovrp pd cy6 2016-11-10
3 ovrp cr cy4 2016-08-02
3 ovrp pr cy5 2016-11-10
3 ovrp pd cy6 2016-11-10
4 ovrp cr cy4 2016-08-02
4 ovrp pr cy5 2016-11-10
5 ovrp cr cy4 2016-08-02
5 ovrp pr cy5 2016-11-10
5 ovrp sd dvs 2016-11-10
;
proc sort data=one;
by id date visit;
run;
data two;
retain param avalc adt;
do until(last.id);
set one;
by id;
if test='pd' or visit='cy6' then do;
param='bor';
avalc=orres;
adt=input(date,is8601da.);
format adt yymmdd10.;
output;
end;
end;
run;
1.Why this output?
3 bor cr cy4 2016-08-02
There is is no 'pd' or 'cy6'.
2. What does this mean?
follow order cr>pr>sd>pd>ne
Sorry. i did not explain clearly.
pd=progressive disease
bor=best respose
For each subject I have to find 'bor' before the patient on or before visit='cy6' or befoe='pd'.
if there are none, then I need to pick the best by following order(cr>pr>sd>pd>ne).
For id=4, there is no orres=' pd' and visit='cy6' so the bor='cr' because he has 'cr' at visit=cy4 and 'pr' at visit='cy5'. By following order(cr>pr>sd>pd>ne) means if there are orres='cr' and orres='pr' then take orres='cr'.
Thank you
output needed;
id param aval visit date
1 bor pr cy6 2016-11-10
2 bor sd cy4 2016-08-02
3 bor cr cy4 2016-08-02
4 bor cr cy4 2016-08-02
5 bor pr cy5 2016-11-10
Like this?
data ONE ;
input ID TEST $ ORRES $ VISIT $ DATE $10.;
datalines;
1 ovrp sd cy5 2016-08-02
1 ovrp pr cy6 2016-11-10
1 ovrp pd cy7 2016-12-10
2 ovrp sd cy4 2016-08-02
2 ovrp pd cy5 2016-11-10
2 ovrp pd cy6 2016-11-10
3 ovrp cr cy4 2016-08-02
3 ovrp pr cy5 2016-11-10
3 ovrp pd cy6 2016-11-10
4 ovrp cr cy4 2016-08-02
4 ovrp pr cy5 2016-11-10
5 ovrp cr cy4 2016-08-02
5 ovrp pr cy5 2016-11-10
5 ovrp sd dvs 2016-11-10
;
proc sort data=ONE out=SORTED(index=(ID_ORRES=(ID ORRES)));
by ID DATE VISIT;
run;
data TWO;
set SORTED;
by ID;
%* init;
if first.ID then ID_DONE=0;
if ID_DONE=0;
%* case 1 ;
if VISIT='cy6' then do;
ID_DONE+1;
output;
return;
end;
%* case 2 ;
NEXTOBS=_N_+1;
set SORTED(keep=ID ORRES rename=(ID=NEXT_ID ORRES=NEXT_ORRES) ) point=NEXTOBS;
if ID=NEXT_ID and NEXT_ORRES='pd' then do;
ID_DONE+1;
output;
return;
end;
%* case 3 ;
if last.ID then do ORRES='cr','pr','sd','pd','ne';
set SORTED key=ID_ORRES ;
if _IORC_ = 0 then do;
output;
leave;
end;
end;
keep ID TEST ORRES VISIT DATE ;
run;
ID | TEST | ORRES | VISIT | DATE |
---|---|---|---|---|
1 | ovrp | pr | cy6 | 2016-11-10 |
2 | ovrp | sd | cy4 | 2016-08-02 |
3 | ovrp | pr | cy5 | 2016-11-10 |
4 | ovrp | cr | cy4 | 2016-08-02 |
5 | ovrp | cr | cy4 | 2016-08-02 |
Assig a new variable ( a sort key) depending on ORRES so that sort wiil output the data in the right, required, order.
After sort output the first or last obs in ID, depending on the code order:
/* order required: order cr>pr>sd>pd>ne */
proc format lib=work;
value $sort
'cr' = '1'
'pr' = '2'
'sd' = '3'
'pd' = '4'
'ne' = '5'
; run;
data temp;
set one;
sort_key = put(orres, $sort1.);
run;
proc sort data=temp; by ID sort_key; run;
data want;
set temp;
by ID sort_key;
if first.sort_key then output; /* assumed CR is the best */
/* if last.sort_key then output; ** assumed NE is the best */
run;
Thank you very much for the help. It will take some time for me to understand your code.
The output for id=3 should be orres='cr'. But the output i am getting is orres='pr'. As per the order((cr>pr>sd>pd>ne) I have to take the best response if present. Thank you
No. You said the order((cr>pr>sd>pd>ne) is only if cy6 is not found. There is cy6 for ID =3.
Sorry for the confusion.
My specs says: I have to look until "first orres='pd' or visit='cy6' " which ever comes first then I have pick from the order (cr>pr>sd>pd>ne). That means, if there above 'cr ' and 'pr' until then aval='cr' because that is the best observed response till then.
If there are no orres='pd' and visit='cy6' then I have to take last obs.
Thank you very much
This matches the new specs, but still not your original output.
data ONE ;
input ID TEST $ ORRES $ VISIT $ DATE $10.;
datalines;
1 ovrp sd cy5 2016-08-02
1 ovrp pr cy6 2016-11-10
1 ovrp pd cy7 2016-12-10
2 ovrp sd cy4 2016-08-02
2 ovrp pd cy5 2016-11-10
2 ovrp pd cy6 2016-11-10
3 ovrp cr cy4 2016-08-02
3 ovrp pr cy5 2016-11-10
3 ovrp pd cy6 2016-11-10
4 ovrp cr cy4 2016-08-02
4 ovrp pr cy5 2016-11-10
5 ovrp cr cy4 2016-08-02
5 ovrp pr cy5 2016-11-10
5 ovrp sd dvs 2016-11-10
;
proc sort data=ONE out=SORTED(index=(ID_ORRES=(ID ORRES)));
by ID DATE VISIT;
run;
data TWO;
set SORTED;
by ID;
%* init;
if first.ID then ID_DONE=0;
if ID_DONE=0;
ORDER=6-whichc(ORRES,'cr','pr','sd','pd','ne');
output;
if VISIT='cy6' then do;
ID_DONE+1;
return;
end;
NEXTOBS=_N_+1;
set SORTED(keep=ID ORRES rename=(ID=NEXT_ID ORRES=NEXT_ORRES) ) point=NEXTOBS;
if ID=NEXT_ID and NEXT_ORRES='pd' then do;
ID_DONE+1;
return;
end;
if last.ID then do;
ORDER=9e9;
output;
end;
keep ID TEST ORRES VISIT DATE ORDER;
run;
proc sql;
create table THREE as
select ID, TEST, ORRES, VISIT, DATE
from TWO
group by ID
having ORDER=max(ORDER);
quit;
ID | TEST | ORRES | VISIT | DATE |
---|---|---|---|---|
1 | ovrp | pr | cy6 | 2016-11-10 |
2 | ovrp | sd | cy4 | 2016-08-02 |
3 | ovrp | cr | cy4 | 2016-08-02 |
4 | ovrp | pr | cy5 | 2016-11-10 |
5 | ovrp | sd | dvs | 2016-11-10 |
Thank you very much for the great help. I just modified a little to get the output i need.
In the code :
I just changed order=9e9 to 6-whichc(ORRES,'cr','pr','sd','pd','ne');
if last.ID then do;
ORDER=6-whichc(ORRES,'cr','pr','sd','pd','ne');
output;
end;
Sorry, I should have read my specs clearly. It says "If there are no orres='pd' and visit='cy6'" then I have to look until last obs and take best response till then(cr>pr>sd>pd>ne).
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.