rsva
Fluorite | Level 6
I am trying to manipulate the output dataset from a survival analysis (run with Proc LifeTest). Below is the layout of my output. I have 13 age groups, survival months (0 to n), and probability values. The survival months column has two problems: (1) months are missing, e.g. age <13 has 0-5, then 7-9, and so on; (2) n is not the same for all age groups.

AGE SURMOS SURVIVAL
G1 0 0.1
G1 1 0.2
G1 3 0.6
G1 6 0.12
G2 0 0.01
G2 1 0.6
G2 5 0.12
G2 6 0.33
G2 10 0.88

I am trying to create a dataset that (1) has every month from 0 to n (whatever the maximum is for that particular age group) with no gaps, for all age groups, and (2) uses the previous row's probability value for each month that was missing. Below is the desired output I am trying to achieve.

AGE SURMOS SURVIVAL
G1 0 0.1
G1 1 0.2
G1 2 0.2
G1 3 0.6
G1 4 0.6
G1 5 0.6
G1 6 0.12
G2 0 0.01
G2 1 0.6
G2 2 0.6
G2 3 0.6
G2 4 0.6
G2 5 0.12
G2 6 0.33
G2 7 0.33
G2 8 0.33
G2 9 0.33
G2 10 0.88
Daryl
SAS Employee
You could try something like this.

[pre]
data test;
input age $ surmos survival;
datalines;
G1 0 0.1
G1 1 0.2
G1 3 0.6
G1 6 0.12
G2 0 0.01
G2 1 0.6
G2 5 0.12
G2 6 0.33
G2 10 0.88
;
run;

proc sort data=test;
by age surmos;
run;

/* Fill the gaps in surmos within each age group, carrying the last survival forward */
data test2 (keep = age surmos survival);
  set test;
  by age;
  retain last_surmos last_survival;
  if first.age then do;
    output;
    last_surmos = surmos;
    last_survival = survival;
  end;
  else do;
    if ((surmos - last_surmos) LE 1) then do;
      /* no gap: write the row as is */
      output;
      last_surmos = surmos;
      last_survival = survival;
    end;
    else do;
      /* gap: save the current row, write the missing months, then the current row */
      current_surmos = surmos;
      current_survival = survival;
      do i = (last_surmos + 1) to (surmos - 1);
        last_surmos = last_surmos + 1;
        surmos = last_surmos;
        survival = last_survival;
        output;
      end;
      surmos = current_surmos;
      survival = current_survival;
      output;
      last_surmos = surmos;
      last_survival = survival;
    end;
  end;
run;

[/pre]
rsva
Fluorite | Level 6
Thanks Daryl. Will try it.
Patrick
Opal | Level 21
Hi

Same result, just a slightly different coding approach in the data step:

data have;
input age $ surmos survival;
datalines;
G1 0 0.1
G1 1 0.2
G1 3 0.6
G1 6 0.12
G2 0 0.01
G2 1 0.6
G2 5 0.12
G2 6 0.33
G2 10 0.88
;
run;

proc sort data=have;
by age surmos;
run;

data want(drop=HaveSurmos LagSurvival HaveSurvival);
  set have(rename=(surmos=HaveSurmos survival=HaveSurvival));
  by age;

  /* survival value of the previous input row */
  LagSurvival= lag(HaveSurvival);

  /* restart the month counter for each age group */
  if first.age then Surmos=HaveSurmos;

  /* write one row per month up to the current input month */
  do while(Surmos le HaveSurmos);
    if Surmos = HaveSurmos then survival=HaveSurvival;
    else survival=LagSurvival;
    output;
    Surmos+1;
  end;

run;

proc print data=want;
run;

HTH
Patrick

Ksharp
Super User
Yes, I get the same result. But Patrick's code is cleverer in how it renames the original variables.
[pre]
data test;
input age $ surmos survival;
datalines;
G1 0 0.1
G1 1 0.2
G1 3 0.6
G1 6 0.12
G2 0 0.01
G2 1 0.6
G2 5 0.12
G2 6 0.33
G2 10 0.88
;
run;
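data temp;
  set test;
  retain _surmos _survival;
  /* new age group: restart the month counter */
  if age ne lag(age) then _surmos = surmos;
  _surmos+1;
  /* write the skipped months with the retained _survival */
  if _surmos lt surmos then do;
    do while(_surmos lt surmos);
      output;
      _surmos+1;
    end;
  end;
  /* the results are stored in _surmos and _survival */
  _survival=survival; _surmos=surmos;
  drop survival surmos;
  output;
run;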
proc print noobs;
run;
[/pre]



Ksharp
rsva
Fluorite | Level 6
Thanks to all who responded. Below is another alternative solution.

data test;
input age $ surmos survival;
datalines;
G1 0 0.1
G1 1 0.2
G1 3 0.6
G1 6 0.12
G2 0 0.01
G2 1 0.6
G2 5 0.12
G2 6 0.33
G2 10 0.88
;
run;

proc sort data= TEST; by AGE ; run;

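/*gives the max. survival months for each level of the group variable*/
/*note: outs is not created in this post; presumably it is the Proc LifeTest output dataset, which has _censor_*/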
proc means data=outs noprint max ;
by AGE ;
var surmos ;
where _censor_=0;
output out= maxmos(drop=_type_ _freq_) max = maxmos ;
run ;

/*adds missing months for each level in the group variable*/
data temp1;
set maxmos;
if maxmos=0.5 then do;
surmos=0; output;
surmos=maxmos; output;
end;
else do;
surmos=0; output;
surmos=0.5; output;
do surmos= 1 to maxmos; output;
end;
end;
drop maxmos;
run;

proc sort data=temp1;
by AGE surmos;
run;

data temp2;
merge temp1 outs;
by AGE surmos;

retain s1 ;
if surmos=0 then do;
s1=1; slcl=1; sucl=1; end;

else if survival ne . then do;
s1=survival;
end;
run;
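
For anyone who wants to run this skeleton-and-merge idea against the small test dataset above, here is a minimal sketch. It assumes whole-month SURMOS values only (no 0.5 step), no _censor_ filter, and no confidence limits; the dataset names maxmos2, skeleton, and filled and the helper variable s1 are made up for illustration and are not taken from the code above.

```sas
/* Max month per age group (test is already sorted by age) */
proc means data=test noprint;
  by age;
  var surmos;
  output out=maxmos2(drop=_type_ _freq_) max=maxmos;
run;

/* One row per month from 0 to the group maximum */
data skeleton;
  set maxmos2;
  do surmos = 0 to maxmos;
    output;
  end;
  drop maxmos;
run;

/* Merge with the observed rows and carry the last observed survival forward */
data filled;
  merge skeleton test;
  by age surmos;
  retain s1;
  if first.age then s1 = .;            /* do not carry values across age groups */
  if survival ne . then s1 = survival; /* observed month: remember its value */
  survival = s1;                       /* missing month: reuse the last value */
  drop s1;
run;

proc print data=filled noobs;
run;
```

The skeleton step creates every month from 0 to the group maximum, so months absent from test come out of the merge with a missing survival, and the retained s1 fills them from the previous observed row.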
data_null__
Jade | Level 19
I think all you need is to look ahead to the next SURMOS.

[pre]
data test;
input age $ surmos survival;
datalines;
G1 0 0.1
G1 1 0.2
G1 3 0.6
G1 6 0.12
G2 0 0.01
G2 1 0.6
G2 5 0.12
G2 6 0.33
G2 10 0.88
;;;;
run;
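data new;
  set test end=eof;
  by age;
  /* look ahead: read the next row's surmos as nxs */
  if not eof then set test(firstobs=2 keep=surmos rename=(surmos=nxs));
  drop nxs;
  if not last.age then do;
    /* repeat the current row for every month up to the one before nxs */
    do surmos=surmos to nxs-1;
      output;
    end;
  end;
  else output;
run;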
proc print;
run;
[/pre]
