Hello,
I would like to create a variable, "LastStress", that would pull the last non-missing observation from a series of variables that have the same prefix and different suffixes that correlate to visits (my data is in a "wide" format).
Example
Have:
ID Stress1 Stress5 Stress10 Stress15 Stress20
1 10 8 . 3 .
2 3 2 . . .
3 15 . . . 10
Want:
ID Stress1 Stress5 Stress10 Stress15 Stress20 LastStress
1 10 8 . 3 . 3
2 3 2 . . . 2
3 15 . . . 10 10
Any help on where to start with the initial code would be very much appreciated. Thank you!
I would suggest avoiding constructing and traversing a loop altogether. This task is what the COALESCE function was made for:
data have;
input ID Stress1 Stress5 Stress10 Stress15 Stress20;
cards;
1 10 8 . 3 .
2 3 2 . . .
3 15 . . . 10
;
data want;
set have;
laststress=coalesce(stress20,stress15,stress10,stress5,stress1);
run;
Of course, the above assume you know in advance the names of the individual stress variables. If you don't know those names, or just want to generalize your solution, you can take a dip in the metadata via proc sql/dictionary, and make a macrovar (reverse_list below) listing the variables in reverse positional order:
proc sql noprint;
select name into: reverse_list separated by ','
from dictionary.columns where libname='WORK' and memname='HAVE'
and upcase(name) like 'STRESS%'
order by varnum descending;
quit;
%put &=reverse_list;
data want;
set have;
laststress=coalesce(&reverse_list);
run;
/* UNTESTED CODE */
data want;
set have;
array stress stress1 stress5 stress10 stress15 stress20;
do i=1 to dim(stress);
if not missing(stress(i)) then laststress=stress(i);
end;
drop i;
run;
HI @JackHP You just need to reverse the LOOP and exit
data have;
input ID Stress1 Stress5 Stress10 Stress15 Stress20;
cards;
1 10 8 . 3 .
2 3 2 . . .
3 15 . . . 10
;
data want;
set have;
array t stress1--stress20;
do _n_=dim(t) to 1 by -1;
if not missing(t(_n_)) then do;
laststress=t(_n_);
leave;
end;
end;
run;
And a little terser
data want;
set have;
array t stress1--stress20;
do _n_=dim(t) to 1 by -1 until(t(_n_)>.);
laststress=t(_n_);
end;
run;
Hi @JackHP
You can try this code:
/* Retrieve the list of variables with the prefix "Stress" */
proc contents data=have out=list_var (keep = name where=(lowcase(name) like 'stress%')) noprint;
run;
proc sql noprint;
select name into:list_var_stress separated by " " from list_var;
run;
/* Compute Laststress */
data want;
set have;
array stress (*) &list_var_stress;
do i=1 to dim(stress);
if stress(i) ne . then laststress=stress(i);
end;
drop i;
run;
I would suggest avoiding constructing and traversing a loop altogether. This task is what the COALESCE function was made for:
data have;
input ID Stress1 Stress5 Stress10 Stress15 Stress20;
cards;
1 10 8 . 3 .
2 3 2 . . .
3 15 . . . 10
;
data want;
set have;
laststress=coalesce(stress20,stress15,stress10,stress5,stress1);
run;
Of course, the above assume you know in advance the names of the individual stress variables. If you don't know those names, or just want to generalize your solution, you can take a dip in the metadata via proc sql/dictionary, and make a macrovar (reverse_list below) listing the variables in reverse positional order:
proc sql noprint;
select name into: reverse_list separated by ','
from dictionary.columns where libname='WORK' and memname='HAVE'
and upcase(name) like 'STRESS%'
order by varnum descending;
quit;
%put &=reverse_list;
data want;
set have;
laststress=coalesce(&reverse_list);
run;
Thank you for sharing the coalesce function - this worked perfectly.
I used this to construct a macro as I wanted to repeat this with a series of variables (code below for anyone interested). Thank you!
%Macro lastvar(var);
data mvrx19.master;
set mvrx19.master;
last_&Var=coalesce(&var._20,&var._15,&var._10,&var._5,&var._1);
run;
%mend lastvar;
%lastvar(PSSNS);
%lastvar(PSS10NS);
%lastvar(PSS4NS);
%lastvar(dietfruit);
%lastvar(dietvet);
%lastvar(FI);
%Macro Charlastvar(var);
data mvrx19.master;
set mvrx19.master;
last_&Var=coalescec(&var._20,&var._15,&var._10,&var._5,&var._1);
run;%mend Charlastvar;
%Charlastvar(Physhealth);
%Charlastvar(MFA);
If you don't mind listing all the stress variables or a two-step process, I'd defer to @mkeintz's offer. Otherwise, you can try:
data have ;
input id stress1 stress5 stress10 stress15 stress20 ;
cards ;
1 10 8 . 3 .
2 3 2 . . .
3 15 . . . 10
;
run ;
data want ;
set have ;
laststress = input (scan (catx (".", of stress:), -1), f.) ;
run ;
The idea is that if you delimit the values with a period, it becomes indistinguishable from the period representing a missing value after the implicit num-to-char conversion by CATX.
Kind regards
Paul D.
Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!
Need to connect to databases in SAS Viya? SAS’ David Ghan shows you two methods – via SAS/ACCESS LIBNAME and SAS Data Connector SASLIBS – in this video.
Find more tutorials on the SAS Users YouTube channel.