Hi all,
I have data that looks like this:
subid res1979 res1980 res1981 res1982 res1983 res1984 res1985 res1986 res1987 res1988 res1989
1 . . . 1 1 1 1 2 2 2 2
2 1 1 1 1 1 1 1 1 1 1 1
3 . . . . . . . . 1 1 1
4 . 3 3 3 3 3 3 2 2 2 2
5 . . . . . . . . . . .
n . . . . . 2 2 2 3 3 1
(n = 10,000 subjects)
And I would like it to look like this:
subid firstres
1 1
2 1
3 1
4 3
5 .
n 2
And if possible, like this:
subid firstres yr
1 1 1982
2 1 1979
3 1 1987
4 3 1980
5 . .
n 2 1984
I tried creating an array of the res variables but I do not know how to write the do loop statement to select the first instance of a non-missing column variable.
I am using SAS 9.4.
Any help would be much appreciated.
Regards,
data have;
input subid res1979 res1980 res1981 res1982 res1983 res1984 res1985 res1986 res1987 res1988 res1989 ;
cards;
1 . . . 1 1 1 1 2 2 2 2
2 1 1 1 1 1 1 1 1 1 1 1
3 . . . . . . . . 1 1 1
4 . 3 3 3 3 3 3 2 2 2 2
5 . . . . . . . . . . .
6 . . . . . 2 2 2 3 3 1
;
data want;
set have;
array t(*) res:;
firstres=coalesce(of t(*));
temp=whichn(firstres,of t(*));
if not missing(temp) then yr=substr(vname(t(temp)),4);
drop res: temp;
run;
data have;
input subid res1979 res1980 res1981 res1982 res1983 res1984 res1985 res1986 res1987 res1988 res1989 ;
cards;
1 . . . 1 1 1 1 2 2 2 2
2 1 1 1 1 1 1 1 1 1 1 1
3 . . . . . . . . 1 1 1
4 . 3 3 3 3 3 3 2 2 2 2
5 . . . . . . . . . . .
6 . . . . . 2 2 2 3 3 1
;
data want;
set have;
array t(*) res:;
firstres=coalesce(of t(*));
temp=whichn(firstres,of t(*));
if not missing(temp) then yr=substr(vname(t(temp)),4);
drop res: temp;
run;
Hi PROC Star,
Thank you so much for your help.
At first the code you wrote yielded a warning that said:
WARNING: Defining an array with zero elements
So I just slightly adjusted the code (removing res: and replacing it with CRES1979-CRES2014, which are the names of the vars) and it worked perfectly!
This is my corrected code for others to see:
DATA res;
set temp;
array t(*) CRES1979-CRES2014; *names of the variables I wanted to select the first non-missing from;
firstres=coalesce(of t(*));
other=whichn(firstres,of t(*));
if not missing(other) then yr=substr(vname(t(other)),5);
drop CRES1979-CRES2014 other; *I also dropped the 'other' variable as I didn't need it;
RUN;
Thanks so much!
I'm glad. @Amihic all the best and take care. I used stuff to deal with your input data. But you are smart to modify . great!!!!!!!!!
Transpose the data, delete the missing and then add the numbers? This way you don't need to know which is first.
But does your data match your question??
1 1 1981 -> subject 1 does not have a 1 for 1981.
1. PROC TRANSPOSE
2. Remove missing (could do that in a data step or a data set option with PROC TRANSPOSE).
3. Extract year from the variable name using SCAN() either in same data step.
Untested:
proc transpose data=have out=long(where=(res1 ne .)); prefix=RES; * you may need to confirm column names;
by subid;
var res: ; *colon selects all variables starting with RES;
run;
data want;
set long;
year = input(compress(_name_, , 'ka'), year4.);
format year year4.;
run;
@Amihic wrote:
Hi all,
I have data that looks like this:
subid res1979 res1980 res1981 res1982 res1983 res1984 res1985 res1986 res1987 res1988 res1989
1 . . . 1 1 1 1 2 2 2 2
2 1 1 1 1 1 1 1 1 1 1 1
3 . . . . . . . . 1 1 1
4 . 3 3 3 3 3 3 2 2 2 2
5 . . . . . . . . . . .
n . . . . . 2 2 2 3 3 1
(n = 10,000 subjects)
And I would like it to look like this:
subid firstres
1 1
2 1
3 1
4 3
5 .
n 2
And if possible, like this:
subid firstres yr
1 1 1981
2 1 1979
3 1 1985
4 3 1980
5 . .
n 2 1984
I tried creating an array of the res variables but I do not know how to write the do loop statement to select the first instance of a non-missing column variable.
I am using SAS 9.4.
Any help would be much appreciated.
Regards,
Hi Reeza,
Thank you for the help. Though I didn't use your approach, your code was very helpful for me to understand this particular point:
var res: ; *colon selects all variables starting with RES;
The code provided by PROC Star (above) included a res: ; that I didn't understand at first, but because of your explanation, I can now understand it.
Thanks!
@Amihic You could credit the help you received by marking their solution as accepted
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.