I've been struggling with a problem at work and was hoping that one of you might offer some advice.
This is the data set I have (variable names in the top line, triple underscores indicate missing observations) :
ID_ AGE Ja1 Ja2 Ja3
001 003 207 ___ 506
003 023 445 211 241
021 101 ___ 431 246
145 042 566 343 121
I need to use SAS to change the first data set into a data set that looks like this (ie. all of the observations have been shifted into new set of variables so that all of the missing observations are on the right-hand and all of the values are on the left):
ID_ AGE Ja1 Ja2 Ja3
001 003 207 506 ___
003 023 445 211 241
021 101 431 246 ___
145 042 566 343 121
(edit: corrected the second data set.)
One extra note: - It may not affect the solution, but there are about 200 rows in the original data set and the Ja* variables currently extend from Ja1-Ja7, but are increasing over time.
Any thoughts or recommendations?
You mention that the number of JAn variables will increase over time. Perhaps a method that uses a simple SAS variable list and what some call "infile magic". The other methods suggested could accomplish a similar generality using a SAS Variable List and the DIM function it still may be of some interest to see a little magic.
The values of the JA variables are written to the infile buffer with OPTIONS MISSING=' ' then the JA variables are read using INFILE option MISSOVER magically moving the missing value to the right.
data ja;
infile cards firstobs=2;
input id:$3. age ja1-ja3;
cards;
ID_ AGE Ja1 Ja2 Ja3
001 003 207 . 506
003 023 445 211 241
021 101 . 431 246
145 042 566 343 121
;;;;
run;
proc print;
run;
filename FT15F001 temp;
options missing=' ';
data magic;
infile FT15F001 missover;
input @;
do while(not eof);
set ja end=eof;
_infile_ = catx(' ',of ja:);
input @1 (ja:)(:) @;
output;
end;
stop;
parmcards;
Necessary evil
;;;;
run;
proc print;
run;
Both of your examples are exactly the same? Please also post test data in the form of a datastep so we don't have to spend time typing it all in. Should be simply looping over an array, but provide test data and updated output first.
data have;
input
ID_ : $3.
AGE : 8.
Ja1 - Ja3 : 8.;
cards;
001 003 207 . 506
003 023 445 211 241
021 101 . . 246
145 042 566 343 .
;
run;
%let a_len = 3;
data want(drop=z i l y);
set have;
array ja {&a_len.} ja1-ja3;
do z = 1 to &a_len.;
i=sum(i,1);
l=sum(l,1);
if ja{i} = . then do y = 1 to &a_len. while (l < &a_len.);
l+1;
if ja{l} = . then l+1;
else leave;
end;
if i <= &a_len. and l <= &a_len. then ja{i} = ja{l};
if l > &a_len. then ja{i} = .;
end;
output;
call missing (of _all_);
run;
Your examples are exactly the same, but I think this is what you are trying to describe and achieve.
Ah, in which case, you can simplfy the problem by looping backwards across the dataset.
data want (drop=i); set have; array ja{3}; do i=3 to 2 by -1; if ja{i} ne . and ja{i-1}=. then do; ja{i-1}=ja{i}; ja{i}=.; end; end; run;
Or slightly fewer keypresses:
data want (drop=i tmp);
set have;
array ja{3};
tmp=compress(tranwrd(catx(',',of ja{*}),".,",""));
do i=1 to 3;
ja{i}=input(scan(tmp,i,","),best.);
end;
run;
You mention that the number of JAn variables will increase over time. Perhaps a method that uses a simple SAS variable list and what some call "infile magic". The other methods suggested could accomplish a similar generality using a SAS Variable List and the DIM function it still may be of some interest to see a little magic.
The values of the JA variables are written to the infile buffer with OPTIONS MISSING=' ' then the JA variables are read using INFILE option MISSOVER magically moving the missing value to the right.
data ja;
infile cards firstobs=2;
input id:$3. age ja1-ja3;
cards;
ID_ AGE Ja1 Ja2 Ja3
001 003 207 . 506
003 023 445 211 241
021 101 . 431 246
145 042 566 343 121
;;;;
run;
proc print;
run;
filename FT15F001 temp;
options missing=' ';
data magic;
infile FT15F001 missover;
input @;
do while(not eof);
set ja end=eof;
_infile_ = catx(' ',of ja:);
input @1 (ja:)(:) @;
output;
end;
stop;
parmcards;
Necessary evil
;;;;
run;
proc print;
run;
John King(data _null_),
Good Code, which remind me the code written by TOM before .
1)What if there are lots of variables ,and your
catx(' ',of ja:);
can't hold them all together ?
2)And what if the mssing value is like .A - .Z ?
Regards
@Ksharp you know the answer to both of those questions. There would have to be an very large number of variables break it and I doubt that anyone else in this tread even knows special missing values are.
data ja;
infile cards firstobs=2;
input id:$3. age ja1-ja3;
cards;
ID_ AGE Ja1 Ja2 Ja3
001 003 207 . 506
003 023 445 211 241
021 101 . 431 246
145 042 566 343 121
;;;;
run;
data want;
set ja;
array j{*} ja1-ja3;
array x{3} _temporary_;
n=0;call missing(of x{*});
do i=1 to dim(j);
if not missing(j{i}) then do;
n+1;x{n}=j{i};
end;
end;
do i=1 to dim(j);
j{i}=x{i};
end;
drop i n;
run;
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.