I've been struggling with a problem at work and was hoping that one of you might offer some advice.
This is the data set I have (variable names in the top line, triple underscores indicate missing observations) :
ID_ AGE Ja1 Ja2 Ja3
001 003 207 ___ 506
003 023 445 211 241
021 101 ___ 431 246
145 042 566 343 121
I need to use SAS to change the first data set into a data set that looks like this (ie. all of the observations have been shifted into new set of variables so that all of the missing observations are on the right-hand and all of the values are on the left):
ID_ AGE Ja1 Ja2 Ja3
001 003 207 506 ___
003 023 445 211 241
021 101 431 246 ___
145 042 566 343 121
(edit: corrected the second data set.)
One extra note: - It may not affect the solution, but there are about 200 rows in the original data set and the Ja* variables currently extend from Ja1-Ja7, but are increasing over time.
Any thoughts or recommendations?
You mention that the number of JAn variables will increase over time. Perhaps a method that uses a simple SAS variable list and what some call "infile magic". The other methods suggested could accomplish a similar generality using a SAS Variable List and the DIM function it still may be of some interest to see a little magic.
The values of the JA variables are written to the infile buffer with OPTIONS MISSING=' ' then the JA variables are read using INFILE option MISSOVER magically moving the missing value to the right.
data ja;
infile cards firstobs=2;
input id:$3. age ja1-ja3;
cards;
ID_ AGE Ja1 Ja2 Ja3
001 003 207 . 506
003 023 445 211 241
021 101 . 431 246
145 042 566 343 121
;;;;
run;
proc print;
run;
filename FT15F001 temp;
options missing=' ';
data magic;
infile FT15F001 missover;
input @;
do while(not eof);
set ja end=eof;
_infile_ = catx(' ',of ja:);
input @1 (ja:)(:) @;
output;
end;
stop;
parmcards;
Necessary evil
;;;;
run;
proc print;
run;
Both of your examples are exactly the same? Please also post test data in the form of a datastep so we don't have to spend time typing it all in. Should be simply looping over an array, but provide test data and updated output first.
data have;
input
ID_ : $3.
AGE : 8.
Ja1 - Ja3 : 8.;
cards;
001 003 207 . 506
003 023 445 211 241
021 101 . . 246
145 042 566 343 .
;
run;
%let a_len = 3;
data want(drop=z i l y);
set have;
array ja {&a_len.} ja1-ja3;
do z = 1 to &a_len.;
i=sum(i,1);
l=sum(l,1);
if ja{i} = . then do y = 1 to &a_len. while (l < &a_len.);
l+1;
if ja{l} = . then l+1;
else leave;
end;
if i <= &a_len. and l <= &a_len. then ja{i} = ja{l};
if l > &a_len. then ja{i} = .;
end;
output;
call missing (of _all_);
run;
Your examples are exactly the same, but I think this is what you are trying to describe and achieve.
Ah, in which case, you can simplfy the problem by looping backwards across the dataset.
data want (drop=i); set have; array ja{3}; do i=3 to 2 by -1; if ja{i} ne . and ja{i-1}=. then do; ja{i-1}=ja{i}; ja{i}=.; end; end; run;
Or slightly fewer keypresses:
data want (drop=i tmp);
set have;
array ja{3};
tmp=compress(tranwrd(catx(',',of ja{*}),".,",""));
do i=1 to 3;
ja{i}=input(scan(tmp,i,","),best.);
end;
run;
You mention that the number of JAn variables will increase over time. Perhaps a method that uses a simple SAS variable list and what some call "infile magic". The other methods suggested could accomplish a similar generality using a SAS Variable List and the DIM function it still may be of some interest to see a little magic.
The values of the JA variables are written to the infile buffer with OPTIONS MISSING=' ' then the JA variables are read using INFILE option MISSOVER magically moving the missing value to the right.
data ja;
infile cards firstobs=2;
input id:$3. age ja1-ja3;
cards;
ID_ AGE Ja1 Ja2 Ja3
001 003 207 . 506
003 023 445 211 241
021 101 . 431 246
145 042 566 343 121
;;;;
run;
proc print;
run;
filename FT15F001 temp;
options missing=' ';
data magic;
infile FT15F001 missover;
input @;
do while(not eof);
set ja end=eof;
_infile_ = catx(' ',of ja:);
input @1 (ja:)(:) @;
output;
end;
stop;
parmcards;
Necessary evil
;;;;
run;
proc print;
run;
John King(data _null_),
Good Code, which remind me the code written by TOM before .
1)What if there are lots of variables ,and your
catx(' ',of ja:);
can't hold them all together ?
2)And what if the mssing value is like .A - .Z ?
Regards
@Ksharp you know the answer to both of those questions. There would have to be an very large number of variables break it and I doubt that anyone else in this tread even knows special missing values are.
data ja;
infile cards firstobs=2;
input id:$3. age ja1-ja3;
cards;
ID_ AGE Ja1 Ja2 Ja3
001 003 207 . 506
003 023 445 211 241
021 101 . 431 246
145 042 566 343 121
;;;;
run;
data want;
set ja;
array j{*} ja1-ja3;
array x{3} _temporary_;
n=0;call missing(of x{*});
do i=1 to dim(j);
if not missing(j{i}) then do;
n+1;x{n}=j{i};
end;
end;
do i=1 to dim(j);
j{i}=x{i};
end;
drop i n;
run;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.