Hi guys, I just don't use proc transpose but using data steps, how to convert the data have -->data want?
Data have:
Obs Subj Score
1 001 7
2 001 6
3 001 5
4 001 5
5 001 4
6 002 8
7 002 7
8 002 6
9 002 6
10 002 6
Data want:
Obs Subj S1 S2 S3 S4 S5
1 001 7 6 5 5 4
2 002 8 7 6 6 6
3 003 8 7 6 6 5
I wrote the following codes:
proc sort data=one;
by Subj;
run;
data want;
set have;
by Subj;
array S{5};
do i=1 to 5;
retain S1-S5;
if first.Subj then call missing(of S1-S5);
S{i}=Score;
if last.Subj then output;
end;
run;
It comes out 3*5=15 observations, I guess the problem is raised by do loop, but how can I justify it?
Thank you so much.
Chouchou
If you just logically step though your program you will see the logic error.
When it reads the first observation from HAVE it will execute the DO loop 5 times and copy the same value into all five variables in the array.
Then when it reads the next observation from HAVE it will do the same thing.
One simple pattern that can help with this is what is called a DOW loop. There are a number of papers you can read on it. Here is a recent one:
http://support.sas.com/resources/papers/proceedings14/1619-2014.pdf
It eliminates the need for the RETAIN and the CALL MISSING().
%let max=5 ;
data want ;
do i=1 by 1 until (last.subj);
set have ;
by subj ;
array s (&max);
s(i) = score;
end;
run;
You might want to add a step to pre-calculate the maximum size to use for your array.
proc sql noprint ;
select max(count) into :max separated by ' '
from (select subj,count(*) as count from have group by subj)
;
quit;
proc sort data=have;
by Subj;
run;
data want;
do _n_=1 to 5;
set have;
by Subj;
array S{5};
S{_n_}=Score;
end;
run;
The result is
Subj Score S1 S2 S3 S4 S5
001 4 7 6 5 5 4
just one obs, not two
Not sure, but I am getting two obs
Data have;
input Obs Subj Score;
datalines;
1 001 7
2 001 6
3 001 5
4 001 5
5 001 4
6 002 8
7 002 7
8 002 6
9 002 6
10 002 6
;
proc sort data=have;
by Subj;
run;
data want;
do _n_=1 to 5;
set have;
by Subj;
array S{5};
S{_n_}=Score;
end;
run;
It's pretty weird. I just copied your code and it resolved to one obs.
:smileyconfused:
Remove the do loop entirely but add another counter so you know what record your on for each id.
If you just logically step though your program you will see the logic error.
When it reads the first observation from HAVE it will execute the DO loop 5 times and copy the same value into all five variables in the array.
Then when it reads the next observation from HAVE it will do the same thing.
One simple pattern that can help with this is what is called a DOW loop. There are a number of papers you can read on it. Here is a recent one:
http://support.sas.com/resources/papers/proceedings14/1619-2014.pdf
It eliminates the need for the RETAIN and the CALL MISSING().
%let max=5 ;
data want ;
do i=1 by 1 until (last.subj);
set have ;
by subj ;
array s (&max);
s(i) = score;
end;
run;
You might want to add a step to pre-calculate the maximum size to use for your array.
proc sql noprint ;
select max(count) into :max separated by ' '
from (select subj,count(*) as count from have group by subj)
;
quit;
Hello,
Your code gives 15 observation in the output because you have the code:
if last.Subj then output;
inside the do loop therefore when the condition is true it writes 5 observations to the output , one for every step of do loop.
A number of solutions have been presented here therefore you have a choice
Hi loko, can you please explain me why it has both 4 and 5 for Subj001, while it only has 6 for Subj002? I would greatly appreciate your help.
Subj Score S1 S2 S3 S4 S5 i
001 4 4 5 5 5 5 1
001 4 4 4 5 5 5 2
001 4 4 4 4 5 5 3
001 4 4 4 4 4 5 4
001 4 4 4 4 4 4 5
002 6 6 6 6 6 6 1
002 6 6 6 6 6 6 2
002 6 6 6 6 6 6 3
002 6 6 6 6 6 6 4
002 6 6 6 6 6 6 5
Hi
You can use the DATA Step loop to fill the array elements. To make it more dynamic you can find out first how many array elements you need. See sample code below:
data have; infile cards; input Subj : 8. Score : 8. ; cards; 001 7 001 6 001 5 001 5 001 4 002 8 002 7 002 6 002 6 ; run; data have; set have; by subj; if first.subj then n=0; n+1; run; proc sort data=have(keep=n) out=key nodupkey;by n;run; data _null_; set key end=last; if _n_ eq 1 then call execute('data want;merge '); call execute(cats('have(rename=(Score=Score',n,') where=(n=',n,') )')); if last then call execute(';by Subj;drop n;run;'); run;
Xia Keshan
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.