BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
chouchou
Calcite | Level 5

Hi guys, I just don't use proc transpose but using data steps, how to convert the data have -->data want?

Data have:

Obs Subj  Score

1      001  7

2      001  6

3      001  5

4      001  5

5      001  4

6      002  8

7      002  7

8      002  6

9      002  6

10    002  6

Data want:

Obs Subj S1 S2 S3 S4 S5

1      001 7 6 5 5 4

2      002 8 7 6 6 6

3      003 8 7 6 6 5

I wrote the following codes:

proc sort data=one;

by Subj;

run;

data want;

   set have;

   by Subj;

  array S{5};

  do i=1 to 5;

  retain S1-S5;

  if first.Subj then call missing(of S1-S5);

  S{i}=Score;

  if last.Subj then output;

  end;

  run;

It comes out 3*5=15 observations, I guess the problem is raised by do loop, but how can I justify it?

Thank you so much.

Chouchou

1 ACCEPTED SOLUTION

Accepted Solutions
Tom
Super User Tom
Super User

If you just logically step though your program you will see the logic error.

When it reads the first observation from HAVE it will execute the DO loop 5 times and copy the same value into all five variables in the array.

Then when it reads the next observation from HAVE it will do the same thing.

One simple pattern that can help with this is what is called a DOW loop. There are a number of papers you can read on it. Here is a recent one:

http://support.sas.com/resources/papers/proceedings14/1619-2014.pdf

It eliminates the need for the RETAIN and the CALL MISSING().

%let max=5 ;

data want ;

  do i=1 by 1 until (last.subj);

    set have ;

     by subj ;

     array s (&max);

     s(i) = score;

   end;

run;

You might want to add a step to pre-calculate the maximum size to use for your array. 

proc sql noprint ;

   select max(count) into :max separated by ' '

    from (select subj,count(*) as count from have group by subj)

  ;

quit;

View solution in original post

10 REPLIES 10
stat_sas
Ammonite | Level 13

proc sort data=have;
by Subj;
run;

data want;
  do _n_=1 to 5;
   set have;
   by Subj;  
   array S{5};
   S{_n_}=Score;
end;
run;

chouchou
Calcite | Level 5

The result is

Subj Score S1 S2 S3 S4 S5

001     4     7   6    5   5    4

just one obs, not twoSmiley Sad

stat_sas
Ammonite | Level 13

Not sure, but I am getting two obs

Data have;
input Obs Subj  Score;
datalines;
1      001  7
2      001  6
3      001  5
4      001  5
5      001  4
6      002  8
7      002  7
8      002  6
9      002  6
10     002  6
;

proc sort data=have;
by Subj;
run;

data want;
  do _n_=1 to 5;
   set have;
   by Subj;  
   array S{5};
   S{_n_}=Score;
end;
run;

chouchou
Calcite | Level 5

It's pretty weird. I just copied your code and it resolved to one obs.

:smileyconfused:

Reeza
Super User

Remove the do loop entirely but add another counter so you know what record your on for each id.

Tom
Super User Tom
Super User

If you just logically step though your program you will see the logic error.

When it reads the first observation from HAVE it will execute the DO loop 5 times and copy the same value into all five variables in the array.

Then when it reads the next observation from HAVE it will do the same thing.

One simple pattern that can help with this is what is called a DOW loop. There are a number of papers you can read on it. Here is a recent one:

http://support.sas.com/resources/papers/proceedings14/1619-2014.pdf

It eliminates the need for the RETAIN and the CALL MISSING().

%let max=5 ;

data want ;

  do i=1 by 1 until (last.subj);

    set have ;

     by subj ;

     array s (&max);

     s(i) = score;

   end;

run;

You might want to add a step to pre-calculate the maximum size to use for your array. 

proc sql noprint ;

   select max(count) into :max separated by ' '

    from (select subj,count(*) as count from have group by subj)

  ;

quit;

Loko
Barite | Level 11

Hello,

Your code gives 15 observation in the output because you have the code:

if last.Subj then output;

inside the do loop therefore when the condition is true it writes 5 observations to the output , one for every step of do loop.

A number of solutions have been presented here therefore you have a choice Smiley Happy

chouchou
Calcite | Level 5

Hi loko, can you please explain me why it has both 4 and 5 for Subj001, while it only has 6 for Subj002? I would greatly appreciate your help.

Subj    Score    S1    S2    S3    S4    S5    i

001     4    4   5    5    5    5    1

001    4    4    4   5    5    5    2

001    4    4    4    4   5    5    3

001    4    4    4    4    4   5    4

001    4    4    4    4    4    4    5

002    6    6    6    6    6    6    1

002    6    6    6    6    6    6    2

002   6    6    6    6    6    6    3

002   6    6    6    6    6    6    4

002   6    6    6    6    6    6    5

BrunoMueller
SAS Super FREQ

Hi

You can use the DATA Step loop to fill the array elements. To make it more dynamic you can find out first how many array elements you need. See sample code below:

data have;
  infile cards;
 
input
    Subj :
8.
    Score :
8.
  ;
  cards;
001 7
001 6
001 5
001 5
001 4
002 8
002 7
002 6
002 6
;

proc sql noprint;
 
select
    max( maxN )
into :maxN trimmed
 
from (
   
select
      subj
      , count(*)
as maxN
   
from
      have
   
group by
      subj
      )
  ;
quit;
%put NOTE: &=maxN;

data want;
  set have;
  by Subj;
  array S{&maxN};
  retain S1 - S&maxN;

 
if first.Subj then do;
   
call missing(of S{*});
    i = 0;
 
end;

  i +
1;
  S{i} = Score;

 
if last.Subj then do;
   
output;
 
end;

 
keep subj S1 - S&maxN;
run;
Ksharp
Super User
data have;
  infile cards; 
  input
    Subj : 8.
    Score : 8.
  ;
  cards; 
001 7
001 6
001 5
001 5
001 4
002 8
002 7
002 6
002 6
; 
run;
data have;
 set have;
 by subj;
 if first.subj then n=0;
 n+1;
 run;
proc sort data=have(keep=n) out=key nodupkey;by n;run;
data _null_;
 set key end=last;
 if _n_ eq 1 then call execute('data want;merge ');
 call execute(cats('have(rename=(Score=Score',n,') where=(n=',n,') )'));
 if last then call execute(';by Subj;drop n;run;');
run;

Xia Keshan

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 10 replies
  • 1455 views
  • 7 likes
  • 7 in conversation