Help using Base SAS procedures

Using DATA STEP to convert several obs per subject to one obs with several subjects

Accepted Solution Solved
Reply
Contributor
Posts: 68
Accepted Solution

Using DATA STEP to convert several obs per subject to one obs with several subjects

Hi guys, I just don't use proc transpose but using data steps, how to convert the data have -->data want?

Data have:

Obs Subj  Score

1      001  7

2      001  6

3      001  5

4      001  5

5      001  4

6      002  8

7      002  7

8      002  6

9      002  6

10    002  6

Data want:

Obs Subj S1 S2 S3 S4 S5

1      001 7 6 5 5 4

2      002 8 7 6 6 6

3      003 8 7 6 6 5

I wrote the following codes:

proc sort data=one;

by Subj;

run;

data want;

   set have;

   by Subj;

  array S{5};

  do i=1 to 5;

  retain S1-S5;

  if first.Subj then call missing(of S1-S5);

  S{i}=Score;

  if last.Subj then output;

  end;

  run;

It comes out 3*5=15 observations, I guess the problem is raised by do loop, but how can I justify it?

Thank you so much.

Chouchou


Accepted Solutions
Solution
‎10-10-2014 01:01 AM
Super User
Super User
Posts: 7,039

Re: Using DATA STEP to convert several obs per subject to one obs with several subjects

If you just logically step though your program you will see the logic error.

When it reads the first observation from HAVE it will execute the DO loop 5 times and copy the same value into all five variables in the array.

Then when it reads the next observation from HAVE it will do the same thing.

One simple pattern that can help with this is what is called a DOW loop. There are a number of papers you can read on it. Here is a recent one:

http://support.sas.com/resources/papers/proceedings14/1619-2014.pdf

It eliminates the need for the RETAIN and the CALL MISSING().

%let max=5 ;

data want ;

  do i=1 by 1 until (last.subj);

    set have ;

     by subj ;

     array s (&max);

     s(i) = score;

   end;

run;

You might want to add a step to pre-calculate the maximum size to use for your array. 

proc sql noprint ;

   select max(count) into :max separated by ' '

    from (select subj,count(*) as count from have group by subj)

  ;

quit;

View solution in original post


All Replies
Trusted Advisor
Posts: 1,228

Re: Using DATA STEP to convert several obs per subject to one obs with several subjects

proc sort data=have;
by Subj;
run;

data want;
  do _n_=1 to 5;
   set have;
   by Subj;  
   array S{5};
   S{_n_}=Score;
end;
run;

Contributor
Posts: 68

Re: Using DATA STEP to convert several obs per subject to one obs with several subjects

The result is

Subj Score S1 S2 S3 S4 S5

001     4     7   6    5   5    4

just one obs, not twoSmiley Sad

Trusted Advisor
Posts: 1,228

Re: Using DATA STEP to convert several obs per subject to one obs with several subjects

Not sure, but I am getting two obs

Data have;
input Obs Subj  Score;
datalines;
1      001  7
2      001  6
3      001  5
4      001  5
5      001  4
6      002  8
7      002  7
8      002  6
9      002  6
10     002  6
;

proc sort data=have;
by Subj;
run;

data want;
  do _n_=1 to 5;
   set have;
   by Subj;  
   array S{5};
   S{_n_}=Score;
end;
run;

Contributor
Posts: 68

Re: Using DATA STEP to convert several obs per subject to one obs with several subjects

It's pretty weird. I just copied your code and it resolved to one obs.

:smileyconfused:

Super User
Posts: 19,772

Re: Using DATA STEP to convert several obs per subject to one obs with several subjects

Remove the do loop entirely but add another counter so you know what record your on for each id.

Solution
‎10-10-2014 01:01 AM
Super User
Super User
Posts: 7,039

Re: Using DATA STEP to convert several obs per subject to one obs with several subjects

If you just logically step though your program you will see the logic error.

When it reads the first observation from HAVE it will execute the DO loop 5 times and copy the same value into all five variables in the array.

Then when it reads the next observation from HAVE it will do the same thing.

One simple pattern that can help with this is what is called a DOW loop. There are a number of papers you can read on it. Here is a recent one:

http://support.sas.com/resources/papers/proceedings14/1619-2014.pdf

It eliminates the need for the RETAIN and the CALL MISSING().

%let max=5 ;

data want ;

  do i=1 by 1 until (last.subj);

    set have ;

     by subj ;

     array s (&max);

     s(i) = score;

   end;

run;

You might want to add a step to pre-calculate the maximum size to use for your array. 

proc sql noprint ;

   select max(count) into :max separated by ' '

    from (select subj,count(*) as count from have group by subj)

  ;

quit;

Super Contributor
Posts: 308

Re: Using DATA STEP to convert several obs per subject to one obs with several subjects

Hello,

Your code gives 15 observation in the output because you have the code:

if last.Subj then output;

inside the do loop therefore when the condition is true it writes 5 observations to the output , one for every step of do loop.

A number of solutions have been presented here therefore you have a choice Smiley Happy

Contributor
Posts: 68

Re: Using DATA STEP to convert several obs per subject to one obs with several subjects

Hi loko, can you please explain me why it has both 4 and 5 for Subj001, while it only has 6 for Subj002? I would greatly appreciate your help.

Subj    Score    S1    S2    S3    S4    S5    i

001     4    4   5    5    5    5    1

001    4    4    4   5    5    5    2

001    4    4    4    4   5    5    3

001    4    4    4    4    4   5    4

001    4    4    4    4    4    4    5

002    6    6    6    6    6    6    1

002    6    6    6    6    6    6    2

002   6    6    6    6    6    6    3

002   6    6    6    6    6    6    4

002   6    6    6    6    6    6    5

SAS Super FREQ
Posts: 708

Re: Using DATA STEP to convert several obs per subject to one obs with several subjects

Hi

You can use the DATA Step loop to fill the array elements. To make it more dynamic you can find out first how many array elements you need. See sample code below:

data have;
  infile cards;
 
input
    Subj :
8.
    Score :
8.
  ;
  cards;
001 7
001 6
001 5
001 5
001 4
002 8
002 7
002 6
002 6
;

proc sql noprint;
 
select
    max( maxN )
into :maxN trimmed
 
from (
   
select
      subj
      , count(*)
as maxN
   
from
      have
   
group by
      subj
      )
  ;
quit;
%put NOTE: &=maxN;

data want;
  set have;
  by Subj;
  array S{&maxN};
  retain S1 - S&maxN;

 
if first.Subj then do;
   
call missing(of S{*});
    i = 0;
 
end;

  i +
1;
  S{i} = Score;

 
if last.Subj then do;
   
output;
 
end;

 
keep subj S1 - S&maxN;
run;
Super User
Posts: 10,020

Re: Using DATA STEP to convert several obs per subject to one obs with several subjects

data have;
  infile cards; 
  input
    Subj : 8.
    Score : 8.
  ;
  cards; 
001 7
001 6
001 5
001 5
001 4
002 8
002 7
002 6
002 6
; 
run;
data have;
 set have;
 by subj;
 if first.subj then n=0;
 n+1;
 run;
proc sort data=have(keep=n) out=key nodupkey;by n;run;
data _null_;
 set key end=last;
 if _n_ eq 1 then call execute('data want;merge ');
 call execute(cats('have(rename=(Score=Score',n,') where=(n=',n,') )'));
 if last then call execute(';by Subj;drop n;run;');
run;

Xia Keshan

🔒 This topic is solved and locked.

Need further help from the community? Please ask a new question.

Discussion stats
  • 10 replies
  • 364 views
  • 7 likes
  • 7 in conversation