Solved
Contributor
Posts: 68

# Using DATA STEP to convert several obs per subject to one obs with several subjects

Hi guys, I just don't use proc transpose but using data steps, how to convert the data have -->data want?

Data have:

Obs Subj  Score

1      001  7

2      001  6

3      001  5

4      001  5

5      001  4

6      002  8

7      002  7

8      002  6

9      002  6

10    002  6

Data want:

Obs Subj S1 S2 S3 S4 S5

1      001 7 6 5 5 4

2      002 8 7 6 6 6

3      003 8 7 6 6 5

I wrote the following codes:

proc sort data=one;

by Subj;

run;

data want;

set have;

by Subj;

array S{5};

do i=1 to 5;

retain S1-S5;

if first.Subj then call missing(of S1-S5);

S{i}=Score;

if last.Subj then output;

end;

run;

It comes out 3*5=15 observations, I guess the problem is raised by do loop, but how can I justify it?

Thank you so much.

Chouchou

Accepted Solutions
Solution
‎10-10-2014 01:01 AM
Super User
Posts: 8,120

## Re: Using DATA STEP to convert several obs per subject to one obs with several subjects

If you just logically step though your program you will see the logic error.

When it reads the first observation from HAVE it will execute the DO loop 5 times and copy the same value into all five variables in the array.

Then when it reads the next observation from HAVE it will do the same thing.

One simple pattern that can help with this is what is called a DOW loop. There are a number of papers you can read on it. Here is a recent one:

http://support.sas.com/resources/papers/proceedings14/1619-2014.pdf

It eliminates the need for the RETAIN and the CALL MISSING().

%let max=5 ;

data want ;

do i=1 by 1 until (last.subj);

set have ;

by subj ;

array s (&max);

s(i) = score;

end;

run;

You might want to add a step to pre-calculate the maximum size to use for your array.

proc sql noprint ;

select max(count) into :max separated by ' '

from (select subj,count(*) as count from have group by subj)

;

quit;

All Replies
Posts: 1,270

## Re: Using DATA STEP to convert several obs per subject to one obs with several subjects

proc sort data=have;
by Subj;
run;

data want;
do _n_=1 to 5;
set have;
by Subj;
array S{5};
S{_n_}=Score;
end;
run;

Contributor
Posts: 68

## Re: Using DATA STEP to convert several obs per subject to one obs with several subjects

The result is

Subj Score S1 S2 S3 S4 S5

001     4     7   6    5   5    4

just one obs, not two

Posts: 1,270

## Re: Using DATA STEP to convert several obs per subject to one obs with several subjects

Not sure, but I am getting two obs

Data have;
input Obs Subj  Score;
datalines;
1      001  7
2      001  6
3      001  5
4      001  5
5      001  4
6      002  8
7      002  7
8      002  6
9      002  6
10     002  6
;

proc sort data=have;
by Subj;
run;

data want;
do _n_=1 to 5;
set have;
by Subj;
array S{5};
S{_n_}=Score;
end;
run;

Contributor
Posts: 68

## Re: Using DATA STEP to convert several obs per subject to one obs with several subjects

It's pretty weird. I just copied your code and it resolved to one obs.

:smileyconfused:

Super User
Posts: 23,771

## Re: Using DATA STEP to convert several obs per subject to one obs with several subjects

Remove the do loop entirely but add another counter so you know what record your on for each id.

Solution
‎10-10-2014 01:01 AM
Super User
Posts: 8,120

## Re: Using DATA STEP to convert several obs per subject to one obs with several subjects

If you just logically step though your program you will see the logic error.

When it reads the first observation from HAVE it will execute the DO loop 5 times and copy the same value into all five variables in the array.

Then when it reads the next observation from HAVE it will do the same thing.

One simple pattern that can help with this is what is called a DOW loop. There are a number of papers you can read on it. Here is a recent one:

http://support.sas.com/resources/papers/proceedings14/1619-2014.pdf

It eliminates the need for the RETAIN and the CALL MISSING().

%let max=5 ;

data want ;

do i=1 by 1 until (last.subj);

set have ;

by subj ;

array s (&max);

s(i) = score;

end;

run;

You might want to add a step to pre-calculate the maximum size to use for your array.

proc sql noprint ;

select max(count) into :max separated by ' '

from (select subj,count(*) as count from have group by subj)

;

quit;

Super Contributor
Posts: 319

## Re: Using DATA STEP to convert several obs per subject to one obs with several subjects

Hello,

Your code gives 15 observation in the output because you have the code:

if last.Subj then output;

inside the do loop therefore when the condition is true it writes 5 observations to the output , one for every step of do loop.

A number of solutions have been presented here therefore you have a choice

Contributor
Posts: 68

## Re: Using DATA STEP to convert several obs per subject to one obs with several subjects

Hi loko, can you please explain me why it has both 4 and 5 for Subj001, while it only has 6 for Subj002? I would greatly appreciate your help.

Subj    Score    S1    S2    S3    S4    S5    i

001     4    4   5    5    5    5    1

001    4    4    4   5    5    5    2

001    4    4    4    4   5    5    3

001    4    4    4    4    4   5    4

001    4    4    4    4    4    4    5

002    6    6    6    6    6    6    1

002    6    6    6    6    6    6    2

002   6    6    6    6    6    6    3

002   6    6    6    6    6    6    4

002   6    6    6    6    6    6    5

SAS Super FREQ
Posts: 825

## Re: Using DATA STEP to convert several obs per subject to one obs with several subjects

Hi

You can use the DATA Step loop to fill the array elements. To make it more dynamic you can find out first how many array elements you need. See sample code below:

data have;
infile cards;

input
Subj :
8.
Score :
8.
;
cards;
001 7
001 6
001 5
001 5
001 4
002 8
002 7
002 6
002 6
;

proc sql noprint;

select
max( maxN )
into :maxN trimmed

from (

select
subj
, count(*)
as maxN

from
have

group by
subj
)
;
quit;
%put NOTE: &=maxN;

data want;
set have;
by Subj;
array S{&maxN};
retain S1 - S&maxN;

if first.Subj then do;

call missing(of S{*});
i = 0;

end;

i +
1;
S{i} = Score;

if last.Subj then do;

output;

end;

keep subj S1 - S&maxN;
run;
Super User
Posts: 10,787

## Re: Using DATA STEP to convert several obs per subject to one obs with several subjects

```data have;
infile cards;
input
Subj : 8.
Score : 8.
;
cards;
001 7
001 6
001 5
001 5
001 4
002 8
002 7
002 6
002 6
;
run;
data have;
set have;
by subj;
if first.subj then n=0;
n+1;
run;
proc sort data=have(keep=n) out=key nodupkey;by n;run;
data _null_;
set key end=last;
if _n_ eq 1 then call execute('data want;merge ');
call execute(cats('have(rename=(Score=Score',n,') where=(n=',n,') )'));
if last then call execute(';by Subj;drop n;run;');
run;

```

Xia Keshan

🔒 This topic is solved and locked.

Discussion stats
• 10 replies
• 461 views
• 7 likes
• 7 in conversation