I have a dataset that looks like this:
data have;
input block type $ count;
datalines;
2 A 10
6 B 15
6 C 5
8 D 25
8 E 32
9 E 12
;
run;
and i want to transform it like this:
data want;
infile datalines missover;
input block type $ count type2 $ count2;
datalines;
2 A 10
6 B 15 C 5
8 D 25 E 32
9 E 12
;
run;
In actuality my data has about 6M rows and can have up to 7 'types' for a given block.
Hi:
Another alternative, if you're going to have a DATA step and a PROC TRANSPOSE is to skip making 2 passes through the data (especially, if it's large data) and just have 1 DATA step using ARRAYs to transpose the data. If there is a possibility of a max of 7 for a block, that makes it easy to define the size of the ARRAY and this approach only makes 1 pass through the data.
This does assume, however, that the data are sorted by BLOCK.
Cynthia
options missing=.;
data want(keep=block bcnt typ1 cnt1 typ2 cnt2 typ3 cnt3 typ4 cnt4
typ5 cnt5 typ6 cnt6 typ7 cnt7);
set have;
by block;
array typ (7) $ typ1-typ7;
array cnt(7) cnt1-cnt7;
retain bcnt typ1-typ7 cnt1-cnt7;
if first.block then do;
call missing(of typ1-typ7);
call missing(of cnt1-cnt7);
bcnt=0;
end;
bcnt+1;
typ(bcnt) = type;
cnt(bcnt) = count;
if last.block then output;
run;
title; footnote;
proc print data=want;
var block bcnt typ1 cnt1 typ2 cnt2 typ3 cnt3 typ4 cnt4
typ5 cnt5 typ6 cnt6 typ7 cnt7;
run;
data have;
input block type $ count;
datalines;
2 A 10
6 B 15
6 C 5
8 D 25
8 E 32
9 E 12
;
run;
data _h;
do n=1 by 1 until(last.block);
set have;
by block;
vn=cats(vname(type),n);
v=type;
output;
vn=cats(vname(count),n);
v=put(count,8. -l);
output;
end;
keep block vn v;
run;
proc transpose data=_h out=want(drop=_name_);
by block;
id vn;
var v ;
run;
Hi:
Another alternative, if you're going to have a DATA step and a PROC TRANSPOSE is to skip making 2 passes through the data (especially, if it's large data) and just have 1 DATA step using ARRAYs to transpose the data. If there is a possibility of a max of 7 for a block, that makes it easy to define the size of the ARRAY and this approach only makes 1 pass through the data.
This does assume, however, that the data are sorted by BLOCK.
Cynthia
options missing=.;
data want(keep=block bcnt typ1 cnt1 typ2 cnt2 typ3 cnt3 typ4 cnt4
typ5 cnt5 typ6 cnt6 typ7 cnt7);
set have;
by block;
array typ (7) $ typ1-typ7;
array cnt(7) cnt1-cnt7;
retain bcnt typ1-typ7 cnt1-cnt7;
if first.block then do;
call missing(of typ1-typ7);
call missing(of cnt1-cnt7);
bcnt=0;
end;
bcnt+1;
typ(bcnt) = type;
cnt(bcnt) = count;
if last.block then output;
run;
title; footnote;
proc print data=want;
var block bcnt typ1 cnt1 typ2 cnt2 typ3 cnt3 typ4 cnt4
typ5 cnt5 typ6 cnt6 typ7 cnt7;
run;
Try the %TRANSPOSE macro, written by members of this community
http://support.sas.com/resources/papers/proceedings13/538-2013.pdf
Try this:
proc sort data=have;
by block type;
run;
proc transpose data=have out=count(drop=_name_) prefix=count;
by block;
var count;
run;
proc transpose data=have out=type(drop=_name_) prefix=type;
by block;
var type;
run;
data want;
merge type count;
by block;
run;
Do you really want that structure? Your example makes the TYPE values look like names. Perhaps it would be better to treat them as such by using TYPE in an ID statement?
proc transpose data=have out=want2 ;
by block;
id type;
var count;
run;
Obs block _NAME_ A B C D E 1 2 count 10 . . . . 2 6 count . 15 5 . . 3 8 count . . . 25 32 4 9 count . . . . 12
Thank you everyone for your solutions! I truly appreciate the helpful responses I always get from this community 🙂
This one is good example for using PROC SUMMARY. Does not require sorting and will transpose both numeric and character at the same time without conversion. You do have to know the max number of obs per CLASS group width maximum 100.
data have;
input block type $ count;
datalines;
2 A 10
8 D 25
8 E 32
9 E 12
6 B 15
6 C 5
;
run;
proc print;
run;
proc summary nway;
class block;
output out=wide(drop=_type_) idgroup(out[7](type count)=);
run;
proc print;
run;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.