Hi,
I am trying to convert one column of observations into multiple variables. The original data looks something like this:
A | B | C | D | E | F | G |
---|---|---|---|---|---|---|
X | X | X | X | X | 1 | X |
X | X | X | X | X | 2 | X |
X | X | X | X | X | 3 | X |
... | ... | ... | ... | ... | ... | ... |
X | X | X | X | X | 99 | X |
and I am trying to convert it into something like:
A | B | C | D | E | F1 | F2 | F3 | ... | F99 |
---|---|---|---|---|---|---|---|---|---|
X | X | X | X | X | G | G | G | ... | G |
X | X | X | X | X | G | G | G | ... | G |
X | X | X | X | X | G | G | G | ... | G |
I understand that PROC TRANSPOSE can do this kind of transformation, but I have two problems:
1. I get ERROR: the ID value "XXX" occurs twice in the input data set. (If I use the LET option, SAS drops the duplicate data, which I do not want.)
2. In the original data, the values of variable F are integers from 1 to 99, but not all 99 numbers are necessarily present, so after the transpose the variables could be something like F30 to F88 (fewer than 99 F variables).
I hope my description is detailed enough and understandable.
Thanks,
Eric
Does this example help?
data have;
input id sex $ age;
cards;
1 f 1
1 f 2
1 f 3
1 f 4
1 f 5
1 f 6
2 m 1
2 m 2
2 m 3
;
data want(drop=age);
retain id sex age1-age6;
array _var(*) age1-age6;
set have;
by id;
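_var(age)=age; /* age picks which of age1-age6 receives the value */
if last.id then do; /* write one row per id, then reset the array */
output;
call missing(of _var(*));
end;
run;
proc print;run;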
or
data have2;
input id sex $ age weight;
cards;
1 f 1 20
1 f 2 21
1 f 3 22
1 f 4 23
1 f 5 25
1 f 6 26
2 m 1 18
2 m 2 19
2 m 3 40
;
data want2(drop=age weight);
retain id sex age1-age6;
array _var(*) age1-age6;
set have2;
by id;
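_var(age)=weight; /* age picks the slot, weight is the value stored */
if last.id then do; /* write one row per id, then reset the array */
output;
call missing(of _var(*));
end;
run;
proc print;run;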
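Both steps above rely on the input being sorted by id and on age always falling inside the array bounds. Here is a rough sketch of the second example with those two assumptions made explicit. The names have2_sorted and want2_checked are only illustrative, and the bounds check is my addition, not something the original question asked for.

```sas
/* BY-group processing needs the data sorted by the BY variable */
proc sort data=have2 out=have2_sorted;
  by id;
run;

data want2_checked(drop=age weight);
  retain id sex age1-age6;
  array _var(*) age1-age6;
  set have2_sorted;
  by id;
  /* only use age as a subscript when it is inside 1..6 */
  if 1 <= age <= dim(_var) then _var(age) = weight;
  else put "WARNING: age outside array bounds: " id= age=;
  if last.id then do;
    output;
    call missing(of _var(*));
  end;
run;
```

Without the check, an age outside 1 to 6 (or a missing age) stops the step with an array subscript out of range error.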
I'm not sure I understand what you are trying to do. It looks like you want to put up to 99 instances of variable G onto one line. However, that assumes that variables A through E have the same values on all of those records.
It would help if you provided the code that you tried and a bit more of an explanation.
Subject to all the questions that you have seen about your data and objectives, this might be what you are looking for:
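data want;
array fvalues {99} f1-f99; /* always creates F1-F99, even if some F values never occur */
do until (last.e); /* read one whole A/B/C/D/E group per iteration of the step */
set have; /* HAVE must be sorted by a b c d e */
by a b c d e;
fvalues{f} = g; /* F selects which of F1-F99 receives G */
end;
drop f g; /* the implicit output at the end of the step writes one row per group */
run;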
This will not detect how many times the same F appears within a grouping of A/B/C/D/E. If there are duplicates, it will merely replace the old value with the new. If you have duplicates, but want to save both of them, the data structure that you described is not capable of holding all the data.
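If you want to find out whether you have such duplicates before collapsing the data, something along these lines might help. It is only a sketch: it assumes the data set is called have with variables a through g as in the question, and the names dups and n_rows are made up for illustration.

```sas
/* list every A/B/C/D/E/F combination that occurs more than once */
proc sql;
  create table dups as
  select a, b, c, d, e, f, count(*) as n_rows
  from have
  group by a, b, c, d, e, f
  having count(*) > 1;
quit;
```

If dups comes back empty, each F occurs at most once per group and the array step above will not overwrite anything.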
Good luck.