Hi I have dataset:
year | name | one_str_a | one_str_b | one_str_c | one_con_a | one_con_b | one_con_c | two_str_a | two_str_b | two_str_c | two_con_a | two_con_b | two_con_c |
1991 | john | 2 | 1 | 1 | 1 | 1 | 2 | 1 | 1 | 1 | 1 | 2 | 1 |
1991 | amy | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 0 |
1991 | ted | 2 | 0 | 1 | 0 | 1 | 2 | 1 | 0 | 1 | 1 | 2 | 1 |
1991 | tom | 1 | 2 | 0 | 2 | 0 | 1 | 0 | 2 | 0 | 0 | 1 | 0 |
1992 | john | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 |
1992 | amy | 0 | 2 | 2 | 2 | 2 | 0 | 2 | 2 | 2 | 2 | 0 | 2 |
1992 | ted | 2 | 1 | 1 | 1 | 1 | 2 | 1 | 1 | 1 | 1 | 2 | 1 |
1992 | tom | 1 | 2 | 2 | 2 | 2 | 1 | 2 | 2 | 2 | 2 | 1 | 2 |
my years are from 1991 to 2012 and i have more observations than the above but i just wanted to give you a picture. my scores range from 0~2.
I want it in this form:
varname | Score | ||
1991 | john | one_str_a | 2 |
1991 | john | one_str_b | 1 |
1991 | john | one_str_c | 1 |
1991 | john | one_con_a | 1 |
1991 | john | one_con_b | 1 |
1991 | john | one_con_c | 2 |
1991 | john | two_str_a | 1 |
1991 | john | two_str_b | 1 |
1991 | john | two_str_c | 1 |
1991 | john | two_con_a | 1 |
1991 | john | two_con_b | 2 |
1991 | john | two_con_c | 1 |
Could you kindly provde me of your guidance?
Hope we can transpose this one normally by proc transpose.
proc transpose data=dataset out=final;
var one_str_a one_str_b one_str_c one_con_a one_one_b one_one_c two_str_a two_str_b two_str_c etc...;
by year name;
run;
Sudeer
hi sudeer, thank you much. can u shed some light on how to apply array to this as well? thanks again!
Hi,
Why would you want to use arrays? I have seen that people use them when going from normalized data to transposed data to have more control over the output, but arrays don't make much sense when going from transposed to normalized data. If you really want to do it another way then the below code is a suggestion:
proc sql;
create table WORK.LOOP as
select distinct NAME,TYPE
from SASHELP.VCOLUMN
where LIBNAME="SASHELP"
and MEMNAME="CLASS"
and NAME ne "Name";
quit;
data _null_;
set work.loop end=last;
if _n_=1 then call execute('data want (keep=name section result);
set sashelp.class;
attrib section result format=$200.;');
if type="num" then call execute(' section="'||strip(name)||'"; result=strip(put('||strip(name)||',best.)); output;');
if type="char" then call execute(' section="'||strip(name)||'"; result='||strip(name)||'; output;');
if last then call execute('run;');
run;
You can use arrays to normalize this type of data. It is much easier in SAS if you use numeric suffixes for your blocks of variables.
To make this program easier I just assumed that you wanted to transpose all of the numeric variables except for YEAR.
data have ;
input year name $
one_str_a one_str_b one_str_c one_con_a one_con_b one_con_c
two_str_a two_str_b two_str_c two_con_a two_con_b two_con_c
;
cards;
1991 john 2 1 1 1 1 2 1 1 1 1 2 1
1991 amy 1 0 0 0 0 1 0 0 0 0 1 0
run;
data want ;
set have ;
length varname $32 score 8 ;
keep year name varname score ;
array scores _numeric_ ;
do i=1 to dim(scores);
score = scores(i);
varname = vname(scores(i));
if upcase(varname) ne 'YEAR' then output;
end;
run;
Hi tom above code works fine except we need one addition, as you have created score variable, i think if statement should include 'SCORE' as score is also transposed, which results in extra observations.
if upcase(varname) not in ('YEAR' 'SCORE') then output;
Sudeer
Or just move the ARRAY statement to before the LENGTH statement that defines SCORE.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.