Hey everyone - I have a dataset where the variables are situated in rows, however, I would like to flip them vertical in order to facilitate the other manipulations I need to do. For example:
Current Structure
X X X X Y Y Y Y Z Z Z Z
I want =
X Y Z
X Y Z
X Y Z
X Y Z
Im using this code:
data all.ap5;
set all.stage9;
array vars{900} VAR11_c1-VAR11_c900;
array vars1{900} VAR25_c1-VAR25_c900;
array vars2{900} VAR34_c1-VAR34_c900;
obs = _n_;
do i = 1 to 900;
var11 = vars{i};
if not missing(var11) then output all.ap5;
end;
do i = 1 to 900;
var25 = vars1{i};
if not missing(var25) then output all.ap5;
end;
do i = 1 to 900;
var34 = vars2{i};
if not missing(var34) then output all.ap5;
end;
drop i var8_c1-var8_c1260 VAR11_c1-VAR11_c1260 VAR12_c1-VAR12_c1260 Participant__Nation_c1-Participant__Nation_c1260
var25_c1-var25_c1260 var26_c1-var26_c1260 VAR32_c1-VAR32_c1260 VAR34_c1-VAR34_c1260 VAR35_c1-VAR35_c1260
VAR36_c1-VAR36_c1260 VAR37_c1-VAR37_c1260 VAR50_c1-VAR50_c1260 VAR54_c1-VAR54_c1260 VAR55_c1-VAR55_c1260
VAR61_c1-VAR61_c1260;
run;
What it gives me is:
X
X
X
X
Y
Y
Y
Y
Z
Z
Z
Z
How do I edit the code to get what I need?
Thanks in advance
HAVE
Middle Observation(15 ) of Last dataset = WORK.HAVE - Total Obs 31
1260*3 = 3780 variables
-- NUMERIC -- VALUE
VAR11_C1 N 8 50
VAR11_C2 N 8 846
VAR11_C3 N 8 671
VAR11_C4 N 8 832
VAR11_C5 N 8 598
VAR11_C6 N 8 96
VAR11_C7 N 8 883
VAR11_C8 N 8 553
VAR11_C9 N 8 679
....
VAR11_C1260 N 8 294
VAR25_C1 N 8 105
VAR25_C2 N 8 934
VAR25_C3 N 8 783
VAR25_C4 N 8 876
'''
VAR25_C1259 N 8 774
VAR25_C1260 N 8 409
VAR34_C1 N 8 408
VAR34_C2 N 8 468
VAR34_C3 N 8 16
VAR34_C4 N 8 455
...
VAR34_C1259 N 8 827
VAR34_C1260 N 8 83
WANT
p to 40 obs WORK.WANT total obs=39,060
Obs REC VAR11 VAR25 VAR34
1 1 20 578 393
2 1 613 704 928
3 1 813 125 369
4 1 492 514 301
5 1 802 34 703
6 1 218 338 986
7 1 839 114 283
8 1 549 73 89
9 1 748 620 457
WORKING CODE
data want1(keep=rec var11) want2(keep=var25) want3(keep=var34);
merge want1 want2 want3;
* no by;
FULL SOLUTION
=============
libname sd1 "d:/sd1";
libname wrk "%sysfunc(pathname(work))";
data have;
* create some data;
array vars{1260} VAR11_c1-VAR11_c1260;
array vars1{1260} VAR25_c1-VAR25_c1260;
array vars2{1260} VAR34_c1-VAR34_c1260;
do rec=1 to 30;
do i=1 to 1260;
vars[i]=int(1000*uniform(5731));
vars1[i]=int(1000*uniform(5731));
vars2[i]=int(1000*uniform(5731));
end;
drop i rec;
output;
end;
output;
run;quit;
data want1(keep=rec var11) want2(keep=var25) want3(keep=var34);
retain rec;
set have;
array vars{1260} VAR11_c1-VAR11_c1260;
array vars1{1260} VAR25_c1-VAR25_c1260;
array vars2{1260} VAR34_c1-VAR34_c1260;
rec = _n_;
do i = 1 to 1260;
var11 = vars{i};
output want1;
end;
do i = 1 to 1260;
var25 = vars1{i};
output want2;
end;
do i = 1 to 1260;
var34 = vars2{i};
output want3;
end;
drop i
VAR11_C1-VAR11_C1260
VAR25_C1-VAR25_C1260
VAR34_C1-VAR34_C1260;
run;
data wrk.want;
merge want1 want2 want3;
* no by;
if sum(var11,var25,var34)=0 then delete;
run;quit;
proc print data=want;
run;quit;
Obs REC VAR11 VAR25 VAR34
1 1 20 578 393
2 1 613 704 928
3 1 813 125 369
4 1 492 514 301
5 1 802 34 703
6 1 218 338 986
7 1 839 114 283
8 1 549 73 89
9 1 748 620 457
I think you were pretty close if I understand your goal. Look at this:
data all.ap5; set all.stage9; array vars{900} VAR11_c1-VAR11_c900; array vars1{900} VAR25_c1-VAR25_c900; array vars2{900} VAR34_c1-VAR34_c900; obs = _n_; do i = 1 to 900; var11 = vars{i}; var25 = vars1{i}; var34 = vars2{i}; output all.ap5; end; drop i var8_c1-var8_c1260 VAR11_c1-VAR11_c1260 VAR12_c1-VAR12_c1260 Participant__Nation_c1-Participant__Nation_c1260 var25_c1-var25_c1260 var26_c1-var26_c1260 VAR32_c1-VAR32_c1260 VAR34_c1-VAR34_c1260 VAR35_c1-VAR35_c1260 VAR36_c1-VAR36_c1260 VAR37_c1-VAR37_c1260 VAR50_c1-VAR50_c1260 VAR54_c1-VAR54_c1260 VAR55_c1-VAR55_c1260 VAR61_c1-VAR61_c1260; run;
Note that it is a good idea to post code and log results in a code box opened with the {i} menu icon to prevent the message window from reformatting when pasted here.
The numbers of variables in your data set is somewhat troubling to me. That is a LOT of variables.
HAVE
Middle Observation(15 ) of Last dataset = WORK.HAVE - Total Obs 31
1260*3 = 3780 variables
-- NUMERIC -- VALUE
VAR11_C1 N 8 50
VAR11_C2 N 8 846
VAR11_C3 N 8 671
VAR11_C4 N 8 832
VAR11_C5 N 8 598
VAR11_C6 N 8 96
VAR11_C7 N 8 883
VAR11_C8 N 8 553
VAR11_C9 N 8 679
....
VAR11_C1260 N 8 294
VAR25_C1 N 8 105
VAR25_C2 N 8 934
VAR25_C3 N 8 783
VAR25_C4 N 8 876
'''
VAR25_C1259 N 8 774
VAR25_C1260 N 8 409
VAR34_C1 N 8 408
VAR34_C2 N 8 468
VAR34_C3 N 8 16
VAR34_C4 N 8 455
...
VAR34_C1259 N 8 827
VAR34_C1260 N 8 83
WANT
p to 40 obs WORK.WANT total obs=39,060
Obs REC VAR11 VAR25 VAR34
1 1 20 578 393
2 1 613 704 928
3 1 813 125 369
4 1 492 514 301
5 1 802 34 703
6 1 218 338 986
7 1 839 114 283
8 1 549 73 89
9 1 748 620 457
WORKING CODE
data want1(keep=rec var11) want2(keep=var25) want3(keep=var34);
merge want1 want2 want3;
* no by;
FULL SOLUTION
=============
libname sd1 "d:/sd1";
libname wrk "%sysfunc(pathname(work))";
data have;
* create some data;
array vars{1260} VAR11_c1-VAR11_c1260;
array vars1{1260} VAR25_c1-VAR25_c1260;
array vars2{1260} VAR34_c1-VAR34_c1260;
do rec=1 to 30;
do i=1 to 1260;
vars[i]=int(1000*uniform(5731));
vars1[i]=int(1000*uniform(5731));
vars2[i]=int(1000*uniform(5731));
end;
drop i rec;
output;
end;
output;
run;quit;
data want1(keep=rec var11) want2(keep=var25) want3(keep=var34);
retain rec;
set have;
array vars{1260} VAR11_c1-VAR11_c1260;
array vars1{1260} VAR25_c1-VAR25_c1260;
array vars2{1260} VAR34_c1-VAR34_c1260;
rec = _n_;
do i = 1 to 1260;
var11 = vars{i};
output want1;
end;
do i = 1 to 1260;
var25 = vars1{i};
output want2;
end;
do i = 1 to 1260;
var34 = vars2{i};
output want3;
end;
drop i
VAR11_C1-VAR11_C1260
VAR25_C1-VAR25_C1260
VAR34_C1-VAR34_C1260;
run;
data wrk.want;
merge want1 want2 want3;
* no by;
if sum(var11,var25,var34)=0 then delete;
run;quit;
proc print data=want;
run;quit;
Obs REC VAR11 VAR25 VAR34
1 1 20 578 393
2 1 613 704 928
3 1 813 125 369
4 1 492 514 301
5 1 802 34 703
6 1 218 338 986
7 1 839 114 283
8 1 549 73 89
9 1 748 620 457
Finally figured it out. Thanks guys - I used rogerjdean's code and added 'if not missing(var) then' before each output. Thanks for your patience.
Ballard,
data all.ap5; set all.stage9; array vars{900} VAR11_c1-VAR11_c900; array vars1{900} VAR25_c1-VAR25_c900; array vars2{900} VAR34_c1-VAR34_c900; obs = _n_; do i = 1 to 900; var11 = vars{i}; var25 = vars1{i}; var34 = vars2{i}; output all.ap5; end; drop i var8_c1-var8_c1260 VAR11_c1-VAR11_c1260 VAR12_c1-VAR12_c1260 Participant__Nation_c1-Participant__Nation_c1260 var25_c1-var25_c1260 var26_c1-var26_c1260 VAR32_c1-VAR32_c1260 VAR34_c1-VAR34_c1260 VAR35_c1-VAR35_c1260 VAR36_c1-VAR36_c1260 VAR37_c1-VAR37_c1260 VAR50_c1-VAR50_c1260 VAR54_c1-VAR54_c1260 VAR55_c1-VAR55_c1260 VAR61_c1-VAR61_c1260; run;
Thanks for this. However, the 'if not missing' part is critical to my code - is there a way to incorporate this in the code you offered?
I don't know your logic so you'll have to incorporate but if you require ALL of the variables to be not missing:
if nmiss(var11,var25,var34) = 0 then output;
may be one way.
Previously you were looking at individual values so
And then there's always good old PROC TRANSPOSE:
proc transpose data=work.have out=work.Inter1;
var VAR11_c1-VAR11_c900 VAR25_c1-VAR25_c900 VAR34_c1-VAR34_c900;
run;
data Inter1(drop=_NAME_);
length ColClass $10;
set Inter1;
ColClass = substr(_NAME_, 1, 7);
run;
proc sort data=work.Inter1;
by ColClass;
run;
proc transpose data=work.Inter1 out=Inter2;
by ColClass;
var Col1;
run;
proc transpose data=work.Inter2(drop=_NAME_) out=Want;
run;
It is very easy for IML code.
data have;
input (v1-v12) ($);
cards;
X X X X Y Y Y Y Z Z Z Z
;
run;
proc iml;
use have;
read all var _all_ into x;
close;
want=shapecol(x,0,3);
create want from want;
append from want;
close;
quit;
Thanks guys - I've tried various versions of the code everyone has given me, but it's still not quite what I need. Perhaps it's because I'm not very advanced with SAS, but let me reiterate what I'm looking for because I must've not explained well. My data structure is like this
X . . . X X Y . Y Y . . Z . . Z Z
I need it to be like this
X Y Z
X Y Z
X Y Z
when I use this code:
data all.ap5; set all.stage9; array vars{900} VAR11_c1-VAR11_c900; obs = _n_; do i = 1 to 900; var11 = vars{i}; if not missing(var11) then output all.ap5; end; drop i var8_c1-var8_c1260 VAR11_c1-VAR11_c1260 VAR12_c1-VAR12_c1260 Participant__Nation_c1-Participant__Nation_c1260 var25_c1-var25_c1260 var26_c1-var26_c1260 VAR32_c1-VAR32_c1260 VAR34_c1-VAR34_c1260 VAR35_c1-VAR35_c1260 VAR36_c1-VAR36_c1260 VAR37_c1-VAR37_c1260 VAR50_c1-VAR50_c1260 VAR54_c1-VAR54_c1260 VAR55_c1-VAR55_c1260 VAR61_c1-VAR61_c1260; run;
It gives me exactly what I need. However, instead of doing it for just one variable. I want to do it for the three variables at the same time, so I tried something like:
data all.test; set all.stage8; array vars{1440} VAR55_c1-VAR55_c1440; array vars1{1440} VAR61_c1-VAR61_c1440; obs = _n_; do i = 1 to 1440; var55 = vars{i}; if not missing(var55) then output all.test; end; do i = 1 to 1440; var61 = vars1{i}; if not missing(var61) then output all.test; end; drop i VAR55_c1-VAR55_c1440 VAR61_c1-VAR61_c1440; run;
However, that does not work how I need it to..
If someone could point me in the right direction that would be great.
The variables I'm arraying are character variables with either a "Yes" or "No" as their values.
Thank you,
R
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Need to connect to databases in SAS Viya? SAS’ David Ghan shows you two methods – via SAS/ACCESS LIBNAME and SAS Data Connector SASLIBS – in this video.
Find more tutorials on the SAS Users YouTube channel.