Hey everyone - I have a dataset where the variables are situated in rows, however, I would like to flip them vertical in order to facilitate the other manipulations I need to do. For example:
Current Structure
X X X X Y Y Y Y Z Z Z Z
I want =
X Y Z
X Y Z
X Y Z
X Y Z
Im using this code:
data all.ap5;
set all.stage9;
array vars{900} VAR11_c1-VAR11_c900;
array vars1{900} VAR25_c1-VAR25_c900;
array vars2{900} VAR34_c1-VAR34_c900;
obs = _n_;
do i = 1 to 900;
var11 = vars{i};
if not missing(var11) then output all.ap5;
end;
do i = 1 to 900;
var25 = vars1{i};
if not missing(var25) then output all.ap5;
end;
do i = 1 to 900;
var34 = vars2{i};
if not missing(var34) then output all.ap5;
end;
drop i var8_c1-var8_c1260 VAR11_c1-VAR11_c1260 VAR12_c1-VAR12_c1260 Participant__Nation_c1-Participant__Nation_c1260
var25_c1-var25_c1260 var26_c1-var26_c1260 VAR32_c1-VAR32_c1260 VAR34_c1-VAR34_c1260 VAR35_c1-VAR35_c1260
VAR36_c1-VAR36_c1260 VAR37_c1-VAR37_c1260 VAR50_c1-VAR50_c1260 VAR54_c1-VAR54_c1260 VAR55_c1-VAR55_c1260
VAR61_c1-VAR61_c1260;
run;
What it gives me is:
X
X
X
X
Y
Y
Y
Y
Z
Z
Z
Z
How do I edit the code to get what I need?
Thanks in advance
HAVE
Middle Observation(15 ) of Last dataset = WORK.HAVE - Total Obs 31
1260*3 = 3780 variables
-- NUMERIC -- VALUE
VAR11_C1 N 8 50
VAR11_C2 N 8 846
VAR11_C3 N 8 671
VAR11_C4 N 8 832
VAR11_C5 N 8 598
VAR11_C6 N 8 96
VAR11_C7 N 8 883
VAR11_C8 N 8 553
VAR11_C9 N 8 679
....
VAR11_C1260 N 8 294
VAR25_C1 N 8 105
VAR25_C2 N 8 934
VAR25_C3 N 8 783
VAR25_C4 N 8 876
'''
VAR25_C1259 N 8 774
VAR25_C1260 N 8 409
VAR34_C1 N 8 408
VAR34_C2 N 8 468
VAR34_C3 N 8 16
VAR34_C4 N 8 455
...
VAR34_C1259 N 8 827
VAR34_C1260 N 8 83
WANT
p to 40 obs WORK.WANT total obs=39,060
Obs REC VAR11 VAR25 VAR34
1 1 20 578 393
2 1 613 704 928
3 1 813 125 369
4 1 492 514 301
5 1 802 34 703
6 1 218 338 986
7 1 839 114 283
8 1 549 73 89
9 1 748 620 457
WORKING CODE
data want1(keep=rec var11) want2(keep=var25) want3(keep=var34);
merge want1 want2 want3;
* no by;
FULL SOLUTION
=============
libname sd1 "d:/sd1";
libname wrk "%sysfunc(pathname(work))";
data have;
* create some data;
array vars{1260} VAR11_c1-VAR11_c1260;
array vars1{1260} VAR25_c1-VAR25_c1260;
array vars2{1260} VAR34_c1-VAR34_c1260;
do rec=1 to 30;
do i=1 to 1260;
vars[i]=int(1000*uniform(5731));
vars1[i]=int(1000*uniform(5731));
vars2[i]=int(1000*uniform(5731));
end;
drop i rec;
output;
end;
output;
run;quit;
data want1(keep=rec var11) want2(keep=var25) want3(keep=var34);
retain rec;
set have;
array vars{1260} VAR11_c1-VAR11_c1260;
array vars1{1260} VAR25_c1-VAR25_c1260;
array vars2{1260} VAR34_c1-VAR34_c1260;
rec = _n_;
do i = 1 to 1260;
var11 = vars{i};
output want1;
end;
do i = 1 to 1260;
var25 = vars1{i};
output want2;
end;
do i = 1 to 1260;
var34 = vars2{i};
output want3;
end;
drop i
VAR11_C1-VAR11_C1260
VAR25_C1-VAR25_C1260
VAR34_C1-VAR34_C1260;
run;
data wrk.want;
merge want1 want2 want3;
* no by;
if sum(var11,var25,var34)=0 then delete;
run;quit;
proc print data=want;
run;quit;
Obs REC VAR11 VAR25 VAR34
1 1 20 578 393
2 1 613 704 928
3 1 813 125 369
4 1 492 514 301
5 1 802 34 703
6 1 218 338 986
7 1 839 114 283
8 1 549 73 89
9 1 748 620 457
I think you were pretty close if I understand your goal. Look at this:
data all.ap5; set all.stage9; array vars{900} VAR11_c1-VAR11_c900; array vars1{900} VAR25_c1-VAR25_c900; array vars2{900} VAR34_c1-VAR34_c900; obs = _n_; do i = 1 to 900; var11 = vars{i}; var25 = vars1{i}; var34 = vars2{i}; output all.ap5; end; drop i var8_c1-var8_c1260 VAR11_c1-VAR11_c1260 VAR12_c1-VAR12_c1260 Participant__Nation_c1-Participant__Nation_c1260 var25_c1-var25_c1260 var26_c1-var26_c1260 VAR32_c1-VAR32_c1260 VAR34_c1-VAR34_c1260 VAR35_c1-VAR35_c1260 VAR36_c1-VAR36_c1260 VAR37_c1-VAR37_c1260 VAR50_c1-VAR50_c1260 VAR54_c1-VAR54_c1260 VAR55_c1-VAR55_c1260 VAR61_c1-VAR61_c1260; run;
Note that it is a good idea to post code and log results in a code box opened with the {i} menu icon to prevent the message window from reformatting when pasted here.
The numbers of variables in your data set is somewhat troubling to me. That is a LOT of variables.
HAVE
Middle Observation(15 ) of Last dataset = WORK.HAVE - Total Obs 31
1260*3 = 3780 variables
-- NUMERIC -- VALUE
VAR11_C1 N 8 50
VAR11_C2 N 8 846
VAR11_C3 N 8 671
VAR11_C4 N 8 832
VAR11_C5 N 8 598
VAR11_C6 N 8 96
VAR11_C7 N 8 883
VAR11_C8 N 8 553
VAR11_C9 N 8 679
....
VAR11_C1260 N 8 294
VAR25_C1 N 8 105
VAR25_C2 N 8 934
VAR25_C3 N 8 783
VAR25_C4 N 8 876
'''
VAR25_C1259 N 8 774
VAR25_C1260 N 8 409
VAR34_C1 N 8 408
VAR34_C2 N 8 468
VAR34_C3 N 8 16
VAR34_C4 N 8 455
...
VAR34_C1259 N 8 827
VAR34_C1260 N 8 83
WANT
p to 40 obs WORK.WANT total obs=39,060
Obs REC VAR11 VAR25 VAR34
1 1 20 578 393
2 1 613 704 928
3 1 813 125 369
4 1 492 514 301
5 1 802 34 703
6 1 218 338 986
7 1 839 114 283
8 1 549 73 89
9 1 748 620 457
WORKING CODE
data want1(keep=rec var11) want2(keep=var25) want3(keep=var34);
merge want1 want2 want3;
* no by;
FULL SOLUTION
=============
libname sd1 "d:/sd1";
libname wrk "%sysfunc(pathname(work))";
data have;
* create some data;
array vars{1260} VAR11_c1-VAR11_c1260;
array vars1{1260} VAR25_c1-VAR25_c1260;
array vars2{1260} VAR34_c1-VAR34_c1260;
do rec=1 to 30;
do i=1 to 1260;
vars[i]=int(1000*uniform(5731));
vars1[i]=int(1000*uniform(5731));
vars2[i]=int(1000*uniform(5731));
end;
drop i rec;
output;
end;
output;
run;quit;
data want1(keep=rec var11) want2(keep=var25) want3(keep=var34);
retain rec;
set have;
array vars{1260} VAR11_c1-VAR11_c1260;
array vars1{1260} VAR25_c1-VAR25_c1260;
array vars2{1260} VAR34_c1-VAR34_c1260;
rec = _n_;
do i = 1 to 1260;
var11 = vars{i};
output want1;
end;
do i = 1 to 1260;
var25 = vars1{i};
output want2;
end;
do i = 1 to 1260;
var34 = vars2{i};
output want3;
end;
drop i
VAR11_C1-VAR11_C1260
VAR25_C1-VAR25_C1260
VAR34_C1-VAR34_C1260;
run;
data wrk.want;
merge want1 want2 want3;
* no by;
if sum(var11,var25,var34)=0 then delete;
run;quit;
proc print data=want;
run;quit;
Obs REC VAR11 VAR25 VAR34
1 1 20 578 393
2 1 613 704 928
3 1 813 125 369
4 1 492 514 301
5 1 802 34 703
6 1 218 338 986
7 1 839 114 283
8 1 549 73 89
9 1 748 620 457
Finally figured it out. Thanks guys - I used rogerjdean's code and added 'if not missing(var) then' before each output. Thanks for your patience.
Ballard,
data all.ap5; set all.stage9; array vars{900} VAR11_c1-VAR11_c900; array vars1{900} VAR25_c1-VAR25_c900; array vars2{900} VAR34_c1-VAR34_c900; obs = _n_; do i = 1 to 900; var11 = vars{i}; var25 = vars1{i}; var34 = vars2{i}; output all.ap5; end; drop i var8_c1-var8_c1260 VAR11_c1-VAR11_c1260 VAR12_c1-VAR12_c1260 Participant__Nation_c1-Participant__Nation_c1260 var25_c1-var25_c1260 var26_c1-var26_c1260 VAR32_c1-VAR32_c1260 VAR34_c1-VAR34_c1260 VAR35_c1-VAR35_c1260 VAR36_c1-VAR36_c1260 VAR37_c1-VAR37_c1260 VAR50_c1-VAR50_c1260 VAR54_c1-VAR54_c1260 VAR55_c1-VAR55_c1260 VAR61_c1-VAR61_c1260; run;
Thanks for this. However, the 'if not missing' part is critical to my code - is there a way to incorporate this in the code you offered?
I don't know your logic so you'll have to incorporate but if you require ALL of the variables to be not missing:
if nmiss(var11,var25,var34) = 0 then output;
may be one way.
Previously you were looking at individual values so
And then there's always good old PROC TRANSPOSE:
proc transpose data=work.have out=work.Inter1;
var VAR11_c1-VAR11_c900 VAR25_c1-VAR25_c900 VAR34_c1-VAR34_c900;
run;
data Inter1(drop=_NAME_);
length ColClass $10;
set Inter1;
ColClass = substr(_NAME_, 1, 7);
run;
proc sort data=work.Inter1;
by ColClass;
run;
proc transpose data=work.Inter1 out=Inter2;
by ColClass;
var Col1;
run;
proc transpose data=work.Inter2(drop=_NAME_) out=Want;
run;
It is very easy for IML code.
data have;
input (v1-v12) ($);
cards;
X X X X Y Y Y Y Z Z Z Z
;
run;
proc iml;
use have;
read all var _all_ into x;
close;
want=shapecol(x,0,3);
create want from want;
append from want;
close;
quit;
Thanks guys - I've tried various versions of the code everyone has given me, but it's still not quite what I need. Perhaps it's because I'm not very advanced with SAS, but let me reiterate what I'm looking for because I must've not explained well. My data structure is like this
X . . . X X Y . Y Y . . Z . . Z Z
I need it to be like this
X Y Z
X Y Z
X Y Z
when I use this code:
data all.ap5; set all.stage9; array vars{900} VAR11_c1-VAR11_c900; obs = _n_; do i = 1 to 900; var11 = vars{i}; if not missing(var11) then output all.ap5; end; drop i var8_c1-var8_c1260 VAR11_c1-VAR11_c1260 VAR12_c1-VAR12_c1260 Participant__Nation_c1-Participant__Nation_c1260 var25_c1-var25_c1260 var26_c1-var26_c1260 VAR32_c1-VAR32_c1260 VAR34_c1-VAR34_c1260 VAR35_c1-VAR35_c1260 VAR36_c1-VAR36_c1260 VAR37_c1-VAR37_c1260 VAR50_c1-VAR50_c1260 VAR54_c1-VAR54_c1260 VAR55_c1-VAR55_c1260 VAR61_c1-VAR61_c1260; run;
It gives me exactly what I need. However, instead of doing it for just one variable. I want to do it for the three variables at the same time, so I tried something like:
data all.test; set all.stage8; array vars{1440} VAR55_c1-VAR55_c1440; array vars1{1440} VAR61_c1-VAR61_c1440; obs = _n_; do i = 1 to 1440; var55 = vars{i}; if not missing(var55) then output all.test; end; do i = 1 to 1440; var61 = vars1{i}; if not missing(var61) then output all.test; end; drop i VAR55_c1-VAR55_c1440 VAR61_c1-VAR61_c1440; run;
However, that does not work how I need it to..
If someone could point me in the right direction that would be great.
The variables I'm arraying are character variables with either a "Yes" or "No" as their values.
Thank you,
R
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Need to connect to databases in SAS Viya? SAS’ David Ghan shows you two methods – via SAS/ACCESS LIBNAME and SAS Data Connector SASLIBS – in this video.
Find more tutorials on the SAS Users YouTube channel.