Hi
I have the below Proc Transpose statement. The Id variable, exitMonthCategory, is numeric (1, 2, 3, 4, 5, 6, 12, 18, 24, 30.....66). When I run the transpose, the columns are appearing out of order: 6, 12, 24, 30, 36, 1, 2, 3, 4, 5, 18, 54. Is there a way I can order this to appear ascending?
Paul
proc transpose data=s1age1 out=s1age2;
by cnty_name startyear agecat4 exit;
id exitMonthCategory;
run;
Paul,
Do you get what you want if you include the 'sparse' option in your proc freq statement? e.g.:
options validvarname=v7;
PROC IMPORT OUT= WORK.s1ageshort
DATAFILE= "D:\art\agerange-posting2.xlsx"
DBMS=XLSX REPLACE;
GETNAMES=YES;
DATAROW=2;
RUN;
proc freq data=s1ageshort noprint;
tables cnty_name*startyear*exit*agecat4*exitMonthCategory/out=s1ageshorttest sparse;
run;
proc transpose data=s1ageshorttest out=want (drop=_label_
rename=(_name_=DataElement));
by cnty_name startyear exit agecat4;
id exitMonthCategory;
run;
options validvarname=any;
data want;
set want (rename=(
_1='1 Month'n
_2='2 Months'n
_3='3 Months'n
_4='4 Months'n
_5='5 Months'n
_6='6 Months'n
_12='7-12 Months'n
_18='1-1.5 Years'n
_24='>1.51-1.99 Years'n
_30='2-2.5 Years'n
_36='2.51-2.99 Years'n
_42='3-3.5 Years'n
_48='3.51-3.99 Years'n
_54='4-4.5 Years'n
_60='4.51-4.99 Years'n
_66='5-5.5 Years'n
_72='>5.5 Years'n));
run;
options validvarname=v7;
Something like this will work to order the value in a data driven program
data frame;
do exitMonthCategory = 1 to 5, 6 to 66 by 6;
output;
end;
run;
data s1age1V / view=s1age1v;
set frame s1age1;
run;
proc transpose data=s1age1v
out=s1age2(where=(not missing(cnty_name)))
I don't completely understand the above code, but when I run it, I am getting the below error on the transpose statement:
ERROR: Variable cnty_name is not on file WORK.S1AGE2.
When I look at both S1AGE2 and s1age1v, cnty_name is there?
Paul
The final proc transpose is not complete. You did not supply sample data so a complete example was not possible without me creating the sample data. Show your work.
I don't know how to attach files to the posting, but below is a link to a zip file containing data:
What is the variable to be transposed?
data s1age1;
infile cards expandtabs;
input cnty_name:$10. agecat4 exit:$3. startyear exitMonthCategory;
cards;
County-XXTT 4 XRF 2007 5
County-XXTT 4 XRF 2007 4
County-XXTT 3 ZTC 2009 42
County-XXTT 3 ZTC 2009 42
County-XXTT 4 ZTC 2010 30
County-XXTT 4 ZTC 2009 42
County-XXTT 4 XRF 2009 30
County-XXTT 3 XRF 2009 30
County-XXTT 4 XRF 2011 1
County-XXTT 3 XRF 2006 18
County-XXTT 3 XRF 2006 5
County-XXTT 4 XRF 2006 12
County-XXTT 4 ZTC 2007 66
County-XXTT 4 ZTC 2008 54
County-XXTT 3 XRF 2007 24
County-XXTT 2 XRF 2007 24
County-XXTT 3 ZTC 2010 24
County-XXTT 3 ZTC 2010 24
County-XXTT 3 XOT 2008 54
County-XXTT 4 XRF 2008 2
County-XXTT 2 XRF 2006 12
County-XXTT 3 XRF 2009 12
County-XXTT 3 XRF 2011 1
County-XXTT 3 XRF 2008 6
County-XXTT 4 ZTC 2010 30
County-XXTT 4 XRF 2007 2
County-XXTT 3 XRF 2006 2
County-XXTT 3 XRF 2006 1
County-XXTT 2 XRF 2006 1
County-XXTT 3 XRF 2008 1
County-XXTT 4 XRF 2010 1
County-XXTT 4 XRF 2007 2
County-XXTT 3 XRF 2007 12
County-XXTT 4 XRF 2009 42
County-XXTT 4 XRF 2009 18
;;;;
run;
data frame;
if 0 then set s1age1(keep=exitMonthCategory);
do exitMonthCategory = 1 to 5, 6 to 66 by 6;
output;
end;
run;
data s1age1V / view=s1age1v;
set frame s1age1;
run;
proc transpose data=s1age1v
out=s1age2(where=(not missing(cnty_name)));
by cnty_name agecat4 exit startyear notsorted;
id exitMonthCategory;
var exitMonthCategory /*?*/;
run;
Yes, the transposed variable is exitMonthCategory . However, when I ran this code, I am getting the below errors.
27 | by cnty_name agecat4 exit startyear notsorted; |
28 | id exitMonthCategory; |
29 | var exitMonthCategory /*?*/; |
30 | run; |
ERROR: The ID value "'60'n" occurs twice in the same BY group.
ERROR: The ID value "'60'n" occurs twice in the same BY group.
NOTE: The above message was for the following BY group:
cnty_name=Dutchess agecat4=3 exit=XCA startyear=2006 |
ERROR: The ID value "'18'n" occurs twice in the same BY group.
NOTE: The above message was for the following BY group:
cnty_name=Dutchess agecat4=4 exit=XRF startyear=2009 |
ERROR: The ID value "'24'n" occurs twice in the same BY group.
ERROR: The ID value "'24'n" occurs twice in the same BY group.
NOTE: The above message was for the following BY group:
cnty_name=Orange agecat4=4 exit=ZTC startyear=2010 |
ERROR: The ID value "'1'n" occurs twice in the same BY group.
NOTE: The above message was for the following BY group:
cnty_name=Orange agecat4=4 exit=XOT startyear=2009 |
ERROR: The ID value "'66'n" occurs twice in the same BY group.
NOTE: The above message was for the following BY group:
cnty_name=Orange agecat4=4 exit=XRM startyear=2006 |
ERROR: The ID value "'36'n" occurs twice in the same BY group.
NOTE: The above message was for the following BY group:
cnty_name=Orange agecat4=3 exit=XRF startyear=2007 |
ERROR: The ID value "'18'n" occurs twice in the same BY group.
NOTE: The above message was for the following BY group:
cnty_name=Orange agecat4=4 exit=XRF startyear=2006 |
ERROR: The ID value "'18'n" occurs twice in the same BY group.
NOTE: The above message was for the following BY group:
cnty_name=Orange agecat4=3 exit=XRF startyear=2006 |
ERROR: The ID value "'12'n" occurs twice in the same BY group.
NOTE: The above message was for the following BY group:
cnty_name=Orange agecat4=3 exit=XRF startyear=2008 |
ERROR: The ID value "'24'n" occurs twice in the same BY group.
NOTE: The above message was for the following BY group:
cnty_name=Dutchess agecat4=4 exit=ZTC startyear=2010 |
ERROR: The ID value "'2'n" occurs twice in the same BY group.
ERROR: Too many bad BY groups.
NOTE: View WORK.S1AGE1V.VIEW used (Total process time):
real time | 0.04 seconds | |
cpu time | 0.03 seconds | |
Your ID's are not unique as described by the BY. You will need determine how that should be resolved. You should have had the same issue with your original proc transpose.
The only difference I am suggesting is to create observations that will establish the variable names as created by ID, In the order you want.
Paul: I think you are trying to run the transpose without having first run the proc freq. The following works for me:
PROC IMPORT OUT= WORK.have
DATAFILE= "D:\art\agerange-posting2.xlsx"
DBMS=XLSX REPLACE;
GETNAMES=YES;
DATAROW=2;
RUN;
proc freq data=have noprint;
tables cnty_name*agecat4*exitMonthCategory/out=need;
run;
proc transpose data=need out=want (drop=_label_
rename=(_name_=DataElement));
by cnty_name agecat4;
id exitMonthCategory;
run;
options validvarname=any;
data want;
set want (rename=(
_1='1 Month'n
_2='2 Months'n
_3='3 Months'n
_4='4 Months'n
_5='5 Months'n
_6='6 Months'n
_12='7-12 Months'n
_18='1-1.5 Years'n
_24='>1.51-1.99 Years'n
_30='2-2.5 Years'n
_36='2.51-2.99 Years'n
_42='3-3.5 Years'n
_48='3.51-3.99 Years'n
_54='4-4.5 Years'n
_60='4.51-4.99 Years'n
_66='5-5.5 Years'n
_72='>5.5 Years'n));
run;
Hi Art
I think you omitted a few variables above and when you add those back in, the columns are still ordered incorrectly (i.e. 6-48, 66, 18, 54, 1, 2, 3......).
I tried sorting the data prior to running the proc freq and using the ORDER=DATA option in proc freq, but that did not work either.
Paul
proc freq data=s1ageshort noprint;
tables cnty_name*startyear*exit*agecat4*exitMonthCategory/out=s1ageshorttest;
run;
proc transpose data=s1ageshorttest out=s1ageshorttest2 (drop=_label_
rename=(_name_=DataElement));
by cnty_name startyear exit agecat4;
id exitMonthCategory;
run;
options validvarname=any;
Paul,
Do you get what you want if you include the 'sparse' option in your proc freq statement? e.g.:
options validvarname=v7;
PROC IMPORT OUT= WORK.s1ageshort
DATAFILE= "D:\art\agerange-posting2.xlsx"
DBMS=XLSX REPLACE;
GETNAMES=YES;
DATAROW=2;
RUN;
proc freq data=s1ageshort noprint;
tables cnty_name*startyear*exit*agecat4*exitMonthCategory/out=s1ageshorttest sparse;
run;
proc transpose data=s1ageshorttest out=want (drop=_label_
rename=(_name_=DataElement));
by cnty_name startyear exit agecat4;
id exitMonthCategory;
run;
options validvarname=any;
data want;
set want (rename=(
_1='1 Month'n
_2='2 Months'n
_3='3 Months'n
_4='4 Months'n
_5='5 Months'n
_6='6 Months'n
_12='7-12 Months'n
_18='1-1.5 Years'n
_24='>1.51-1.99 Years'n
_30='2-2.5 Years'n
_36='2.51-2.99 Years'n
_42='3-3.5 Years'n
_48='3.51-3.99 Years'n
_54='4-4.5 Years'n
_60='4.51-4.99 Years'n
_66='5-5.5 Years'n
_72='>5.5 Years'n));
run;
options validvarname=v7;
OMG, that really worked! Thank a lot Art.
Now its the percents issue and trying to calculate the correct percent and cumulative percent based on the vars in the data set. This is the issue in other thread. The percents that proc freq is calculating seem to be half of what they should be. I am answering your post in the 'proc freq: percents' thread.
Paul
After reading about 'sparse' it doesn't seem like that would lend itself to a sort.
Not sure what you mean. Take a look at: http://www.nesug.org/proceedings/nesug02/cc/cc003.pdf
Paul,
Without an example dataset I can't test DN's suggestion. If you had used the code I had suggested in a previous thread, namely using the prefix option and idlabel rather than id, the data would have come out in the right order.
However, that said, since you will always have the same range of categories (which I don't know what they are but will presume they end up being something like _1 to _20), you could just add an extra datastep after your use of proc transpose. e.g.:
proc transpose data=s1age1 out=s1age2;
by cnty_name startyear agecat4 exit;
id exitMonthCategory;
run;
data s1age2;
retain cnty_name startyear agecat4 exit _1-_20;
set s1age2;
run;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.