Help using Base SAS procedures

Proc transpose

Accepted Solution Solved
Reply
Regular Contributor
Posts: 216
Accepted Solution

Proc transpose

Hi

I have the below Proc Transpose statement. The Id variable, exitMonthCategory, is numeric (1, 2, 3, 4, 5, 6, 12, 18, 24, 30.....66). When I run the transpose, the columns are appearing out of order: 6, 12, 24, 30, 36, 1, 2, 3, 4, 5, 18, 54. Is there a way I can order this to appear ascending?

Paul

proc transpose data=s1age1 out=s1age2;

  by cnty_name startyear agecat4 exit;

  id exitMonthCategory;

  run;


Accepted Solutions
Solution
‎11-12-2012 11:35 AM
PROC Star
Posts: 7,356

Re: Proc transpose

Paul,

Do you get what you want if you include the 'sparse' option in your proc freq statement?  e.g.:

options validvarname=v7;

PROC IMPORT OUT= WORK.s1ageshort

            DATAFILE= "D:\art\agerange-posting2.xlsx"

            DBMS=XLSX REPLACE;

     GETNAMES=YES;

     DATAROW=2;

RUN;

proc freq data=s1ageshort noprint;

  tables cnty_name*startyear*exit*agecat4*exitMonthCategory/out=s1ageshorttest sparse;

run;

proc transpose data=s1ageshorttest out=want (drop=_label_

            rename=(_name_=DataElement));

  by cnty_name startyear exit agecat4;

  id exitMonthCategory;

run;

options validvarname=any;

data want;

  set want (rename=(

         _1='1 Month'n

         _2='2 Months'n

         _3='3 Months'n

         _4='4 Months'n

         _5='5 Months'n

         _6='6 Months'n

         _12='7-12 Months'n

         _18='1-1.5 Years'n

         _24='>1.51-1.99 Years'n

         _30='2-2.5 Years'n

         _36='2.51-2.99 Years'n

         _42='3-3.5 Years'n

         _48='3.51-3.99 Years'n

         _54='4-4.5 Years'n

         _60='4.51-4.99 Years'n

         _66='5-5.5 Years'n

         _72='>5.5 Years'n));

run;

options validvarname=v7;

View solution in original post


All Replies
Respected Advisor
Posts: 3,777

Re: Proc transpose

Something like this will work to order the value in a data driven program

data frame;
   do exitMonthCategory = 1 to 5, 6 to 66 by 6;
     
output;
     
end;
  
run;

data s1age1V / view=s1age1v;
   set frame s1age1;
   run;
proc transpose data=s1age1v
  
out=s1age2(where=(not missing(cnty_name)))

Regular Contributor
Posts: 216

Re: Proc transpose

I don't completely understand the above code, but when I run it, I am getting the below error on the transpose statement:

ERROR: Variable cnty_name is not on file WORK.S1AGE2.

When I look at both S1AGE2 and s1age1v, cnty_name is there?


Paul

Respected Advisor
Posts: 3,777

Re: Proc transpose

The final proc transpose is not complete.  You did not supply sample data so a complete example was not possible without me creating the sample data.  Show your work.

Regular Contributor
Posts: 216

Re: Proc transpose

I don't know how to attach files to the posting, but below is a link to a zip file containing data:

http://www.nycourts.gov/surveys/cwcip/agerange-posting2.zip

Respected Advisor
Posts: 3,777

Re: Proc transpose

What is the variable to be transposed?

data s1age1;
   infile cards expandtabs;
  
input cnty_name:$10.   agecat4  exit:$3.   startyear exitMonthCategory;
   cards;
County-XXTT 4  XRF   2007  5
County-XXTT 4  XRF   2007  4
County-XXTT 3  ZTC   2009  42
County-XXTT 3  ZTC   2009  42
County-XXTT 4  ZTC   2010  30
County-XXTT 4  ZTC   2009  42
County-XXTT 4  XRF   2009  30
County-XXTT 3  XRF   2009  30
County-XXTT 4  XRF   2011  1
County-XXTT 3  XRF   2006  18
County-XXTT 3  XRF   2006  5
County-XXTT 4  XRF   2006  12
County-XXTT 4  ZTC   2007  66
County-XXTT 4  ZTC   2008  54
County-XXTT 3  XRF   2007  24
County-XXTT 2  XRF   2007  24
County-XXTT 3  ZTC   2010  24
County-XXTT 3  ZTC   2010  24
County-XXTT 3  XOT   2008  54
County-XXTT 4  XRF   2008  2
County-XXTT 2  XRF   2006  12
County-XXTT 3  XRF   2009  12
County-XXTT 3  XRF   2011  1
County-XXTT 3  XRF   2008  6
County-XXTT 4  ZTC   2010  30
County-XXTT 4  XRF   2007  2
County-XXTT 3  XRF   2006  2
County-XXTT 3  XRF   2006  1
County-XXTT 2  XRF   2006  1
County-XXTT 3  XRF   2008  1
County-XXTT 4  XRF   2010  1
County-XXTT 4  XRF   2007  2
County-XXTT 3  XRF   2007  12
County-XXTT 4  XRF   2009  42
County-XXTT 4  XRF   2009  18
;;;;
   run;


data frame;
   if 0 then set s1age1(keep=exitMonthCategory);
   do exitMonthCategory = 1 to 5, 6 to 66 by 6;
     
output;
     
end;
  
run;

data s1age1V / view=s1age1v;
   set frame s1age1;
   run;
proc transpose data=s1age1v
  
out=s1age2(where=(not missing(cnty_name)));
   by cnty_name agecat4   exit startyear notsorted
  
id exitMonthCategory;
   var exitMonthCategory /*?*/;
  
run;

Regular Contributor
Posts: 216

Re: Proc transpose

Yes, the transposed variable is exitMonthCategory . However, when I ran this code, I am getting the below errors.

27        by cnty_name agecat4   exit startyear notsorted;
28        id exitMonthCategory;
29        var exitMonthCategory /*?*/;
30        run;

ERROR: The ID value "'60'n" occurs twice in the same BY group.

ERROR: The ID value "'60'n" occurs twice in the same BY group.

NOTE: The above message was for the following BY group:

  cnty_name=Dutchess agecat4=3 exit=XCA startyear=2006

ERROR: The ID value "'18'n" occurs twice in the same BY group.

NOTE: The above message was for the following BY group:

  cnty_name=Dutchess agecat4=4 exit=XRF startyear=2009

ERROR: The ID value "'24'n" occurs twice in the same BY group.

ERROR: The ID value "'24'n" occurs twice in the same BY group.

NOTE: The above message was for the following BY group:

  cnty_name=Orange agecat4=4 exit=ZTC startyear=2010

ERROR: The ID value "'1'n" occurs twice in the same BY group.

NOTE: The above message was for the following BY group:

  cnty_name=Orange agecat4=4 exit=XOT startyear=2009

ERROR: The ID value "'66'n" occurs twice in the same BY group.

NOTE: The above message was for the following BY group:

  cnty_name=Orange agecat4=4 exit=XRM startyear=2006

ERROR: The ID value "'36'n" occurs twice in the same BY group.

NOTE: The above message was for the following BY group:

  cnty_name=Orange agecat4=3 exit=XRF startyear=2007

ERROR: The ID value "'18'n" occurs twice in the same BY group.

NOTE: The above message was for the following BY group:

  cnty_name=Orange agecat4=4 exit=XRF startyear=2006

ERROR: The ID value "'18'n" occurs twice in the same BY group.

NOTE: The above message was for the following BY group:

  cnty_name=Orange agecat4=3 exit=XRF startyear=2006

ERROR: The ID value "'12'n" occurs twice in the same BY group.

NOTE: The above message was for the following BY group:

  cnty_name=Orange agecat4=3 exit=XRF startyear=2008

ERROR: The ID value "'24'n" occurs twice in the same BY group.

NOTE: The above message was for the following BY group:

  cnty_name=Dutchess agecat4=4 exit=ZTC startyear=2010

ERROR: The ID value "'2'n" occurs twice in the same BY group.

ERROR: Too many bad BY groups.

NOTE: View WORK.S1AGE1V.VIEW used (Total process time):

  real time       0.04 seconds
  cpu time        0.03 seconds
 
Respected Advisor
Posts: 3,777

Re: Proc transpose

Your ID's are not unique as described by the BY.  You will need determine how that should be resolved.  You should have had the same issue with your original proc transpose.

The only difference I am suggesting is to create observations that will establish the variable names as created by ID, In the order you want.

PROC Star
Posts: 7,356

Re: Proc transpose

Paul: I think you are trying to run the transpose without having first run the proc freq.  The following works for me:

PROC IMPORT OUT= WORK.have

            DATAFILE= "D:\art\agerange-posting2.xlsx"

            DBMS=XLSX REPLACE;

     GETNAMES=YES;

     DATAROW=2;

RUN;

proc freq data=have noprint;

  tables cnty_name*agecat4*exitMonthCategory/out=need;

run;

proc transpose data=need out=want (drop=_label_

            rename=(_name_=DataElement));

  by cnty_name agecat4;

  id exitMonthCategory;

run;

options validvarname=any;

data want;

  set want (rename=(

         _1='1 Month'n

         _2='2 Months'n

         _3='3 Months'n

         _4='4 Months'n

         _5='5 Months'n

         _6='6 Months'n

         _12='7-12 Months'n

         _18='1-1.5 Years'n

         _24='>1.51-1.99 Years'n

         _30='2-2.5 Years'n

         _36='2.51-2.99 Years'n

         _42='3-3.5 Years'n

         _48='3.51-3.99 Years'n

         _54='4-4.5 Years'n

         _60='4.51-4.99 Years'n

         _66='5-5.5 Years'n

         _72='>5.5 Years'n));

run;

Regular Contributor
Posts: 216

Re: Proc transpose

Hi Art

I think you omitted a few variables above and when you add those back in, the columns are still ordered incorrectly (i.e. 6-48, 66, 18, 54, 1, 2, 3......).

I tried sorting the data prior to running the proc freq and using the ORDER=DATA option in proc freq, but that did not work either.

Paul

proc freq data=s1ageshort noprint;

  tables cnty_name*startyear*exit*agecat4*exitMonthCategory/out=s1ageshorttest;

run;

proc transpose data=s1ageshorttest out=s1ageshorttest2 (drop=_label_

            rename=(_name_=DataElement));

  by cnty_name startyear exit agecat4;

  id exitMonthCategory;

run;

options validvarname=any;

Solution
‎11-12-2012 11:35 AM
PROC Star
Posts: 7,356

Re: Proc transpose

Paul,

Do you get what you want if you include the 'sparse' option in your proc freq statement?  e.g.:

options validvarname=v7;

PROC IMPORT OUT= WORK.s1ageshort

            DATAFILE= "D:\art\agerange-posting2.xlsx"

            DBMS=XLSX REPLACE;

     GETNAMES=YES;

     DATAROW=2;

RUN;

proc freq data=s1ageshort noprint;

  tables cnty_name*startyear*exit*agecat4*exitMonthCategory/out=s1ageshorttest sparse;

run;

proc transpose data=s1ageshorttest out=want (drop=_label_

            rename=(_name_=DataElement));

  by cnty_name startyear exit agecat4;

  id exitMonthCategory;

run;

options validvarname=any;

data want;

  set want (rename=(

         _1='1 Month'n

         _2='2 Months'n

         _3='3 Months'n

         _4='4 Months'n

         _5='5 Months'n

         _6='6 Months'n

         _12='7-12 Months'n

         _18='1-1.5 Years'n

         _24='>1.51-1.99 Years'n

         _30='2-2.5 Years'n

         _36='2.51-2.99 Years'n

         _42='3-3.5 Years'n

         _48='3.51-3.99 Years'n

         _54='4-4.5 Years'n

         _60='4.51-4.99 Years'n

         _66='5-5.5 Years'n

         _72='>5.5 Years'n));

run;

options validvarname=v7;

Regular Contributor
Posts: 216

Re: Proc transpose

OMG, that really worked! Thank a lot Art.

Now its the percents issue and trying to calculate the correct percent and cumulative percent based on the vars in the data set. This is the issue in other thread. The percents that proc freq is calculating seem to be half of what they should be. I am answering your post in the 'proc freq: percents' thread.

Paul

Regular Contributor
Posts: 216

Re: Proc transpose

After reading about 'sparse' it doesn't seem like that would lend itself to a sort.

PROC Star
Posts: 7,356

Re: Proc transpose

Not sure what you mean.  Take a look at: http://www.nesug.org/proceedings/nesug02/cc/cc003.pdf

PROC Star
Posts: 7,356

Re: Proc transpose

Paul,

Without an example dataset I can't test DN's suggestion.  If you had used the code I had suggested in a previous thread, namely using the prefix option and idlabel rather than id, the data would have come out in the right order.

However, that said, since you will always have the same range of categories (which I don't know what they are but will presume they end up being something like _1 to _20), you could just add an extra datastep after your use of proc transpose.  e.g.:

proc transpose data=s1age1 out=s1age2;

  by cnty_name startyear agecat4 exit;

  id exitMonthCategory;

  run;

data s1age2;

  retain cnty_name startyear agecat4 exit _1-_20;

  set s1age2;

run;

☑ This topic is SOLVED.

Need further help from the community? Please ask a new question.

Discussion stats
  • 16 replies
  • 1689 views
  • 0 likes
  • 3 in conversation