Transpose a single column, by multiple columns, across multiple columns.

Accepted Solution Solved
Reply
Occasional Contributor
Posts: 8
Accepted Solution

Transpose a single column, by multiple columns, across multiple columns.

Relatively new to SAS, and I feel like this should be easy but I'm having trouble.

Current Data

StateMonthAGEABCDE
IA2012070 - 50390119158
IA2012080 - 50470122169
IA2012090 - 50330139172
IA20120716 - 2001006373
IA20120816 - 2001404963
IA20120916 - 200604147


I need to transpose the Month column BY State AGE, but maintain A-E as sub categories.  I do not care how the report is formatted, and column names are also an after-thought.  HOWEVER, the Month field will be variable - so I require a solution that involves no hard code referencing the contents of the month column.

So, (I did this manually in excel) I need:

StateAGEA_201207B_201207C_201207D_201207E_201207A_201208B_201208C_201208D_201208E_201208A_201209B_201209C_201209D_201209E_201209
IA0 - 5039011915804701221690330139172
IA16 - 2001006373014049630604147

Thanks,

The Guy


Accepted Solutions
Solution
‎10-09-2012 12:26 PM
Super User
Posts: 19,815

Re: Transpose a single column, by multiple columns, across multiple columns.

In two steps...

data have;

input State $    Month $     AGE $    A    B    C    D    E;

cards;

IA    201207    0-5    0    39    0    119    158

IA    201208    0-5    0    47    0    122    169

IA    201209    0-5    0    33    0    139    172

IA    201207    16-20    0    10    0    63    73

IA    201208    16-20    0    14    0    49    63

IA    201209    16-20    0    6    0    41    47

;

run;

data step1;

    set have;

    array lett{*} a--e;

    do i=1 to dim(lett);

        label=vname(lett(i))||"_"||month;

        value=lett(i);

        output;

    end;

run;

proc transpose data=step1 out=step2;

    by state age;

    id label;

    var value;

run;

View solution in original post


All Replies
Super User
Posts: 5,509

Re: Transpose a single column, by multiple columns, across multiple columns.

Guy,

Since you're relatively new to SAS, I'll skip trying to automate this.  Macro language would come in handy, but it's probably too great a leap at this point.

It looks like you are willing to manually change names in Excel, so there can't be too many changes to make.  Here's the plan to have SAS do the job.

data m201207 (rename=(a=A_201207 b=B_201207 c=C_201207 d=D_201207 e=E_201207))

       m201208 (rename=(a=A_201208 b=B_201208 c=C_201208 d=D_201208 e=E_201208))

       etc. for each month

       ;

   set current_data;

   select (month);

       when (201207) output m_201207;

       when (201208) output m_201208;

       etc. for each month;

   end;

   drop month;

run;

This gives you a separate data set for each month, with variables already renamed.  If MONTH is actually a character variable instead of numeric, you will need to refer to its values in quotes such as when ('201207').

If the data are not already sorted, sort each data set BY STATE AGE;

Then combine them:

data want;

   merge m_201207 m_201208 etc.;

   by state age;

run;

That should take care of it.  You may see more automated solutions ... that's probably a good idea if you need to do this a lot.  But this version will get you started.

Occasional Contributor
Posts: 8

Re: Transpose a single column, by multiple columns, across multiple columns.

Posted in reply to Astounding

Thanks for the reply, but I am seeking a more automated solution.  I forgot to mention that the month field will be variable as to which months will be included, and how many there are.

I'm fine with macro syntax - I have the option of coding this separation in the macro used to bring the data together (in which case I might be able to use the code you've provided), but that would require maintaining two marcos, two data sets, and two versions of all the subsequent code.

What I've read has come close to what I need, but not quite there.  I was hoping for some kind of array manipulation, tidy DO-LOOP, or tricky PROC TRANSPOSE statement.  Like I said, I was thinking this wouldn't be that difficult but if I'm completely wrong on that point I'll have to accept it.

Thanks again.

Solution
‎10-09-2012 12:26 PM
Super User
Posts: 19,815

Re: Transpose a single column, by multiple columns, across multiple columns.

In two steps...

data have;

input State $    Month $     AGE $    A    B    C    D    E;

cards;

IA    201207    0-5    0    39    0    119    158

IA    201208    0-5    0    47    0    122    169

IA    201209    0-5    0    33    0    139    172

IA    201207    16-20    0    10    0    63    73

IA    201208    16-20    0    14    0    49    63

IA    201209    16-20    0    6    0    41    47

;

run;

data step1;

    set have;

    array lett{*} a--e;

    do i=1 to dim(lett);

        label=vname(lett(i))||"_"||month;

        value=lett(i);

        output;

    end;

run;

proc transpose data=step1 out=step2;

    by state age;

    id label;

    var value;

run;

Occasional Contributor
Posts: 8

Re: Transpose a single column, by multiple columns, across multiple columns.

An array, a tidy do-loop, AND a proc transpose statement - works like a charm!  Thanks!

Respected Advisor
Posts: 3,799

Re: Transpose a single column, by multiple columns, across multiple columns.

For more complete automation or data driven use two PROC TRANSPOSES.

data state;
   input State:$2. Month :B8601DA. AGE&$10.  A  B  C  D  E;
   format month yymmn.;
  
cards;
IA 201207   0 - 5  0  39 0  119   158
IA 201208   0 - 5  0  47 0  122   169
IA 201209   0 - 5  0  33 0  139   172
IA 201207   16 - 20  0  10 0  63 73
IA 201208   16 - 20  0  14 0  49 63
IA 201209   16 - 20  0  6  0  41 47
;;;;
run;
proc transpose data=state out=state2;
   by state age month;
   run;
proc transpose data=state2 out=state3 delim=_;
   by state age;
   id _name_ month;
   var col1;
   run;
Super Contributor
Posts: 1,636

Re: Transpose a single column, by multiple columns, across multiple columns.

Posted in reply to data_null__

Hi DN,

I like your new method.  I tried the proc summary method you and Mike used, but I don't know how to add month to the variable names. Do you know how?

proc summary nway data=state missing;

class state age;

output

out = work.wide(drop=_type_ _freq_)

idgroup(out[3](a--e)=)

;

run;

when I ran your code to create dataset, I got errors:

data state;
   input State:$2. Month :B8601DA. AGE&$10.  A  B  C  D  E;
   format month yymmn.;
   cards;
IA 201207   0 - 5  0  39 0  119   158
IA 201208   0 - 5  0  47 0  122   169
IA 201209   0 - 5  0  33 0  139   172
IA 201207   16 - 20  0  10 0  63 73
IA 201208   16 - 20  0  14 0  49 63
IA 201209   16 - 20  0  6  0  41 47
;;;;
run;

my log file:

273  data state;
274     input State:$2. Month :  AGE &$10.  A  B  C  D  E;
275     cards;

NOTE: The data set WORK.STATE has 6 observations and 8 variables.
NOTE: DATA statement used (Total process time):
      real time           0.00 seconds
      cpu time            0.00 seconds


282  ;;;;
283  run;

284  data state;
285     input State:$2. Month :B8601DA. AGE&$10.  A  B  C  D  E;
286     format month yymmn.;
287     cards;

NOTE: Invalid data for Month in line 288 4-9.
RULE:      ----+----1----+----2----+----3----+----4----+----5----+----6----+----7----+----8----+--
288        IA 201207   0 - 5  0  39 0  119   158
State=IA Month=. AGE=0 - 5 A=0 B=39 C=0 D=119 E=158 _ERROR_=1 _N_=1
NOTE: Invalid data for Month in line 289 4-9.
289        IA 201208   0 - 5  0  47 0  122   169
State=IA Month=. AGE=0 - 5 A=0 B=47 C=0 D=122 E=169 _ERROR_=1 _N_=2
NOTE: Invalid data for Month in line 290 4-9.
290        IA 201209   0 - 5  0  33 0  139   172
State=IA Month=. AGE=0 - 5 A=0 B=33 C=0 D=139 E=172 _ERROR_=1 _N_=3
NOTE: Invalid data for Month in line 291 4-9.
291        IA 201207   16 - 20  0  10 0  63 73
State=IA Month=. AGE=16 - 20 A=0 B=10 C=0 D=63 E=73 _ERROR_=1 _N_=4
NOTE: Invalid data for Month in line 292 4-9.
292        IA 201208   16 - 20  0  14 0  49 63
State=IA Month=. AGE=16 - 20 A=0 B=14 C=0 D=49 E=63 _ERROR_=1 _N_=5
NOTE: Invalid data for Month in line 293 4-9.
RULE:      ----+----1----+----2----+----3----+----4----+----5----+----6----+----7----+----8----+--
293        IA 201209   16 - 20  0  6  0  41 47
State=IA Month=. AGE=16 - 20 A=0 B=6 C=0 D=41 E=47 _ERROR_=1 _N_=6
NOTE: The data set WORK.STATE has 6 observations and 8 variables.
NOTE: DATA statement used (Total process time):
      real time           0.03 seconds
      cpu time            0.03 seconds


294  ;;;;
295  run;

Respected Advisor
Posts: 3,799

Re: Transpose a single column, by multiple columns, across multiple columns.

You can't do an IDed transpose with PROC SUMMARY.

Super Contributor
Posts: 1,636

Re: Transpose a single column, by multiple columns, across multiple columns.

Posted in reply to data_null__

Thank you DN! Why I can't use the :B8601DA informat (see my previous post)? 

Respected Advisor
Posts: 3,799

Re: Transpose a single column, by multiple columns, across multiple columns.

I don't know why the informat does not work.  I am using SAS 9.3 what about you?

Super Contributor
Posts: 1,636

Re: Transpose a single column, by multiple columns, across multiple columns.

Posted in reply to data_null__

I use 9.3 too. Thank you!

Respected Advisor
Posts: 3,799

Re: Transpose a single column, by multiple columns, across multiple columns.

Interesting.  Test the informat with a complete date.  Seems like it should work for you.

Super Contributor
Posts: 1,636

Re: Transpose a single column, by multiple columns, across multiple columns.

Posted in reply to data_null__

It worked fine after I added '01' to  'month':

data state;

   input State:$2. Month :B8601DA. AGE&$10.  A  B  C  D  E;

   format month yymmn.;

   cards;

IA 20120701   0 - 5  0  39 0  119   158

IA 20120801   0 - 5  0  47 0  122   169

IA 20120901   0 - 5  0  33 0  139   172

IA 20120701   16 - 20  0  10 0  63 73

IA 20120801   16 - 20  0  14 0  49 63

IA 20120901   16 - 20  0  6  0  41 47

;;;;

run;

proc transpose data=state out=state2;

   by state age month;

   run;

proc transpose data=state2 out=state3 delim=_;

   by state age;

   id _name_ month;

   var col1;

   run;

PROC Star
Posts: 7,474

Re: Transpose a single column, by multiple columns, across multiple columns.

Linlin:  Your log was confusing.  First it showed that the initial datastep worked, and then it appeared like a second run of the same datastep didn't work.  I could, of course, just have been confused.

Super Contributor
Posts: 1,636

Re: Transpose a single column, by multiple columns, across multiple columns.

Hi Art,

Below is from my log file after I restarted  sas:

NOTE: SAS initialization used:
      real time           6.09 seconds
      cpu time            1.59 seconds

1    data state;
2       input State:$2. Month :B8601DA. AGE&$10.  A  B  C  D  E;
3       format month yymmn.;
4       cards;

NOTE: Invalid data for Month in line 5 4-9.
RULE:      ----+----1----+----2----+----3----+----4----+----5----+----6----+----7----+----8----+--
5          IA 201207   0 - 5  0  39 0  119   158
State=IA Month=. AGE=0 - 5 A=0 B=39 C=0 D=119 E=158 _ERROR_=1 _N_=1
NOTE: Invalid data for Month in line 6 4-9.
6          IA 201208   0 - 5  0  47 0  122   169
State=IA Month=. AGE=0 - 5 A=0 B=47 C=0 D=122 E=169 _ERROR_=1 _N_=2
NOTE: Invalid data for Month in line 7 4-9.
7          IA 201209   0 - 5  0  33 0  139   172
State=IA Month=. AGE=0 - 5 A=0 B=33 C=0 D=139 E=172 _ERROR_=1 _N_=3
NOTE: Invalid data for Month in line 8 4-9.
8          IA 201207   16 - 20  0  10 0  63 73
State=IA Month=. AGE=16 - 20 A=0 B=10 C=0 D=63 E=73 _ERROR_=1 _N_=4
NOTE: Invalid data for Month in line 9 4-9.
9          IA 201208   16 - 20  0  14 0  49 63
State=IA Month=. AGE=16 - 20 A=0 B=14 C=0 D=49 E=63 _ERROR_=1 _N_=5
NOTE: Invalid data for Month in line 10 4-9.
10         IA 201209   16 - 20  0  6  0  41 47
State=IA Month=. AGE=16 - 20 A=0 B=6 C=0 D=41 E=47 _ERROR_=1 _N_=6
NOTE: The data set WORK.STATE has 6 observations and 8 variables.
NOTE: DATA statement used (Total process time):
      real time           0.09 seconds
      cpu time            0.03 seconds


11   ;;;;
12   run;
13   proc transpose data=state out=state2;
14      by state age month;
15      run;

NOTE: There were 6 observations read from the data set WORK.STATE.
NOTE: The data set WORK.STATE2 has 10 observations and 7 variables.
NOTE: PROCEDURE TRANSPOSE used (Total process time):
      real time           0.03 seconds
      cpu time            0.01 seconds


16   proc transpose data=state2 out=state3 delim=_;
17      by state age;
18      id _name_ month;
19      var col1;
20      run;

WARNING:  An ID variable is missing in every observation, all observations omitted.
NOTE: There were 10 observations read from the data set WORK.STATE2.
NOTE: The data set WORK.STATE3 has 2 observations and 3 variables.
NOTE: PROCEDURE TRANSPOSE used (Total process time):
      real time           0.01 seconds
      cpu time            0.01 seconds

🔒 This topic is solved and locked.

Need further help from the community? Please ask a new question.

Discussion stats
  • 18 replies
  • 806 views
  • 3 likes
  • 7 in conversation