DATA Step, Macro, Functions and more

Proc Transpose Alternative

Accepted Solution Solved
Reply
Frequent Contributor
Posts: 86
Accepted Solution

Proc Transpose Alternative

Hi Everyone

While Transposing any vertical format dataset with proc transpose, I believe we cannot give multiple ID variables. But my problem is similar. With the By group I would like to have data with multiple ID variable to check my data. What I understood, I need to write multiple transpose procedures and then merge back. However, I am sure it might not be very efficient way to work on such problem. Please shed light on this if you know anything alternative that might be useful.

Thanks


Accepted Solutions
Solution
‎12-05-2012 02:14 PM
Respected Advisor
Posts: 3,799

Re: Proc Transpose Alternative

Given what you have shown with have and need I would do this.  It address the naming of the new variables along with ordering.  Your transpose variables are all the same type so that is not an issue.  This will work well until your data gets large when it will be slow.

data have;
   input ID $  Month:yymmn. Var1 $  Var2 $;
   format month yymmn.;
  
cards;
1     200101     SD      DE
1     200102     EF       GH
1     200103     HK       LM
2     200101     GH       LM
2     200105     JH       KJ
;;;;
   run;
proc summary data=have nway;
  
output out=month(drop=_Smiley Happy min(month)=min max(month)=max;
  
run;
proc transpose data=have(obs=0) out=vars;
   var var:;
   run;
data frame(keep=_name_ month);
   set vars;
   if _n_ eq 1 then set month;
   do m = 0 to intck('MONTH',min,max);
      month = intnx('MONTH',min,m);
      output;
     
end;
  
format month yymmn.;
  
run;

proc transpose data=have out=tall;
   by id month;
   var var:;
   run;
data tallV / view=tallV;
   set frame tall;
   run;
proc transpose data=tallV out=wide(where=(not missing(ID))) delim=_;
   by id;
   var col1;
   id _name_ month;
   run;

View solution in original post


All Replies
Super Contributor
Posts: 1,636

Re: Proc Transpose Alternative

Hi,

You could either use array or proc summary.

Super User
Super User
Posts: 7,079

Re: Proc Transpose Alternative

You can use multiple variables in the ID statement. Here is an simple example.

data tall;

  do year=2011 to 2012; do qtr=1 to 4;

    input count @@ ; output;

end; end;

cards;

1 2 3 4 5 6 7 8

run;

proc transpose data=tall out=wide prefix=Y delim=Q;

id year qtr;

var count;

run;

proc print data=wide; run;

Perhaps you want to transpose multiple VAR statements into columns rather than rows?

There is trick using the IDGROUP feature of PROC SUMMARY that DATA_NULL_ has posted here and on SAS-L.

Super User
Posts: 10,046

Re: Proc Transpose Alternative

One choice is proc means + idgroup , check it at document.

But I will use data step to catch it which is very fast.

Ksharp

Frequent Contributor
Posts: 86

Re: Proc Transpose Alternative

Hi All,

I looked for Proc means + idgroup statement as suggested by all, but I got error: I refered to this document on the web to do the same:

http://www.sascommunity.org/wiki/Transpose_using_IDGROUP_in_PROC_SUMMARY

"The parameter value 139 is not within the required range of 1 and 100" .

Please help on this,

Respected Advisor
Posts: 3,799

Re: Proc Transpose Alternative

139 is not going to work.  The value in the OUT operand of the IDGROUP option must be between 1 and 100.

Can you explain your requirement in more detail?  Also, post some sample data using the have/need model.

Frequent Contributor
Posts: 86

Re: Proc Transpose Alternative

Posted in reply to data_null__

Hi Data_null, So basically, I have a dataset like this:

ID   Month       Var1    Var2

1     200101     SD      DE

1     200102     EF       GH

1     200103     HK       LM

2     200101     GH       LM

2     200105     JH       KJ

Now Here I would like to change the format of the current dataset in the following way:

ID   Var1_200101  Var1_200102   Var1_200103   Var1_200104   Var1_200105  Var2_200101  Var2_200102   Var2_200103   Var2_200104   Var2_200105

1      SD                  EF                  HK                   .                      .                 DE                   GH              LM                  .                        .

2      GH                  .                        .                                          JH                LM                                                                                 KJ

This can be achieved by writing multiple Transpose procedures and then Merge back with ID. 2nd way as mentioned above by Proc Means. It worked well but in cases when IDGROUP option is not within 1 - 100 , this will not work which is my case unfortunately.

Thanks

Super User
Posts: 19,877

Re: Proc Transpose Alternative

I'm assuming your column identifiers will follow a date pattern, ie 201001-201012 201101-201112 and so forth.

Do you know the time period in advance or does that change? Is it set for each run?

Frequent Contributor
Posts: 86

Re: Proc Transpose Alternative

Hi Reeza,

No, the time period is not set. It can be pre-populated dataset

Super Contributor
Posts: 1,636

Re: Proc Transpose Alternative

I would first split the dataset into two, then use DN's proc summary method, then merge the final two datasets.

Solution
‎12-05-2012 02:14 PM
Respected Advisor
Posts: 3,799

Re: Proc Transpose Alternative

Given what you have shown with have and need I would do this.  It address the naming of the new variables along with ordering.  Your transpose variables are all the same type so that is not an issue.  This will work well until your data gets large when it will be slow.

data have;
   input ID $  Month:yymmn. Var1 $  Var2 $;
   format month yymmn.;
  
cards;
1     200101     SD      DE
1     200102     EF       GH
1     200103     HK       LM
2     200101     GH       LM
2     200105     JH       KJ
;;;;
   run;
proc summary data=have nway;
  
output out=month(drop=_Smiley Happy min(month)=min max(month)=max;
  
run;
proc transpose data=have(obs=0) out=vars;
   var var:;
   run;
data frame(keep=_name_ month);
   set vars;
   if _n_ eq 1 then set month;
   do m = 0 to intck('MONTH',min,max);
      month = intnx('MONTH',min,m);
      output;
     
end;
  
format month yymmn.;
  
run;

proc transpose data=have out=tall;
   by id month;
   var var:;
   run;
data tallV / view=tallV;
   set frame tall;
   run;
proc transpose data=tallV out=wide(where=(not missing(ID))) delim=_;
   by id;
   var col1;
   id _name_ month;
   run;
PROC Star
Posts: 7,492

Re: Proc Transpose Alternative

Posted in reply to data_null__

DN: As always I'm impressed with your solution.  The only thing I would suggest adding is a drop option to omit the _name_ variable in the final data step.  i.e.:

proc transpose data=tallV out=wide(drop= _: where=(not missing(ID))) delim=_;

   by id;

   var col1;

   id _name_ month;

   run;

Super User
Posts: 10,046

Re: Proc Transpose Alternative

I would do this.

data have;
   input ID $  Month:yymmn. Var1 $  Var2 $;
   format month yymmn.;
   cards;
1     200101     SD      DE
1     200102     EF       GH
1     200103     HK       LM
2     200101     GH       LM
2     200105     JH       KJ
;;;;
   run;
proc sql noprint;
 select distinct catt('have(where=(month=',month,') rename=(var1=var1_',put(month,yymmn. -l),' var2=var2_',put(month,yymmn. -l),'))') into : list separated by ' '
  from have;
quit;
data want;
 merge &list ;
 by id;
run;

Ksharp

Contributor
Posts: 45

Re: Proc Transpose Alternative

A hash-transpose. Not so elegant; way too many hard-coding of variables! Also, dates were handled as character to simplify matters. The hash variables could have been handled with macro variables, through a prior Proc SQL.

data _null_;

      if(1=2)then set have;

      length var1_200101-var1_200105 var2_200101-var2_200105 $2 ;

      declare hash varmon (ordered:"a");

      varmon.defineKey("id");

      varmon.defineData ("id","Var1_200101","Var1_200102","Var1_200103","Var1_200104","Var1_200105",

                       "Var2_200101","Var2_200102","Var2_200103","Var2_200104","Var2_200105");

      varmon.defineDone();

      call missing (of _all_);

      array w{5,2} $ _temporary_;

      do until(done);

            set have end=done;

            rcf=varmon.find(key:id);

            w(1,1)=Var1_200101;w(2,1)=Var1_200102;w(3,1)=Var1_200103;w(4,1)=Var1_200104;w(5,1)=Var1_200105;

            w(1,2)=Var2_200101;w(2,2)=Var2_200102;w(3,2)=Var2_200103;w(4,2)=Var2_200104;w(5,2)=Var2_200105;

            mx=whichc(month,"200101","200102","200103","200104","200105");

            w(mx,1)=var1;

            w(mx,2)=var2;

            rcr=varmon.replace(key:id,data:id,data:w(1,1),data:w(2,1),data:w(3,1),data:w(4,1),data:w(5,1),

                                                 data:w(1,2),data:w(2,2),data:w(3,2),data:w(4,2),data:w(5,2));

            call missing(of _all_);

      end;

      varmon.output(dataset:"want");

      stop;

run;

Frequent Contributor
Frequent Contributor
Posts: 78

Re: Proc Transpose Alternative

Perhaps you can cobble something out of this....

It's a Bird, It's a Plane, It's SQL Transpose!
http://www2.sas.com/proceedings/forum2008/089-2008.pdf
Ever wish you could easily transpose and summarize monthly data using just PROC SQL? Regardless of whether the data is stored in a SAS dataset or other relational databases? Even if scores of variables for dozen of months are involved? This paper tackles the task of data denormalization with a simple macro that uses PROC SQL to take a "vertical" table with multiple rows per subject and transform it into a "horizontal" table containing one row per subject with aggregated monthly values.

🔒 This topic is solved and locked.

Need further help from the community? Please ask a new question.

Discussion stats
  • 16 replies
  • 1106 views
  • 6 likes
  • 10 in conversation