Solved: Re: Proc Transpose Alternative

bnarang · Posted 12-04-2012 01:41 PM

Hi Everyone

While Transposing any vertical format dataset with proc transpose, I believe we cannot give multiple ID variables. But my problem is similar. With the By group I would like to have data with multiple ID variable to check my data. What I understood, I need to write multiple transpose procedures and then merge back. However, I am sure it might not be very efficient way to work on such problem. Please shed light on this if you know anything alternative that might be useful.

Thanks

data_null__ · Posted 12-05-2012 02:14 PM

Given what you have shown with have and need I would do this. It address the naming of the new variables along with ordering. Your transpose variables are all the same type so that is not an issue. This will work well until your data gets large when it will be slow.

data have;
   input ID $  Month:yymmn. Var1 $  Var2 $;
   format month yymmn.; 
   cards; 
1     200101     SD      DE
1     200102     EF       GH
1     200103     HK       LM
2     200101     GH       LM
2     200105     JH       KJ
;;;;
   run; 
proc summary data=have nway; 
   output out=month(drop=_:) min(month)=min max(month)=max; 
   run; 
proc transpose data=have(obs=0) out=vars;
   var var:;
   run; 
data frame(keep=_name_ month);
   set vars;
   if _n_ eq 1 then set month;
   do m = 0 to intck('MONTH',min,max);
      month = intnx('MONTH',min,m);
      output; 
      end; 
   format month yymmn.; 
   run; 

proc transpose data=have out=tall;
   by id month;
   var var:;
   run; 
data tallV / view=tallV;
   set frame tall;
   run; 
proc transpose data=tallV out=wide(where=(not missing(ID))) delim=_;
   by id;
   var col1;
   id _name_ month;
   run;

View solution in original post

Linlin · Posted 12-04-2012 01:48 PM

Hi,

You could either use array or proc summary.

Tom · Posted 12-04-2012 01:55 PM

You can use multiple variables in the ID statement. Here is an simple example.

data tall;

do year=2011 to 2012; do qtr=1 to 4;

input count @@ ; output;

end; end;

cards;

1 2 3 4 5 6 7 8

run;

proc transpose data=tall out=wide prefix=Y delim=Q;

id year qtr;

var count;

run;

proc print data=wide; run;

Perhaps you want to transpose multiple VAR statements into columns rather than rows?

There is trick using the IDGROUP feature of PROC SUMMARY that DATA_NULL_ has posted here and on SAS-L.

Ksharp · Posted 12-04-2012 10:30 PM

One choice is proc means + idgroup , check it at document.

But I will use data step to catch it which is very fast.

Ksharp

bnarang · Posted 12-05-2012 02:58 AM

Hi All,

I looked for Proc means + idgroup statement as suggested by all, but I got error: I refered to this document on the web to do the same:

http://www.sascommunity.org/wiki/Transpose_using_IDGROUP_in_PROC_SUMMARY

"The parameter value 139 is not within the required range of 1 and 100" .

Please help on this,

data_null__ · Posted 12-05-2012 05:49 AM

139 is not going to work. The value in the OUT operand of the IDGROUP option must be between 1 and 100.

Can you explain your requirement in more detail? Also, post some sample data using the have/need model.

bnarang · Posted 12-05-2012 12:34 PM

Hi Data_null, So basically, I have a dataset like this:

ID Month Var1 Var2

1 200101 SD DE

1 200102 EF GH

1 200103 HK LM

2 200101 GH LM

2 200105 JH KJ

Now Here I would like to change the format of the current dataset in the following way:

ID Var1_200101 Var1_200102 Var1_200103 Var1_200104 Var1_200105 Var2_200101 Var2_200102 Var2_200103 Var2_200104 Var2_200105

1 SD EF HK . . DE GH LM . .

2 GH . . JH LM KJ

This can be achieved by writing multiple Transpose procedures and then Merge back with ID. 2nd way as mentioned above by Proc Means. It worked well but in cases when IDGROUP option is not within 1 - 100 , this will not work which is my case unfortunately.

Thanks

Reeza · Posted 12-05-2012 12:56 PM

I'm assuming your column identifiers will follow a date pattern, ie 201001-201012 201101-201112 and so forth.

Do you know the time period in advance or does that change? Is it set for each run?

bnarang · Posted 12-06-2012 02:13 AM

Hi Reeza,

No, the time period is not set. It can be pre-populated dataset

Linlin · Posted 12-05-2012 01:35 PM

I would first split the dataset into two, then use DN's proc summary method, then merge the final two datasets.

data_null__ · Posted 12-05-2012 02:14 PM

Given what you have shown with have and need I would do this. It address the naming of the new variables along with ordering. Your transpose variables are all the same type so that is not an issue. This will work well until your data gets large when it will be slow.

data have;
   input ID $  Month:yymmn. Var1 $  Var2 $;
   format month yymmn.; 
   cards; 
1     200101     SD      DE
1     200102     EF       GH
1     200103     HK       LM
2     200101     GH       LM
2     200105     JH       KJ
;;;;
   run; 
proc summary data=have nway; 
   output out=month(drop=_:) min(month)=min max(month)=max; 
   run; 
proc transpose data=have(obs=0) out=vars;
   var var:;
   run; 
data frame(keep=_name_ month);
   set vars;
   if _n_ eq 1 then set month;
   do m = 0 to intck('MONTH',min,max);
      month = intnx('MONTH',min,m);
      output; 
      end; 
   format month yymmn.; 
   run; 

proc transpose data=have out=tall;
   by id month;
   var var:;
   run; 
data tallV / view=tallV;
   set frame tall;
   run; 
proc transpose data=tallV out=wide(where=(not missing(ID))) delim=_;
   by id;
   var col1;
   id _name_ month;
   run;

art297 · Posted 12-05-2012 06:15 PM

DN: As always I'm impressed with your solution. The only thing I would suggest adding is a drop option to omit the _name_ variable in the final data step. i.e.:

proc transpose data=tallV out=wide(drop= _: where=(not missing(ID))) delim=_;

by id;

var col1;

id _name_ month;

run;

Ksharp · Posted 12-05-2012 10:27 PM

I would do this.

data have;
   input ID $  Month:yymmn. Var1 $  Var2 $;
   format month yymmn.;
   cards;
1     200101     SD      DE
1     200102     EF       GH
1     200103     HK       LM
2     200101     GH       LM
2     200105     JH       KJ
;;;;
   run;
proc sql noprint;
 select distinct catt('have(where=(month=',month,') rename=(var1=var1_',put(month,yymmn. -l),' var2=var2_',put(month,yymmn. -l),'))') into : list separated by ' '
  from have;
quit;
data want;
 merge &list ;
 by id;
run;

Ksharp

joehinson · Posted 12-06-2012 09:15 AM

A hash-transpose. Not so elegant; way too many hard-coding of variables! Also, dates were handled as character to simplify matters. The hash variables could have been handled with macro variables, through a prior Proc SQL.

data _null_;

if(1=2)then set have;

length var1_200101-var1_200105 var2_200101-var2_200105 $2 ;

declare hash varmon (ordered:"a");

varmon.defineKey("id");

varmon.defineData ("id","Var1_200101","Var1_200102","Var1_200103","Var1_200104","Var1_200105",

"Var2_200101","Var2_200102","Var2_200103","Var2_200104","Var2_200105");

varmon.defineDone();

call missing (of _all_);

array w{5,2} $ _temporary_;

do until(done);

set have end=done;

rcf=varmon.find(key:id);

w(1,1)=Var1_200101;w(2,1)=Var1_200102;w(3,1)=Var1_200103;w(4,1)=Var1_200104;w(5,1)=Var1_200105;

w(1,2)=Var2_200101;w(2,2)=Var2_200102;w(3,2)=Var2_200103;w(4,2)=Var2_200104;w(5,2)=Var2_200105;

mx=whichc(month,"200101","200102","200103","200104","200105");

w(mx,1)=var1;

w(mx,2)=var2;

rcr=varmon.replace(key:id,data:id,data:w(1,1),data:w(2,1),data:w(3,1),data:w(4,1),data:w(5,1),

data:w(1,2),data:w(2,2),data:w(3,2),data:w(4,2),data:w(5,2));

call missing(of _all_);

end;

varmon.output(dataset:"want");

stop;

run;

tc · Posted 12-06-2012 10:30 AM

Perhaps you can cobble something out of this....

It's a Bird, It's a Plane, It's SQL Transpose!
http://www2.sas.com/proceedings/forum2008/089-2008.pdf
Ever wish you could easily transpose and summarize monthly data using just PROC SQL? Regardless of whether the data is stored in a SAS dataset or other relational databases? Even if scores of variables for dozen of months are involved? This paper tackles the task of data denormalization with a simple macro that uses PROC SQL to take a "vertical" table with multiple rows per subject and transform it into a "horizontal" table containing one row per subject with aggregated monthly values.

SAS Innovate 2025: Save the Date

SAS Training: Just a Click Away