Calcite | Level 5

## merge panel data with cross-section

I have two data sets which look like this:

id1   id2   Z

1   1   z11

1   2   z12

1   3   z13

2   1   z21

2   2   z22

2   3   z23

3   1   z31

3   2   z32

3   3   z33

and

id   X   Y

1   x1   y1

2   x2   y2

3   x3   y3

I am new to SAS but my programming skills are very good when it comes to Stata and R.

Can someone please help me write a code in SAS that will create a new data set as such:

id   W

1   w1

2   w2

3   w3

where w1 = (z11*x1+z12*x2+z13*x3) / (z11*y1+z12*y2+z13*y3)

Forget about the fact that you can factor out the z's. The weighting doesn't exactly work like that, I've just simplified it.

Thank you!

ps. I have asked the exact same question on stackexchange before realizing there is a SAS community here. I hope that's okay.

9 REPLIES 9
Super User

## Re: merge panel data with cross-section

AFAIK there is no issue with cross posting. But if you do receive an answer in one it would be polite to mark the question answered in both and put the answer in both locations.

Here's one way using temporary arrays. I'm not sure it's correct given your SO code which makes me wonder if the process is more involved, but perhaps it will help you get started. Although Table2 has an ID variable it doesn't appear to be an ID, more of a row number, is that correct?

data t1;

input id1 id2 Z ;

cards;

1   1   8

1   2   9

1   3   10

2   1   11

2   2   12

2   3   13

3   1   14

3   2   15

3   3   16

;

run;

data t2;

input id x  y ;

cards;

1   1   4

2   2   5

3   3   5

;

run;

proc transpose data=t1 out=t1_f prefix=z;

by id1;

var z;

id id2;

run;

data want;

*load factors into temporary arrays to multiply;

if _n_=1 then do i=1 to 3;

set t2;

array _x(3) _temporary_;

array _y(3) _temporary_;

_x(i)=x;

_y(i)=y;

end;

set t1_f;

array z(3) z1-z3;

*initialize to 0;

num=0;

den=0;

*Calculate numerator/denominator;

do i=1 to 3;

num=num+z(i)*_x(i);

den=den+z(i)*_y(i);

end;

*Calculate final weight;

w=num/den;

run;

Calcite | Level 5

## Re: merge panel data with cross-section

Thank you very much Reeza for such a quick response.

Would this wirj if the data set is very large? I have 20,000 panels with about 20,000 obs in each panel.

Calcite | Level 5

## Re: merge panel data with cross-section

By the way, this is the code I already have but I don't think it's optimized for a large data set (20,000 x 20,000) since it creates a temporary table for each panel. I did that because I didn't want to lose the work already done in case the computer shuts down or something I can start where I left off basically.

Calcite | Level 5

## Re: merge panel data with cross-section

Here is the code:

`%macro ratio_m(id);proc sql;select id1, sum(x*z) as wx  from table2 as P right join table1 as N  on input(N.id2,8.)=P.id  group by id1;create table temp2asselect orig_id, sum(y*z) as wy  from table2 as P right join table1 as N  on input(N.id2,8.)=P.id  group by id1;%if %sysfunc(exist(final)) %then %do;  drop table final;%end;create table finalasselect &i as indexRow, one.*, two.*, (wx/wy) as ratio  from temp1 as one right join temp2 as two  on one.id1=two.id1;quit;%mend;%macro main;  %if %sysfunc(exist(Index_dsn))=1 %then %do;  %if %sysfunc(exist(final_overall))=1 %then %do;  proc sql noprint;  select max(indexRow) into :indexRow from final_overall  ;%let indexRow=%sysfunc(sum(&indexRow,1)); %put &indexRow;  select max(indexRow) into :indexFinal from Index_dsn  ;%let indexFinal=&indexFinal; %put &indexFinal;  ;quit;  %put Previous calculation ends on row &indexRow ;  %end;  %else %if %sysfunc(exist(final_overall))=0 %then %do;  %put "First Iteration: Dataset final_overall does not exist";  proc sql noprint;  select max(indexRow) into :indexFinal from Index_dsn;  ;  create table final_overall  (indexRow num  ,id1 char(10)  ,ratio num);  ;quit;  %let indexFinal=&indexFinal; %let indexRow=1;  %end;  %end;  %else %if %sysfunc(exist(Index_dsn))=0 %then %do;  %put "Datasets Index_dsn and final_overall dont exist";  proc sql noprint;  create table Index_dsn AS   select distinct id1 from table1;  ;quit;  data Index_dsn; set Index_dsn; indexRow+1; run;  proc sql noprint;  select max(indexRow) into :indexFinal from Index_dsn;  ;  %let indexFinal=&indexFinal; %let indexRow=1;  create table final_overall  (indexRow num  ,id1 char(10)  ,ratio num);  ;quit;  %end;  %do i=&indexRow %to &indexFinal;  proc sql;  select id1 into :id from Index_dsn where indexRow=&i  ; %let id=&id;  quit;  %put &id;  %ratio_m (&id);  proc append base=.final_overall data=final force; run;  %end;%mend;options nomlogic;%main`

Super User

## Re: merge panel data with cross-section

Perhaps post what you have originally and desire instead, that looks overly complicated.

For weight calculations you may be better off using a combination of proc means with the weight statement or if you really need matrix math then going into IML.

Super User

## Re: merge panel data with cross-section

Yeah. Agree with Reeza.

Since you have good programming skill about R , you should take a look at SAS/IML code, because They are all Matrix language .

And Better Post it at IML forum . I saw you are doing some INNER product of matrix , that is matrix operation . Not good for data step, but good for IML and R. The result is in table WANT .

### Code: Program

`data t1;input id1 id2 Z ;cards;1 1 81 2 91 3 102 1 112 2 122 3 133 1 143 2 153 3 16;run;  data t2;input id x  y ;cards;1 1 42 2 53 3 5;run;proc iml;use t1;read all var _all_ into z;close;use t2;read all var {x y} into xy;close;zz=full(z[,{3 1 2}]);inner_product=zz*xy;id=1:nrow(inner_product);want=id`||(inner_product[,1]/inner_product[,2]);create want from want[c={id w}];append from want;close;quit;`

Xia Keshan

Calcite | Level 5

## Re: merge panel data with cross-section

Thank you all. I don't think the IML solutions work with the size of my data set which is not very wide(4-5 variables) but is very long (20,000 panels and 20,000 obs in each roughly, not balanced).

Super User

## Re: merge panel data with cross-section

Did you run my IML code. This code is designed for

"(4-5 variables) but is very long (20,000 panels and 20,000 obs "

Super User

## Re: merge panel data with cross-section

One thing that's nice about SAS is that the size of datasets, minus mismatched matrices, don't matter too match.  The same code that works on 100 obs works on 1000000,

Discussion stats
• 9 replies
• 2259 views
• 0 likes
• 3 in conversation