Help using Base SAS procedures

multiplying two datasets

Reply
Frequent Contributor
Frequent Contributor
Posts: 79

multiplying two datasets

Dear all,

I have a dataset which is a covariance matrix

looks like the below

covout =

type   _name_  a        b    c

COV    a         0.103  etc

COV    b

COV    c

I have a weights file which is

date id weights

05/31/2013  a  0.3

05/31/2013 b 0.4

05/31/2013 c 0.3

I want to compute the following measure =

sqrt(weights* cov* weights')

not sure how to do this in SAs

how do I multiply the cov with the weights while matching the ids

Thanks so much!

PROC Star
Posts: 7,471

Re: multiplying two datasets

where do weights' come from?  I don't see them in your data.

Super Contributor
Posts: 307

Re: multiplying two datasets

In your COV dataset, you have a _NAME_ variable with values a, b, c . . .  You also have columns named a, b, c . . .

In your WEIGHTS file, you have ID with values a, b, c . . .

Which variable in COV will be used to match to WEIGHTS?

Super Contributor
Posts: 334

Re: multiplying two datasets

I dont think there is an easy way to do this if you truly want to multiple the two matrices in your example using base SAS (if you have IML then it becomes easier).

I would check this article out to see if it helps you.

http://www.lexjansen.com/pharmasug/2010/cc/cc15.pdf

EJ

PROC Star
Posts: 7,471

Re: multiplying two datasets

Are you looking for something like the following (I used the sashelp.class covariance matrix as an example):

data covout;

  input type $  _name_ $ height weight age;

  cards;

COV    height    26.287  102.493   6.2099

COV    weight   102.493  518.652  25.1857

COV    age        6.2099  25.1857  2.2281

;

data weights;

  informat date mmddyy10.;

  input date id $ weights;

  cards;

05/31/2013 height  0.3

05/31/2013 weight  0.4

05/31/2013 age     0.3

;

proc transpose data=weights (drop=date) out=weights_wide suffix=_weight;

  id id;

run;

proc transpose data=weights (drop=date) out=cov_wide suffix=_weighted_cov;

  id id;

run;

data want;

  if _n_ eq 1 then do;

    set weights_wide;

    set cov_wide;

  end;

  set covout;

  array cov (*) height--age;

  array weights(*) height_weight--age_weight;

  array weighted_cov(*) height_weighted_cov--age_weighted_cov;

  do i=1 to dim(cov);

    weighted_cov(i)=sqrt(weights(_n_)*cov(i)*weights(i));

  end;

run;

Frequent Contributor
Frequent Contributor
Posts: 79

Re: multiplying two datasets

I think at the end it should be a number

weights is a dataset as well

Sorry also I forgot to specify that the _name_ and id are the same and will be used to match the cov and the weights

Frequent Contributor
Frequent Contributor
Posts: 79

Re: multiplying two datasets

since

weights are 1x3 and cov is 3x3 and weights' is 3x1 =  1x1

Thanks!

Respected Advisor
Posts: 3,799

Re: multiplying two datasets

Multiplying data sets is what PROC SCORE does.  It took me a bit of fiddling to get it figured but I think this is it.

Both data sets COV and weights(I call COEFF) are SCORE data sets.  In step 1 COV is used to score data=COEF SCORE=COV and in step 2 the output from STEP 1 is scored using SCORE=COEFF.

proc corr data=sashelp.class out=cov(where=(_type_='COV')) cov nocorr noprint;
  
var height weight age;
   run;
data cov2;
   set cov;
   _type_='SCORE';
  
run;
proc print;
  
run;


data coeff;
   if 0 then set cov;
   retain _type_ 'SCORE';
  
input _name_ $ height weight age;
  cards;
x  .3 .4 .3
;;;;
   run;
proc print;
  
run;


proc score data=coeff score=cov2 out=step1;
  
var height weight age;
   id _name_;
   run;
proc print;
  
run;
proc score data=step1 score=coeff out=step2;
   var height weight age;
   id _name_;
   run;
proc print;
  
run;
data step3;
   set step2;
   s = sqrt(x);
  
run;
proc print;
  
run;
Frequent Contributor
Frequent Contributor
Posts: 79

Re: multiplying two datasets

Thanks very much but if you have many variables and not just height weight age how would one handle that

Thank you so much

PROC Star
Posts: 7,471

Re: multiplying two datasets

You didn't mention which post gave you what you were looking for.  I'll presume it was data_null's, thus revised my code to match his.  In both cases, the variables height weight age can be replaced with a variable list (i.e., height--age) as shown below:

/*art's method*/

data covout;

  input type $  _name_ $ height weight age;

  cards;

COV    height    26.287  102.493   6.2099

COV    weight   102.493  518.652  25.1857

COV    age        6.2099  25.1857  2.2281

;

data weights;

  informat date mmddyy10.;

  input date id $ weights;

  cards;

05/31/2013 height  0.3

05/31/2013 weight  0.4

05/31/2013 age     0.3

;

proc transpose data=weights (drop=date) out=weights_wide suffix=_weight;

  id id;

run;

proc transpose data=weights (drop=date) out=cov_wide suffix=_weighted_cov;

  id id;

run;

data want (keep=weighted_cov sqrt_weighted_cov);

  if _n_ eq 1 then do;

    set weights_wide;

    set cov_wide;

    weighted_cov=0;

  end;

  set covout end=last;

  array cov (*) height--age;

  array weights(*) height_weight--age_weight;

  retain weighted_cov;

  do i=1 to dim(cov);

    weighted_cov=sum(weighted_cov,weights(_n_)*cov(i)*weights(i));

  end;

  if last then do;

    sqrt_weighted_cov=sqrt(weighted_cov);

    output;

  end;

run;

/*data_null's method*/

proc corr data=sashelp.class out=cov(where=(_type_='COV')) cov nocorr noprint;

   var height weight age;

   run;

data cov2;

   set cov;

   _type_='SCORE';

   run;

data coeff;

   if 0 then set cov;

   retain _type_ 'SCORE';

   input _name_ $ height weight age;

  cards;

x  .3 .4 .3

;;;;

   run;

proc score data=coeff score=cov2 out=step1;

   var height--age;

   id _name_;

   run;

proc score data=step1 score=coeff out=step2;

   var height--age;

   id _name_;

   run;

data step3;

   set step2;

   s = sqrt(x);

   run;

Respected Advisor
Posts: 3,799

Re: multiplying two datasets

Consider this.

proc corr data=sashelp.heart out=cov(where=(_type_ eq 'COV')) cov noprint;
  
run;
data cov;
   set cov;
   _type_ = 'SCORE';
  
run;
proc print;
  
run;
data coeff; *random weights;
  
set cov(keep=_type_ _name_);
   x = ranuni(4);
   run;
proc transpose data=coeff out=coeff;
   var x;
   by _type_;
   run;
proc print;
  
run;
proc score data=coeff score=cov out=step1;
  
id _name_;
   run;
proc score data=step1 score=coeff out=step2;
   id _name_;
   run;
data step3;
   set step2;
   s = sqrt(x);
  
run;
proc print;
  
run;
Ask a Question
Discussion stats
  • 10 replies
  • 657 views
  • 2 likes
  • 5 in conversation