BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Anon123
Fluorite | Level 6

I have two large datasets. Dataset 1 with returns. Dataset 2 with 0 or 1. I need to have an output where I calculate the average of the returns by row.

 

A mini example could be:

 

N=2

 

DATASET 1                                             DATASET 2                                               Average (Output)

        Var1       Var2       Var3                                   Var1       Var2       Var3

1       0.5         0.2          0.1                          1          1             0            1                       (0.5+0.1)/N

2       0.1         0.2          0.0                          2          1             1            0                       (0.1+0.2)/N

3       0.2         0.5          0.3                          3          0             1            1                       (0.5+0.3)/N

 

How do I do this?

1 ACCEPTED SOLUTION

Accepted Solutions
Reeza
Super User

An ID variable tells SAS which rows to match from A with rows from B.

 

1. Merge datasets by ID variable - you can't have the same variable names in each so you'll need to rename them.

2. Declare Arrays for each

3. Mulitple/Average

 

It would have been better if you'd explicitly written out your formula as

0.5*1 + 0.2*0 + 0.1*1/N

 

What is N by the way, 2 or 3?

 

data want;
merge a b(rename=var1-var3=ind1-ind3);
by ID;

array table1(*) var1-var3;
array table2(*) ind1-ind3;
total=0;
n=0; do i=1 to dim(table1); total=table1(i)*table2(*) + total; n=n+1; end; average=total/n; run;

View solution in original post

7 REPLIES 7
LinusH
Tourmaline | Level 20
First join the two data sets. I can't see an id variable but I guess you left that our to male the example as simple as possible...
The You could use a two dimensional array to make your program flexible (if in the future more variables will be used). Of use hard coded variable name to the calculation sum(var1_ds1*var1_ds2, var2 etc) / N
Data never sleeps
Anon123
Fluorite | Level 6
I'm not sure what an id variable is.

Can you make an example of a dynamic code?
Reeza
Super User

An ID variable tells SAS which rows to match from A with rows from B.

 

1. Merge datasets by ID variable - you can't have the same variable names in each so you'll need to rename them.

2. Declare Arrays for each

3. Mulitple/Average

 

It would have been better if you'd explicitly written out your formula as

0.5*1 + 0.2*0 + 0.1*1/N

 

What is N by the way, 2 or 3?

 

data want;
merge a b(rename=var1-var3=ind1-ind3);
by ID;

array table1(*) var1-var3;
array table2(*) ind1-ind3;
total=0;
n=0; do i=1 to dim(table1); total=table1(i)*table2(*) + total; n=n+1; end; average=total/n; run;
KachiM
Rhodochrosite | Level 12
What is your N for Large Datasets?
rogerjdeangelis
Barite | Level 11

 OT

 

I provide a SAS/IML, R and WPS solution for moderate size datasets.

For huge datasets the array method may be more appropriate.

 

see

https://listserv.uga.edu/cgi-bin/wa?A2=SAS-L;5d05f7f.1607d

Reeza
Super User

@rogerjdeangelis You have to log in for that link, is there a link that doesn't require a log in?

 

EDIT:

I'm not very knowledgable about performance, but I think IML would be faster than arrays.

 

 

rayIII
SAS Employee

Here is an IML solution (it uses elementwise multiplication).

 

 


data dataset1;
input var1 var2 var3; 
datalines;
.5 .2 .1
.1 .2 0
.2 .5 .3
;



data dataset2;
input var1 var2 var3; 
datalines;
1 0 1
1 1 0 
0 1 1 
;



proc iml;


	use dataset1; 
	read all into ds1; 

	n = nrow(ds1);

	use dataset2; 
	read all into ds2; 

	elementProd = ds1#ds2;

	want = elementProd[,+] / n ; 

	print want; 


quit;

 

 

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 7 replies
  • 4222 views
  • 2 likes
  • 6 in conversation