BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Anon123
Fluorite | Level 6

I have two large datasets. Dataset 1 with returns. Dataset 2 with 0 or 1. I need to have an output where I calculate the average of the returns by row.

 

A mini example could be:

 

N=2

 

DATASET 1                                             DATASET 2                                               Average (Output)

        Var1       Var2       Var3                                   Var1       Var2       Var3

1       0.5         0.2          0.1                          1          1             0            1                       (0.5+0.1)/N

2       0.1         0.2          0.0                          2          1             1            0                       (0.1+0.2)/N

3       0.2         0.5          0.3                          3          0             1            1                       (0.5+0.3)/N

 

How do I do this?

1 ACCEPTED SOLUTION

Accepted Solutions
Reeza
Super User

An ID variable tells SAS which rows to match from A with rows from B.

 

1. Merge datasets by ID variable - you can't have the same variable names in each so you'll need to rename them.

2. Declare Arrays for each

3. Mulitple/Average

 

It would have been better if you'd explicitly written out your formula as

0.5*1 + 0.2*0 + 0.1*1/N

 

What is N by the way, 2 or 3?

 

data want;
merge a b(rename=var1-var3=ind1-ind3);
by ID;

array table1(*) var1-var3;
array table2(*) ind1-ind3;
total=0;
n=0; do i=1 to dim(table1); total=table1(i)*table2(*) + total; n=n+1; end; average=total/n; run;

View solution in original post

7 REPLIES 7
LinusH
Tourmaline | Level 20
First join the two data sets. I can't see an id variable but I guess you left that our to male the example as simple as possible...
The You could use a two dimensional array to make your program flexible (if in the future more variables will be used). Of use hard coded variable name to the calculation sum(var1_ds1*var1_ds2, var2 etc) / N
Data never sleeps
Anon123
Fluorite | Level 6
I'm not sure what an id variable is.

Can you make an example of a dynamic code?
Reeza
Super User

An ID variable tells SAS which rows to match from A with rows from B.

 

1. Merge datasets by ID variable - you can't have the same variable names in each so you'll need to rename them.

2. Declare Arrays for each

3. Mulitple/Average

 

It would have been better if you'd explicitly written out your formula as

0.5*1 + 0.2*0 + 0.1*1/N

 

What is N by the way, 2 or 3?

 

data want;
merge a b(rename=var1-var3=ind1-ind3);
by ID;

array table1(*) var1-var3;
array table2(*) ind1-ind3;
total=0;
n=0; do i=1 to dim(table1); total=table1(i)*table2(*) + total; n=n+1; end; average=total/n; run;
KachiM
Rhodochrosite | Level 12
What is your N for Large Datasets?
rogerjdeangelis
Barite | Level 11

 OT

 

I provide a SAS/IML, R and WPS solution for moderate size datasets.

For huge datasets the array method may be more appropriate.

 

see

https://listserv.uga.edu/cgi-bin/wa?A2=SAS-L;5d05f7f.1607d

Reeza
Super User

@rogerjdeangelis You have to log in for that link, is there a link that doesn't require a log in?

 

EDIT:

I'm not very knowledgable about performance, but I think IML would be faster than arrays.

 

 

rayIII
SAS Employee

Here is an IML solution (it uses elementwise multiplication).

 

 


data dataset1;
input var1 var2 var3; 
datalines;
.5 .2 .1
.1 .2 0
.2 .5 .3
;



data dataset2;
input var1 var2 var3; 
datalines;
1 0 1
1 1 0 
0 1 1 
;



proc iml;


	use dataset1; 
	read all into ds1; 

	n = nrow(ds1);

	use dataset2; 
	read all into ds2; 

	elementProd = ds1#ds2;

	want = elementProd[,+] / n ; 

	print want; 


quit;

 

 

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 7 replies
  • 2717 views
  • 2 likes
  • 6 in conversation