BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
ss59
Obsidian | Level 7

Hi all, I have the following two data sets:

 Data Set 1 with probabilities

 

key 1key 2key 3key 4key 5pr_st1-2pr_st1-3pr_st1-1pr_st2-1pr_st2-3pr_st2-2pr_st3-1pr_st3-2pr_st3-3
France2017Approach ARetailRating 10.20.60.20.30.60.10.10.30.6
France2018Approach BCorporateRating 20.10.50.40.20.60.20.20.30.5

 

Data Set 2 with initial volumes:

 

key 1key 2key 3key 4key 5vol_st1vol_st2vol_st3
France2017Approach ARetailRating 1201012
France2018Approach BCorporateRating 22359

 

Both data has same keys; here key_set_1 = France 2017 Approach A Retail Rating 1. Dataset 1 has probabilities and dataset 2 has volumes.

The task is, I have to calculate ending volumes of each state.

i.e. 1. create a matrix for each row (with unique keys) in data set 1. So for row 1, it would be:

 

 state1state2state3
state10.20.20.6
state20.30.10.6
state30.10.30.6

 

2. create a vector of initial volumes

 

3. And then multiply the vector in data set 2 with those matrices from data set 1, i.e. multiply the initial volumes with probabilities to get the end volume for each set of unique keys. For this example (for the first row), it would be multiplication of 

 

201012

 with 

0.20.20.6
0.30.10.6
0.10.30.6


8.28.625.2

 

Thanks to the helpful folks from this forum, I have been able to partially achieve the first step, i.e. create the matrix for each row of data set 1. But, I am not sure how can I store the keys in an array and and store the outputs also corresponding to the same keys.  

 

in the end result we should be able to identify input 1 (3*3 matrix), input 2 (1*3 vector) and output (1*3 vector) with key_set_1. same for key_set_2, Key_set_3 and so on.

 

Thanks!

1 ACCEPTED SOLUTION

Accepted Solutions
Ksharp
Super User
data have;
input Year (country method Segment) ( : $12.)
      pr_st1_2 pr_st1_3 pr_st1_1 pr_st2_1 pr_st2_3 pr_st2_2 pr_st3_1 pr_st3_2 pr_st3_3;
datalines;
2017 USA ABC Retail    0.2 0.6 0.2 0.3 0.6 0.1 0.1 0.3 0.6
2017 USA XYZ Corporate 0.1 0.5 0.4 0.2 0.6 0.2 0.2 0.3 0.5
;
run;
data key;
input country $  Year method $ Segment : $12.	vol_st1	vol_st2	vol_st3	;
cards;
USA	2017 ABC Retail 	 	20	10	12
USA	2017 XYZ Corporate	 	23	5	9
;
run;
data have;
merge have key;
by 	Year country method Segment	;
run;
 
proc sql noprint;
select name	into : vname separated by ' '
 from dictionary.columns
  where libname='WORK' and memname='HAVE' and lowcase(name) eqt 'pr_st'
   order by input(compress(scan(name,-2,'_'),,'kd'),best.),input(compress(scan(name,-1,'_'),,'kd'),best.);
quit;

proc iml;
use have nobs nobs;
read all var {&vname} into x;
read all var {vol_st1 vol_st2 vol_st3} into y;
close;
do i=1 to nobs;
temp=shape(x[i,],sqrt(ncol(x)));
want=(temp#t(y[i,]))[+,];
print want;
end;
quit;

View solution in original post

4 REPLIES 4
Ksharp
Super User
data have;
input Year (country method Segment) ( : $12.)
      pr_st1_2 pr_st1_3 pr_st1_1 pr_st2_1 pr_st2_3 pr_st2_2 pr_st3_1 pr_st3_2 pr_st3_3;
datalines;
2017 USA ABC Retail    0.2 0.6 0.2 0.3 0.6 0.1 0.1 0.3 0.6
2017 USA XYZ Corporate 0.1 0.5 0.4 0.2 0.6 0.2 0.2 0.3 0.5
;
run;
data key;
input country $  Year method $ Segment : $12.	vol_st1	vol_st2	vol_st3	;
cards;
USA	2017 ABC Retail 	 	20	10	12
USA	2017 XYZ Corporate	 	23	5	9
;
run;
data have;
merge have key;
by 	Year country method Segment	;
run;
 
proc sql noprint;
select name	into : vname separated by ' '
 from dictionary.columns
  where libname='WORK' and memname='HAVE' and lowcase(name) eqt 'pr_st'
   order by input(compress(scan(name,-2,'_'),,'kd'),best.),input(compress(scan(name,-1,'_'),,'kd'),best.);
quit;

proc iml;
use have nobs nobs;
read all var {&vname} into x;
read all var {vol_st1 vol_st2 vol_st3} into y;
close;
do i=1 to nobs;
temp=shape(x[i,],sqrt(ncol(x)));
want=(temp#t(y[i,]))[+,];
print want;
end;
quit;
IanWakeling
Barite | Level 11
want = y[i,]*temp;

would be more simple.

 

ss59
Obsidian | Level 7

Thanks a lot @Ksharp. Assuming the files would be big (500 different keys - 500 different rows), how do I get back the result for a particular set of key?

 

For example, if I run this piece of code it gives me two want matrices for two rows in the data, which is perfect. But if there were 500 rows and I wanted to extract the one with

2017 USA XYZ Corporate

, i.e. want[2], what could I have done? 

IanWakeling
Barite | Level 11

I think it would be best to create a single matrix large enough to hold all 500 results.  Then the matrix is filled up one row at a time for each iteration of the loop.   Afterwards results can be extracted by row sub setting.  The syntax would look like this:

 

n = sqrt(ncol(x));
want = j(nobs, n);

do i = 1 to nobs;
  temp = shape(x[i,], n);
  want[i,] = y[i,]*temp;
end;

w2 = want[2,];
print w2;

sas-innovate-2024.png

 

Secure your spot at the must-attend AI and analytics event of 2024: SAS Innovate 2024! Get ready for a jam-packed agenda featuring workshops, super demos, breakout sessions, roundtables, inspiring keynotes and incredible networking events.

 

Register by March 1 to snag the Early Bird rate of just $695! Don't miss out on this exclusive offer. 

 

Register now!

Multiple Linear Regression in SAS

Learn how to run multiple linear regression models with and without interactions, presented by SAS user Alex Chaplin.

Find more tutorials on the SAS Users YouTube channel.

From The DO Loop
Want more? Visit our blog for more articles like these.
Discussion stats
  • 4 replies
  • 983 views
  • 8 likes
  • 3 in conversation