Say with data set A.
DATE ID
10 1
10 2
10 3
20 1
20 2
20 3
30 1
30 2
30 3
And data set B,
DATE ID x y
1 1 0.070933926 0.323055974
2 1 0.406570761 0.13760953
3 1 0.883535963 0.578733088
4 1 0.016231788 0.729134018
5 1 0.668424989 0.761197387
6 1 0.242259712 0.810040071
7 1 0.213543345 0.058647549
8 1 0.857330638 0.531532208
9 1 0.501064997 0.743060726
10 1 0.018354547 0.072328341
11 1 0.99627551 0.140234082
12 1 0.267813864 0.134323054
13 1 0.132534742 0.716864341
14 1 0.107630158 0.498794317
15 1 0.814832803 0.678847109
16 1 0.194916825 0.652946461
17 1 0.24655657 0.348632146
18 1 0.581696172 0.541978606
19 1 0.448507082 0.723949992
20 1 0.725078176 0.336086311
21 1 0.572441245 0.70316834
22 1 0.638282508 0.529384198
23 1 0.967807781 0.159573909
24 1 0.962864907 0.683835901
25 1 0.772681849 0.53964572
26 1 0.67532915 0.484330887
27 1 0.946038446 0.569370775
28 1 0.734410321 0.801076677
29 1 0.018660865 0.943685516
30 1 0.102126634 0.12693274
1 2 0.337826696 0.381279823
2 2 0.402967304 0.784483754
3 2 0.528027957 0.132032813
4 2 0.679449095 0.706947401
5 2 0.083379513 0.431491854
6 2 0.086233166 0.237205938
7 2 0.247418031 0.816204954
8 2 0.877926644 0.738031694
9 2 0.234003429 0.900287427
10 2 0.878105682 0.430906847
11 2 0.159710129 0.647456551
12 2 0.245602139 0.007026074
13 2 0.596617096 0.49183371
14 2 0.858081967 0.998086222
15 2 0.946591278 0.508973381
16 2 0.030718654 0.09659358
17 2 0.383475095 0.001080811
18 2 0.47175129 0.88797923
19 2 0.231755643 0.706321812
20 2 0.26818677 0.54529103
21 2 0.605350394 0.891914056
22 2 0.278178314 0.14754076
23 2 0.179958498 0.622006208
24 2 0.536177448 0.921842128
25 2 0.881522327 0.772927816
26 2 0.034479121 0.836198033
27 2 0.61827926 0.119682909
28 2 0.363028818 0.536296862
29 2 0.416057939 0.209719423
30 2 0.319097646 0.971274071
7 3 0.540217872 0.445921819
8 3 0.086111398 0.304460289
9 3 0.193343887 0.452840456
10 3 0.075337725 0.387945463
Baseically, I want to write a loop, what read one row of data A, and then find the 10 observations from data B with
A.date-10<B.date<=A.date and B.ID=A.ID,
and then get the regression coefficient with this regression and then put it back to data set A.
I have found some codes which can achieve this , but they are too costly, since they just collapse two data sets into one and then reg. However, my dataset is so big that it is impossible to do like that.
I am wondering if it will be possible to write a loop that go through all observations in data A, and for each observation of data A, find proper data from data B, and get regression results and put it back to data A, or just create a new data to store these results first.
Could someone help a rookie like me to write a program like this? 🙂 many kisses...
You can use SQL to get those obs firstly then feed it into PROC REG.
proc sql;
create table want as
select A.ID as group1, A.DATE as group2, B.*
from A,B
where A.date-10<B.date<=A.date and B.ID=A.ID ;
quit;
proc reg data=want;
by group1 group2;
model y=x ;
quit;
Thank you, Keshan. But the problem here is that the data is so big, so creating this 'want' data set is too costly.
What I get in mind is that, is it possible to everytime use one observation from A and then according to it, find proper data from B , and feed regression results back to A.
sth like this : https://communities.sas.com/t5/Base-SAS-Programming/Can-I-call-execute-macro-within-a-DO-loop/m-p/24...
I know what you are talking about. It is about CALL EXECUTE(). But I would doubt the efficiency if table is big.
data A;
input DATE ID ;
cards;
10 1
10 2
10 3
20 1
20 2
20 3
30 1
30 2
30 3
;
run;
data B;
input DATE ID x y;
cards;
1 1 0.070933926 0.323055974
2 1 0.406570761 0.13760953
3 1 0.883535963 0.578733088
4 1 0.016231788 0.729134018
5 1 0.668424989 0.761197387
6 1 0.242259712 0.810040071
7 1 0.213543345 0.058647549
8 1 0.857330638 0.531532208
9 1 0.501064997 0.743060726
10 1 0.018354547 0.072328341
11 1 0.99627551 0.140234082
12 1 0.267813864 0.134323054
13 1 0.132534742 0.716864341
14 1 0.107630158 0.498794317
15 1 0.814832803 0.678847109
16 1 0.194916825 0.652946461
17 1 0.24655657 0.348632146
18 1 0.581696172 0.541978606
19 1 0.448507082 0.723949992
20 1 0.725078176 0.336086311
21 1 0.572441245 0.70316834
22 1 0.638282508 0.529384198
23 1 0.967807781 0.159573909
24 1 0.962864907 0.683835901
25 1 0.772681849 0.53964572
26 1 0.67532915 0.484330887
27 1 0.946038446 0.569370775
28 1 0.734410321 0.801076677
29 1 0.018660865 0.943685516
30 1 0.102126634 0.12693274
1 2 0.337826696 0.381279823
2 2 0.402967304 0.784483754
3 2 0.528027957 0.132032813
4 2 0.679449095 0.706947401
5 2 0.083379513 0.431491854
6 2 0.086233166 0.237205938
7 2 0.247418031 0.816204954
8 2 0.877926644 0.738031694
9 2 0.234003429 0.900287427
10 2 0.878105682 0.430906847
11 2 0.159710129 0.647456551
12 2 0.245602139 0.007026074
13 2 0.596617096 0.49183371
14 2 0.858081967 0.998086222
15 2 0.946591278 0.508973381
16 2 0.030718654 0.09659358
17 2 0.383475095 0.001080811
18 2 0.47175129 0.88797923
19 2 0.231755643 0.706321812
20 2 0.26818677 0.54529103
21 2 0.605350394 0.891914056
22 2 0.278178314 0.14754076
23 2 0.179958498 0.622006208
24 2 0.536177448 0.921842128
25 2 0.881522327 0.772927816
26 2 0.034479121 0.836198033
27 2 0.61827926 0.119682909
28 2 0.363028818 0.536296862
29 2 0.416057939 0.209719423
30 2 0.319097646 0.971274071
7 3 0.540217872 0.445921819
8 3 0.086111398 0.304460289
9 3 0.193343887 0.452840456
10 3 0.075337725 0.387945463
;
run;
data _null_;
set A end=last;
call execute(catt('proc reg data=b(where=(',date-10,'<date<=',date,' and ID=',ID,'));model y=x;quit;'));
run;
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.