Dear users,
i would like to replicate the result ( column want ) of my Excel example below .
any help on it i tried many ways , maybe the code below is close to get the right results.
age | country | region | x | want |
1 | FR | Europe | 0,5 | 1 |
2 | FR | Europe | 0,6 | 0,5 |
3 | FR | Europe | 0,7 | 0,2 |
4 | FR | Europe | 0,8 | 0,06 |
5 | FR | Europe | 0,9 | 0,012 |
1 | US | America | 0,1 | 1 |
2 | US | America | 0,2 | 0,9 |
3 | US | America | 0,3 | 0,72 |
4 | US | America | 0,4 | 0,504 |
5 | US | America | 0,5 | 0,3024 |
data want;
set have;
by country region age;
if first.country then y=1;
else y=lag(y)*(1-lag(x));
run;
Why are you calling LAG(Y)? That neither makes any sense if Y is not one of your source variables nor if you want to use the retained value in the calculation.
data want;
set have;
by country region age;
y=y*(1-lag(x));
if first.country then y=1;
retain y;
run;
@bebess wrote:
...
data want; set have; by country region age; if first.country then y=1; else y=lag(y)*(1-lag(x)); run;
That is not going to work right. By not calling the LAG() function for the first value of each COUNTRY the values of X and Y from that observation never make it into the list of values it has available to return.
Do this instead so that LAG() runs on every observation.
y=lag(y)*(1-lag(x));
if first.country then y=1;
thanks for your help ,
the proposed code give me the following result
age | country | region | x | want | y |
1 | FR | Europe | 0,5 | 1 | 1 |
2 | FR | Europe | 0,6 | 0,5 | |
3 | FR | Europe | 0,7 | 0,2 | |
4 | FR | Europe | 0,8 | 0,06 | |
5 | FR | Europe | 0,9 | 0,012 | |
1 | US | America | 0,1 | 1 | 1 |
2 | US | America | 0,2 | 0,9 | |
3 | US | America | 0,3 | 0,72 | |
4 | US | America | 0,4 | 0,504 | |
5 | US | America | 0,5 | 0,3024 |
data want;
set have;
by country region age;
y=lag(y)*(1-lag(x));
if first.country then y=1;
run;
when i tried retain as below i've got wrong result comparing to column want
age | country | region | x | want | y |
1 | FR | Europe | 0,5 | 1 | 1 |
2 | FR | Europe | 0,6 | 0,5 | 0,5 |
3 | FR | Europe | 0,7 | 0,2 | 0,4 |
4 | FR | Europe | 0,8 | 0,06 | 0,15 |
5 | FR | Europe | 0,9 | 0,012 | 0,08 |
1 | US | America | 0,1 | 1 | 1 |
2 | US | America | 0,2 | 0,9 | 0,072 |
3 | US | America | 0,3 | 0,72 | 0,8 |
4 | US | America | 0,4 | 0,504 | 0,0504 |
5 | US | America | 0,5 | 0,3024 | 0,48 |
data want;
set have;
by country region age;
retain y 1;
y=lag(y)*(1-lag(x));
if first.country then y=1;
run;
Why are you calling LAG(Y)? That neither makes any sense if Y is not one of your source variables nor if you want to use the retained value in the calculation.
data want;
set have;
by country region age;
y=y*(1-lag(x));
if first.country then y=1;
retain y;
run;
data have;
input age country $ region $ x;
datalines;
1 FR Europe 0.5
2 FR Europe 0.6
3 FR Europe 0.7
4 FR Europe 0.8
5 FR Europe 0.9
1 US America 0.1
2 US America 0.2
3 US America 0.3
4 US America 0.4
5 US America 0.5
;
proc sort data=have;
by country region age;
data want (drop=prev_y prev_x);
set have;
by country region;
retain prev_x prev_y;
if first.region then y = 1;
else y = prev_y * (1 - prev_x);
prev_y = y;
prev_x = x;
run;
proc print data=want noobs;
** Results **;
age country region x y
1 FR Europe 0.5 1.0000
2 FR Europe 0.6 0.5000
3 FR Europe 0.7 0.2000
4 FR Europe 0.8 0.0600
5 FR Europe 0.9 0.0120
1 US America 0.1 1.0000
2 US America 0.2 0.9000
3 US America 0.3 0.7200
4 US America 0.4 0.5040
5 US America 0.5 0.3024
This code explicitly uses retain to keep the previous values. I find it simpler to know exactly which variables are being kept.
Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.
Register today!Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Select SAS Training centers are offering in-person courses. View upcoming courses for: