BookmarkSubscribeRSS Feed
yunfeizhao100
Calcite | Level 5

Hi, can you help me to see how to program it in SAS to calculate y for my below data? The data are already sorted by cust_id and period. The orginal data does not have y. The calculated y should look like the one in my data below. The requirement for y: y=x1 for the first period of a cust_id (this is how the first number of y, that is 1000 is calculated). y=x2 of the previous period multiply by x3 of the current period (this is how the second number of y, that is 6 is calculated: x2 of the previous period, that is 3, multiply by x3 of the current period, that is 2) I guess it might need retain or lag, but I am not sure.Untitled.png

2 REPLIES 2
Reeza
Super User
Your post may have been garbled by the forum. Please insert data and/or code using the little icons on the editor (running man or { i }). It helps if we can clearly see the data/question.
FreelanceReinh
Jade | Level 19

Hi @yunfeizhao100 and welcome to the SAS Support Communities!

 

It is very difficult for SAS to read data from pictures. (Maybe this is why no solution has been provided yet.)

 

With sample data in the form of a DATA step ...

data have;
input cust_id period x1-x3;
cards;
1 1 1000 3 3
1 2 1000 5 2
1 3 1000 6 4
1 4 1000 7 6
1 5 1000 3 2
2 1 2000 6 5
2 2 2000 4 6
2 3 2000 3 1
;

... testing solutions is much easier. A simple solution could look like this:

data want;
set have;
by cust_id;
prev_x2=lag(x2);
if first.cust_id then y=x1;
else y=prev_x2*x3;
drop prev_x2;
run;

For each observation the x2 value of the previous observation is stored in variable prev_x2. Then, y=x1 for the first observation of each cust_id BY group, and y=prev_x2*x3 for the remaining observations.

 

Actually, the same results can be obtained with more concise code:

data want;
set have;
by cust_id;
y=ifn(first.cust_id, x1, lag(x2)*x3);
run;

But this (without further precautions) would cause an undesirable note in the log:

NOTE: Missing values were generated as a result of performing an operation on missing values.
...

(due to the missing value of lag(x2) in the very first observation).

 

Both solutions assume that your sample data is representative of your real data, i.e.

  • The first observation of each cust_id BY group represents the first period of the respective cust_id. (Note that I didn't use variable period.)
  • Similarly, the second, third, ... observation of each cust_id BY group represents the second, third, ... period of the respective cust_id (so that "previous observation" and "previous period" are equivalent for these observations).

If these assumptions are not met or if x2 or x3 can have missing values (and you want to avoid those notes in the log about missing values), we'll need to modify the suggested code.

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 2 replies
  • 1590 views
  • 0 likes
  • 3 in conversation