BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Question
Fluorite | Level 6

Hi All,

I have a transaction table that contains all the transactions (orders) made by ecah customer. I would like to calculate the difference between orders to detect when they start reducing their orders. I have run the code below, but it doesn't give what I need...For the first customer, the calculation is right but when I get to the second customer (the first row is wrong because it calculates the difference between the last date of the first customer and the first date of the second customer). How can I tell SAS, to stop at the last date of the first customer and start again at the first date of the first customer.

I have tried to use First function but it didon't work.. Your help wouldf be much appreciated. Many Thanks

DATA MODELLING;

  SET MODELLING;

  /*DIFFERENCE TIME BETWEEN ORDERS*/

  BY SOLD_TO_ORG_ID DELIVERY_DATE;

/*  IF FIRST.SOLD_TO_ORG_ID AND FIRST.DELIVERY_DATE THEN DO;*/

  LAG_TIME = LAG(DELIVERY_DATE);

  DIF_TIME = DELIVERY_DATE - LAG_TIME;

  PER_TIME_INCREASE =(DIF_TIME/LAG_TIME)*100;

  LAG_VALUE = LAG(TOTAL_SPEND);

  DIF_VALUE = TOTAL_SPEND - LAG_VALUE;

  PER_VALUE_INCREASE =(DIF_VALUE/LAG_VALUE)*100;

/*  END;*/

/*  OUTPUT;*/

RUN;

1 ACCEPTED SOLUTION

Accepted Solutions
ErikT
Obsidian | Level 7

I don't understand the calculation of the increase/decrease, so I leave that to you. Taking the time-calculation: you divide the number of days between orders by the first date. That means that if they order exactly every 30 days you will see a decreasing result as the order date increases.

The calculation of the time between two purchases can be calculated as you do, but you have to make the calculated value missing every time you start with a new SOLD_TO_ORG_ID, like in the code below. By the way, you can use the DIF function to replace the combination of the LAG function and the subtraction.

data demo;

input SOLD_TO_ORG_ID DELIVERY_DATE : date. total_spend;

format delivery_date date.;

datalines;

1 12APR12  25

1 24apr12  71

1 13may12  321

1 04jun12  59

1 15jul12  77

2 19mar12  91

2 23may12  143

3 29mar12  86

3 6apr12   47

3 17apr12  66

3 16may12  91

3 21jun12  112

run;

data modeling;

set demo;

by SOLD_TO_ORG_ID DELIVERY_DATE ;

DIF_TIME = DIF(DELIVERY_DATE);

DIF_VALUE = dif(TOTAL_SPEND);

if first.sold_to_org_id then do;

   dif_time=.;

   dif_value=.;

end;

run;

ods listing;

options nocenter;

proc print;

run;



View solution in original post

7 REPLIES 7
LinusH
Tourmaline | Level 20

You are on the right track with the commented lines, using first. logic will solve it. But you need to use NOT first.

Data never sleeps
Question
Fluorite | Level 6

Hi LinusH,

Thanks for your reply...when you say I need to use NOT first, where exactly in the program?

IF FIRST.SOLD_TO_ORG_ID AND NOT FIRST.DELIVERY_DATE THEN DO;??

LinusH
Tourmaline | Level 20

Up to you to decide - for which rows do you want to calculate differences.

If only on the same day, you need to NOT both.

Data never sleeps
Question
Fluorite | Level 6

Hi LinusH,

I would like to calculate the difference for every row per customer. For example if a customer has 10 transactions row, I would like to calculate the diff between 1st order and 2nd order, the diff between 2nd order and 3rd order, and difference between 3rd and 4th order so on...And when I get to the second customer, I would like to follow the same logic. Does it make sense?

Many thanks

Alice

LinusH
Tourmaline | Level 20

Can't have any opinions on what you are calculating.

But as you describe it, you shouldn't test on date, just sold_to_org_id.

Data never sleeps
ErikT
Obsidian | Level 7

I don't understand the calculation of the increase/decrease, so I leave that to you. Taking the time-calculation: you divide the number of days between orders by the first date. That means that if they order exactly every 30 days you will see a decreasing result as the order date increases.

The calculation of the time between two purchases can be calculated as you do, but you have to make the calculated value missing every time you start with a new SOLD_TO_ORG_ID, like in the code below. By the way, you can use the DIF function to replace the combination of the LAG function and the subtraction.

data demo;

input SOLD_TO_ORG_ID DELIVERY_DATE : date. total_spend;

format delivery_date date.;

datalines;

1 12APR12  25

1 24apr12  71

1 13may12  321

1 04jun12  59

1 15jul12  77

2 19mar12  91

2 23may12  143

3 29mar12  86

3 6apr12   47

3 17apr12  66

3 16may12  91

3 21jun12  112

run;

data modeling;

set demo;

by SOLD_TO_ORG_ID DELIVERY_DATE ;

DIF_TIME = DIF(DELIVERY_DATE);

DIF_VALUE = dif(TOTAL_SPEND);

if first.sold_to_org_id then do;

   dif_time=.;

   dif_value=.;

end;

run;

ods listing;

options nocenter;

proc print;

run;



Question
Fluorite | Level 6

Hi EricT,

Many Thanks the code works perfectly...Smiley Happy

Cheers

Alice

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 7 replies
  • 1666 views
  • 0 likes
  • 3 in conversation