Hi,
I have a data that has three variables namely Department, Totalsales and Date. I have many departments. Each day, new sales adds to the the Totalsales column and shows the total until that specific day. I want to know how many new sells there are each day by subtracting one day's totalsales from the next day's totalsales and put it in a new column. Therefore at last I will have four columns (Department, Totalsales, Date, and newsells). I sorted the data by Department and Date. I tried to work using do loop but lost on the way. Can anybody assist me with this. I am posting a hypothetical data similar to my data here.
Thank you
Data All;
Input Department$ Totalsales Date DDMMYY8.;
format Date yymmdd10.;
cards;
D 45 01/03/19
D 47 02/03/19
D 51 03/03/19
D 55 04/03/19
B 35 01/03/19
B 37 02/03/19
B 43 03/03/19
B 42 04/03/19
E 59 03/03/19
E 64 04/03/19
M 45 01/03/19
M 50 02/03/19
M 55 03/03/19
M 56 04/03/19
M 59 05/03/19
M 59 07/03/19
M 59 06/03/19;
run;
I want new sales for each department in the first column.
A loop would be useful if you had to do the same process to a set of variables. But you have only one variable (totalsales), and you want to compare successive observations within the data set:
Data All;
Input Department$ Totalsales Date DDMMYY8.;
format Date yymmdd10.;
cards;
D 45 01/03/19
D 47 02/03/19
D 51 03/03/19
D 55 04/03/19
B 35 01/03/19
B 37 02/03/19
B 43 03/03/19
B 42 04/03/19
E 59 03/03/19
E 64 04/03/19
M 45 01/03/19
M 50 02/03/19
M 55 03/03/19
M 56 04/03/19
M 59 05/03/19
M 59 07/03/19
M 59 06/03/19
run;
data want;
set all;
by department notsorted;
newsales=dif(totalsales);
if first.department then newsales=.;
run;
The DIF function is defined as dif(x)=x-lag(x). The "if first.department …" statement tests for the start of a new department, and set newsales to missing, thereby avoiding contamination of the start of one department with a lagged value from the prior department.
The "notsorted" option is because you seem to have your data observations grouped by department, but not in ascending order.
A loop would be useful if you had to do the same process to a set of variables. But you have only one variable (totalsales), and you want to compare successive observations within the data set:
Data All;
Input Department$ Totalsales Date DDMMYY8.;
format Date yymmdd10.;
cards;
D 45 01/03/19
D 47 02/03/19
D 51 03/03/19
D 55 04/03/19
B 35 01/03/19
B 37 02/03/19
B 43 03/03/19
B 42 04/03/19
E 59 03/03/19
E 64 04/03/19
M 45 01/03/19
M 50 02/03/19
M 55 03/03/19
M 56 04/03/19
M 59 05/03/19
M 59 07/03/19
M 59 06/03/19
run;
data want;
set all;
by department notsorted;
newsales=dif(totalsales);
if first.department then newsales=.;
run;
The DIF function is defined as dif(x)=x-lag(x). The "if first.department …" statement tests for the start of a new department, and set newsales to missing, thereby avoiding contamination of the start of one department with a lagged value from the prior department.
The "notsorted" option is because you seem to have your data observations grouped by department, but not in ascending order.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.
Find more tutorials on the SAS Users YouTube channel.