BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
million
Calcite | Level 5

Hi,

I have a data that has three variables namely Department, Totalsales and Date. I have many departments. Each day, new sales adds to the the Totalsales column and shows the total until that specific day. I want to know how many new sells there are each day by subtracting one day's totalsales from the next day's totalsales and put it in a new column. Therefore at last I will have four columns (Department, Totalsales, Date, and newsells). I sorted the data by Department and Date. I tried to work using do loop but lost on the way. Can anybody assist me with this. I am posting a hypothetical data similar to my data here.

 

Thank you

 

Data All;
Input Department$ Totalsales Date DDMMYY8.;
format Date yymmdd10.;
cards;
D 45 01/03/19
D 47 02/03/19
D 51 03/03/19
D 55 04/03/19
B 35 01/03/19
B 37 02/03/19
B 43 03/03/19
B 42 04/03/19
E 59 03/03/19
E 64 04/03/19
M 45 01/03/19
M 50 02/03/19
M 55 03/03/19
M 56 04/03/19
M 59 05/03/19
M 59 07/03/19
M 59 06/03/19;
run;

I want new sales for each department in the first column.

1 ACCEPTED SOLUTION

Accepted Solutions
mkeintz
PROC Star

A loop would be useful if you had to do the same process to a set of variables.  But you have only one variable (totalsales), and you want to compare successive observations within the data set:

 

Data All;
  Input Department$ Totalsales Date DDMMYY8.;
  format Date yymmdd10.;
cards;
D 45 01/03/19
D 47 02/03/19
D 51 03/03/19
D 55 04/03/19
B 35 01/03/19
B 37 02/03/19
B 43 03/03/19
B 42 04/03/19
E 59 03/03/19
E 64 04/03/19
M 45 01/03/19
M 50 02/03/19
M 55 03/03/19
M 56 04/03/19
M 59 05/03/19
M 59 07/03/19
M 59 06/03/19
run;
data want;
  set all;
  by department notsorted;
  newsales=dif(totalsales);
  if first.department then newsales=.;
run;

The DIF function is defined as   dif(x)=x-lag(x).  The "if first.department …" statement tests for the start of a new department, and set newsales to missing, thereby avoiding contamination of the start of one department with a lagged value from the prior department.

 

The "notsorted" option is because you seem to have your data observations grouped by department, but not in ascending order.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

View solution in original post

3 REPLIES 3
mkeintz
PROC Star

A loop would be useful if you had to do the same process to a set of variables.  But you have only one variable (totalsales), and you want to compare successive observations within the data set:

 

Data All;
  Input Department$ Totalsales Date DDMMYY8.;
  format Date yymmdd10.;
cards;
D 45 01/03/19
D 47 02/03/19
D 51 03/03/19
D 55 04/03/19
B 35 01/03/19
B 37 02/03/19
B 43 03/03/19
B 42 04/03/19
E 59 03/03/19
E 64 04/03/19
M 45 01/03/19
M 50 02/03/19
M 55 03/03/19
M 56 04/03/19
M 59 05/03/19
M 59 07/03/19
M 59 06/03/19
run;
data want;
  set all;
  by department notsorted;
  newsales=dif(totalsales);
  if first.department then newsales=.;
run;

The DIF function is defined as   dif(x)=x-lag(x).  The "if first.department …" statement tests for the start of a new department, and set newsales to missing, thereby avoiding contamination of the start of one department with a lagged value from the prior department.

 

The "notsorted" option is because you seem to have your data observations grouped by department, but not in ascending order.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
million
Calcite | Level 5
Thank you so much. This works!

Just a followup question. What if we want the the start of a new department column to be the same as the totalsales value in that column? Not a missing value.
mkeintz
PROC Star
Sorry for the delayed response - been offline. I leave the answer to your question as a "homework" assignment.
--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

Mastering the WHERE Clause in PROC SQL

SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 3 replies
  • 2129 views
  • 1 like
  • 2 in conversation