i have a dataset which has different months data i need to sum the same months data and than take the difference the previous month
city month count
NEV 01-Dec-2018 50
NEV 01-Dec-2018 40
REN 01-Dec-2018 20
REN 01-Dec-2018 80
NEV 01-jan-2019 15
NEV 01-jan-2019 15
REN 01-jan-2019 10
REN 01-jan-2019 10
NEV 01-feb-2019 25
NEV 01-feb-2019 25
REN 01-feb-2019 30
REN 01-feb-2019 20
NEV 01-mar-2019 45
NEV 01-mar-2019 45
REN 01-mar-2019 25
REN 01-mar-2019 25
output:
city month count diff
NEV 01-Dec-2018 90
REN 01-Dec-2018 100
NEV 01-jan-2019 30 -60
REN 01-jan-2019 20 -80
NEV 01-feb-2019 50 20
REN 01-feb-2019 50 30
NEV 01-mar-2019 90 40
REN 01-mar-2019 50 0
PROC SUMMARY will find the sum by month and city. Then you can take the output from PROC SUMMARY, sort it by city and month, then in a data step, use the LAG function to find the differences.
Hi @hk2013 A good excercise for me to personally practice and test myself 🙂 Thank you for the fun question
data have;
input city $ month :date9. count ;
format month date9.;
cards;
NEV 01-Dec-2018 50
NEV 01-Dec-2018 40
REN 01-Dec-2018 20
REN 01-Dec-2018 80
NEV 01-jan-2019 15
NEV 01-jan-2019 15
REN 01-jan-2019 10
REN 01-jan-2019 10
NEV 01-feb-2019 25
NEV 01-feb-2019 25
REN 01-feb-2019 30
REN 01-feb-2019 20
NEV 01-mar-2019 45
NEV 01-mar-2019 45
REN 01-mar-2019 25
REN 01-mar-2019 25
;
data want;
if _n_=1 then do;
dcl hash H () ;
h.definekey ("city") ;
h.definedata ("_count") ;
h.definedone () ;
call missing(_count);
end;
do until(last.month);
set have;
by city month notsorted;
sum=sum(sum,count);
end;
if h.find()=0 then diff=sum-_count;
h.replace(key:city,data:sum);
keep city month sum diff;
run;
Are you programming or using the GUI in EG?
In SQL, that would be:
proc sql;
select
a.city,
a.month,
a.mCount,
a.mCount-b.mCount as countChange
from
(select city, month, sum(count) as mCount
from have group by city, month) as a left join
(select city, month, sum(count) as mCount
from have group by city, month) as b
on a.city=b.city and a.month=intnx("month", b.month, 1)
order by month, city;
quit;
This will give the correct answer even if there are some missing months.
If there was not gap in data.
data have;
input city $ month :date9. count ;
format month date9.;
cards;
NEV 01-Dec-2018 50
NEV 01-Dec-2018 40
REN 01-Dec-2018 20
REN 01-Dec-2018 80
NEV 01-jan-2019 15
NEV 01-jan-2019 15
REN 01-jan-2019 10
REN 01-jan-2019 10
NEV 01-feb-2019 25
NEV 01-feb-2019 25
REN 01-feb-2019 30
REN 01-feb-2019 20
NEV 01-mar-2019 45
NEV 01-mar-2019 45
REN 01-mar-2019 25
REN 01-mar-2019 25
;
proc summary data=have;
by city month notsorted;
var count;
output out=temp sum=;
run;
data want;
set temp(drop=_:);
dif=dif2(count);
run;
data have;
input city $ month :date9. count ;
format month DATE9.;
cards;
NEV 01-Dec-2018 50
NEV 01-Dec-2018 40
REN 01-Dec-2018 20
REN 01-Dec-2018 80
NEV 01-jan-2019 15
NEV 01-jan-2019 15
REN 01-jan-2019 10
REN 01-jan-2019 10
NEV 01-feb-2019 25
NEV 01-feb-2019 25
REN 01-feb-2019 30
REN 01-feb-2019 20
NEV 01-mar-2019 45
NEV 01-mar-2019 45
REN 01-mar-2019 25
REN 01-mar-2019 25
;
RUN;
PROC REPORT DATA=HAVE OUT=WANT22(DROP=_BREAK_);
COLUMN CITY MONTH COUNT MON YR;
DEFINE CITY/GROUP ;
DEFINE MONTH/GROUP;
DEFINE COUNT/SUM;
COMPUTE YR;
MON=MONTH(_C2_);
YR=YEAR(_C2_);
ENDCOMP;
RUN;
PROC SORT DATA=WANT22;
BY CITY YR MON;
RUN;
DATA WANT (KEEP=CITY MONTH COUNT DIFF);
SET WANT22;
BY CITY YR MON;
DIFF=DIF(COUNT);
IF FIRST.CITY THEN CALL MISSING(DIFF);
RUN;
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
What’s the difference between SAS Enterprise Guide and SAS Studio? How are they similar? Just ask SAS’ Danny Modlin.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.