BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
RedUser77
Obsidian | Level 7

I want to calculate a month over month percentage change for volumes using the below data where I would only be looking at two months each time. Is there a Proc SQL statement that can help? I am struggling on grouping by the month of the date, which I need in order to calculate MoM percent change. 

Input:

D‌ate         Record_Number
20230101     10
20230102     245
20230121     39
20230201     11
20230202     202

 Output:

Percent_Change
(27.55%)

 *Both Date and Record_Number are of numeric format. 

1 ACCEPTED SOLUTION

Accepted Solutions
PaigeMiller
Diamond | Level 26
data have1;
    set have;
    date1=input(put(date,8.),yymmdd8.);
    format date1 yymmn6.;
run;
proc summary data=have1 nway;
    class date1;
    var record_number;
    output out=_sums_(drop=_:) sum=;
run; 
data want;
    merge _sums_(firstobs=2 rename=(record_number=record_number2) drop=date1) _sums_;
    mom_pct_chg=(record_number2-record_number)/record_number;
    format mom_pct_chg percent10.2;
run;

 

The variable DATE is not a date variable, according to SAS. It looks like a date to humans, but SAS thinks it is some integer. Dates in SAS are the number of days since 01JAN1960, and representing dates as number of days since 01JAN1960 allows you to use all the hard work SAS has done to figure out dates and months and leap years and whatever, so you don't have to code it yourself. 

 

Thus I turn the variable DATE into DATE1 which is now a valid SAS date, the number of days since 01JAN1960, and assign it the format yymmn6. so it still looks like 202301 (or whatever) and now PROC SUMMARY can group by this month value. From there, you compute the sums, and in the next step do the division.

 

I'm sure it can be done in SQL, but you would also be wise to learn other methods, such as the above. I suspect using DATA steps and PROC SUMMARY will run faster than SQL on your large data set but of course, I haven't tried to confirm that. The "I must use SQL for everything" approach is not one I recommend, see Maxim 10.

--
Paige Miller

View solution in original post

9 REPLIES 9
PaigeMiller
Diamond | Level 26

Please explain the logic and calculations used here. I don't see how you get (27.55%) from this data.

 

In words, explain how to do the grouping by months that you mention. 

 

I want to calculate a month over month percentage change for volumes using the below data

 

What variable is the volume?

--
Paige Miller
yabwon
Onyx | Level 15

You can try like that:

data have;
  infile CARDS4;
  input date yymmdd8. Record_Number;
  format date yymmdd10.;
cards4;
20230101 10
20230102 245
20230121 39
20230201 11
20230202 202
;;;;
run;
proc print;
run;

proc sql;
create table aggregate as
select distinct
  year(date) as y
, month(date) as m
, year(intnx("Month",date,-1)) as py
, month(intnx("Month",date,-1)) as pm
, sum(Record_Number) as sum_of_Record_Number
from
  have
group by
  year(date)
, month(date)
;
quit;

proc print data=aggregate;
run;


proc sql;
create table want as
select distinct
 cur.y
,cur.m
,cur.sum_of_Record_Number as sum_cur
,prev.sum_of_Record_Number as sum_prev
,(cur.sum_of_Record_Number - prev.sum_of_Record_Number)/cur.sum_of_Record_Number as percent_change format percentN10.2
from
  aggregate as cur
  join
  aggregate as prev
  on
  cur.y=prev.py
  and
  cur.m=prev.pm
;
quit;

proc print data=want;
run;

but I would use the data step if I were you:

proc sql;
create table aggregate2 as
select distinct
  year(date) as y
, month(date) as m
, sum(Record_Number) as sum_of_Record_Number
from
  have
group by
  year(date)
, month(date)
order by
1,2
;
quit;

proc print data=aggregate2;
run;


data want2;
  set aggregate2;
  percent_change = (lag(sum_of_Record_Number)-sum_of_Record_Number)/lag(sum_of_Record_Number);

  format percent_change percentN10.2;
  if percent_change;
run;

proc print data=want2;
run;

 

Bart

 

_______________
Polish SAS Users Group: www.polsug.com and communities.sas.com/polsug

"SAS Packages: the way to share" at SGF2020 Proceedings (the latest version), GitHub Repository, and YouTube Video.
Hands-on-Workshop: "Share your code with SAS Packages"
"My First SAS Package: A How-To" at SGF2021 Proceedings

SAS Ballot Ideas: one: SPF in SAS, two, and three
SAS Documentation



RedUser77
Obsidian | Level 7
The data is stored on tables that I am querying from so ideally I would like to set up a proc sql statement querying the above.
Patrick
Opal | Level 21

@RedUser77 wrote:
The data is stored on tables that I am querying from so ideally I would like to set up a proc sql statement querying the above.

If the source tables are stored in SAS then there is no reason not to use a SAS datastep.

If these are tables in a database then also consider explicit passthrough SQL as many databases got SQL extensions like windowing functions that allow for such calculations.

RedUser77
Obsidian | Level 7

Take the sum of Record_Number by the month of Date and then calculate the percent change.

 

For January (01): Sum of Record_Number = 294

For February (02): Sum of Record_Number = 213

Percent_Change: 100*((CURRENT - PRIOR)/(PRIOR))

Percent_Change = ((213 - 294)/(294)) = -27.55

 

The Date variable is YYYYMMDD. 

PaigeMiller
Diamond | Level 26
data have1;
    set have;
    date1=input(put(date,8.),yymmdd8.);
    format date1 yymmn6.;
run;
proc summary data=have1 nway;
    class date1;
    var record_number;
    output out=_sums_(drop=_:) sum=;
run; 
data want;
    merge _sums_(firstobs=2 rename=(record_number=record_number2) drop=date1) _sums_;
    mom_pct_chg=(record_number2-record_number)/record_number;
    format mom_pct_chg percent10.2;
run;

 

The variable DATE is not a date variable, according to SAS. It looks like a date to humans, but SAS thinks it is some integer. Dates in SAS are the number of days since 01JAN1960, and representing dates as number of days since 01JAN1960 allows you to use all the hard work SAS has done to figure out dates and months and leap years and whatever, so you don't have to code it yourself. 

 

Thus I turn the variable DATE into DATE1 which is now a valid SAS date, the number of days since 01JAN1960, and assign it the format yymmn6. so it still looks like 202301 (or whatever) and now PROC SUMMARY can group by this month value. From there, you compute the sums, and in the next step do the division.

 

I'm sure it can be done in SQL, but you would also be wise to learn other methods, such as the above. I suspect using DATA steps and PROC SUMMARY will run faster than SQL on your large data set but of course, I haven't tried to confirm that. The "I must use SQL for everything" approach is not one I recommend, see Maxim 10.

--
Paige Miller
mkeintz
PROC Star

It's not clear why SQL is advantageous for this task.  If the data are sorted by date, then a single data step does what you need:

 

data have;
  input date  yymmdd8.   Record_Number;
  format date yymmddn8.;
datalines;
20230101     10
20230102     245
20230121     39
20230201     11
20230202     202
run;

data want (keep=date prior_sum current_sum  pct_change);
  merge have
        have (firstobs=2 keep=date rename=(date=nxt_date))
        end=end_of_merge;

  date=intnx('month',date,0,'end');     ** Convert each date to end_of_month;
  if date^=lag(date) then current_sum=0;** New month?  Reset sum to zero ;
  current_sum+record_number;
  if nxt_date>date or end_of_merge=1;   **Subsetting if: of end-of-month or end_of_merge;
  prior_sum=lag(current_sum);
  if prior_sum ^=. then pct_change=(current_sum-prior_sum)/prior_sum;
  format pct_change percent7.1;
run;
--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
PaigeMiller
Diamond | Level 26

Personally, I would advise people not to write their own DATA step code to perform calculations (like sums) across groups that can easily be obtained from PROC MEANS or PROC SUMMARY.  For beginners, writing DATA step code to do this can be very difficult, harder than using PROC MEANS or PROC SUMMARY. SAS has done the hard work to get these calculations correct; there's no need for a user to replicate these calculations in a DATA step.

 

Furthermore, the DATA step written by users, if not done properly (but not the code by @mkeintz), could fail if there are missing values (we have seen this happen). This is another reason to use PROC MEANS or PROC SUMMARY, these PROCs handle missings properly.

--
Paige Miller
AhmedAl_Attar
Rhodochrosite | Level 12

Hi @RedUser77 

I know you have multiple answers to go through, and here is yet another one 

data have;
  infile CARDS4;
  input date yymmdd8. Record_Number;
  format date yymmdd10.;
cards4;
20230101 10
20230102 245
20230121 39
20230201 11
20230202 202
20230301 41
20230321 59
20230302 92
;;;;
run;

/* Create a view to add the month variable */
DATA HAVE_V/VIEW=HAVE_V;
	set have;
	month = put(date,yymon8.);
RUN;

/* Use the DOW loop to calculate monthly totals and Percent changes */
DATA WANT(KEEP=MONTH MON_TOT PCT_CHANGE);
	mon_tot = 0;
	prev_mon_tot = 0;
	
	/* calculate monthly totals */
	DO UNTIL (LAST.MONTH);
		SET HAVE_V;
		BY MONTH NOTSORTED;
		mon_tot + Record_Number;
	END;
	
	/* calculate monthly Percent changes */
	prev_mon_tot = lag(mon_tot);
	pct_change = ifn((prev_mon_tot > 0),100*((mon_tot - prev_mon_tot)/(prev_mon_tot)),0);
	
	OUTPUT;
	FORMAT pct_change 8.2;
RUN;

Note: this solution assumes the original input table/dataset is ordered by date.

 

Hope this helps,

Ahmed 

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 9 replies
  • 734 views
  • 2 likes
  • 6 in conversation