BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
wendy0327
Fluorite | Level 6

Hi - 

Column 1 is date values; column 2 is number of instances; column 3 is person attributable to the instances on the date.

i need to count up monthly totals of instance by person, except that my "months" run from 19th to the 18th (like March 19, 2020 to April 18, 2020).  I've got about 200,000 observations running back to January 2016.

 

I cannot figure out how to set month ranges since they are time periods not reflective of an actual calendar month.

I welcome any suggestions, because i don't even know where to begin.

 

12/7/2019 22 BILL
12/7/2019 1 AMY
11/9/2019 26 AMY
11/9/2019 3 JEFF
10/12/2019 30 AMY
10/12/2019 5 AMY
9/14/2019 16 JEFF
9/14/2019 6 BILL
8/17/2019 19 BILL

 

thanks,

Wendy

1 ACCEPTED SOLUTION

Accepted Solutions
ChrisNZ
Tourmaline | Level 20

Or create a proper format that you reuse any time with any date variable.

data _F;
  retain TYPE 'N' FMTNAME 'month_ends_19th_';
  do Y=2010 to 2030;
    do M= 1 to 12;
      START=mdy(M,20,Y);
      if M=12 then END=mdy(1,19,Y+1);
      else         END=mdy(M+1,19,Y);
      LABEL=catt(put(START,monyy.),'-',put(END,monyy.));
      output;
    end; 
  end; 
  format START END date9.;
run;
proc format cntlin=_F;
run;
proc summary data =HAVE nway;
  class PERSON MONTH;
  var INSTANCES;
  format MONTH month_ends_19th_.;
  output out=SUM sum=;
run;

 

 

View solution in original post

10 REPLIES 10
PaigeMiller
Diamond | Level 26

Create a new variable to indicate month, then run PROC SUMMARY

 

In this example, the new month variable will show APR20 for days March 19, 2020 to April 18, 2020. Naturally, you can pick other ways to display visually the month.

 

/* UNTESTED CODE */
data have2;
    set have;
    if day(date)>=19 then newmonth=intnx('month',date,1,'b');
    else newmonth=date;
    format newmonth monyy.;
run;
proc summary data=have2 nway;
    class person newmonth;
    var instances;
    output out=want sum=;
run;
    
 
 
--
Paige Miller
art297
Opal | Level 21

One way would be to create, then apply, a format. This would work for both proc means (if you only needed the printed result) and proc summary (if you wanted to create an output file):

data have;
  input date mmddyy10. instances person $;
  cards;
12/7/2019 22 BILL
12/7/2019 1 AMY
11/9/2019 26 AMY
11/9/2019 3 JEFF
10/12/2019 30 AMY
10/12/2019 5 AMY
9/14/2019 16 JEFF
9/14/2019 6 BILL
8/17/2019 19 BILL
;
run;

data fmtdata;
  retain fmtname 'date_range' type 'N';
  label=1;
  do start='01jan2016'd to '31dec2020'd;
    end=start;
    if day(start) eq 19 then label+1;
    output;
  end;
run;

proc format cntlin = fmtdata;
run;

proc means data=have sum;
  var instances;
  class person date;
  format date date_range.;
run;

Art, CEO, AnalystFinder.com

 

ballardw
Super User

And another possibly obnoxious approach. Create a NEW variable that has the start of the month. Then use that variable in anything creating groups like Proc Means/ summary/ tabulate/ report.

 

Examine this data example and see if the Moddate aligns with your month start.

data example;
   do date= '01Jan2020'd to '31Jul2020'd;
      moddate = ifn(day(date)<19,intnx('month',date,-1,'b')+18,intnx('month',date,0,'b')+18);
      output;
   end;
   format date moddate date9.;
run;

Use of a format like YYMON or YYMM on the moddate would indicate the start of the month.

 

It might help to show exactly what you expect the result to be for your example data so we aren't quite shooting fish in the forest.

FreelanceReinh
Jade | Level 19

Hi @wendy0327,

 

You can also create a dataset containing the definition of these custom intervals and then refer to this definition later on.

 

Adapting an example shown in the documentation:

/* Create dataset with user-defined date intervals "mymonth" */

options intervalds=(mymonth=mmds);
data mmds(keep=begin season);
start = '19DEC2015'd;
stop  = '19DEC2050'd; /* use a later stop date, if needed */
nmonths = intck('month',start,stop);
do i=0 to nmonths;
  begin = intnx('month',start,i,'s');
  season = month(begin);
  output;
end;
format begin date9.;
run;

/* Create summary statistics per person and interval */

proc sql;
create table want as
select person, intnx('mymonth',date,0) as startdt format=mmddyy10., sum(instances) as total_inst
from have
group by person, startdt;
quit;
Tom
Super User Tom
Super User

Just use INTNX() function to move the date to start of the month in the middle of the range.  Bascially you want to move by zero months when the day of the month is 18 or less and move to the next month when it is more than 18.

data have ;
  input date count name $;
  repmonth=intnx('month',date,day(date)>18);
  informat date mmddyy.;
  format date yymmdd10. repmonth yymm7.;
cards;
12/19/2019 22 BILL
12/7/2019 1 AMY
11/20/2019 26 AMY
11/9/2019 3 JEFF
10/22/2019 30 AMY
10/12/2019 5 AMY
9/14/2019 16 JEFF
9/14/2019 6 BILL
8/17/2019 19 BILL
;

proc print;
run;

proc means sum;
 class repmonth;
  var count;
run;

Results:

Obs          date    count    name    repmonth
 1     2019-12-19      22     BILL    2020M01
 2     2019-12-07       1     AMY     2019M12
 3     2019-11-20      26     AMY     2019M12
 4     2019-11-09       3     JEFF    2019M11
 5     2019-10-22      30     AMY     2019M11
 6     2019-10-12       5     AMY     2019M10
 7     2019-09-14      16     JEFF    2019M09
 8     2019-09-14       6     BILL    2019M09
 9     2019-08-17      19     BILL    2019M08

   Analysis Variable : count

              N
repmonth    Obs             Sum
-------------------------------
 2019M08      1      19.0000000
 2019M09      2      22.0000000
 2019M10      1       5.0000000
 2019M11      2      33.0000000
 2019M12      2      27.0000000
 2020M01      1      22.0000000
-------------------------------
ErikLund_Jensen
Rhodochrosite | Level 12

Hi @wendy0327 

 

The idea here is to create a date value as the first day in any month, where the day value is > 18 in the preceding month. So april 17 is in april, but april 18 in may and so on, and do the summation on the calculated date and person. Note that the summation is done on a date value, even if it is formatted as yyyymm, and because of that it has to be the same day for all dates falling in the given summation-month.

 


data have;
	format date date9.;
	input date mmddyy10. number Person$;
	cards;
12/7/2019 22 BILL
12/7/2019 1 AMY
11/9/2019 26 AMY
11/9/2019 3 JEFF
10/12/2019 30 AMY
10/12/2019 5 AMY
9/14/2019 16 JEFF
9/14/2019 6 BILL
8/17/2019 19 BILL
9/18/2019 16 JEFF
9/22/2019 6 BILL
8/23/2019 19 BILL
;
run;

proc sql;
	create table tmp as
		select distinct
			Person,
			mdy(month(date)+(day(date)>18),1,year(date)) as CountMonth format=yymmn6.,
			sum(number) as SumNumber
		from have
		group by Person, CountMonth
		order by Person, CountMonth;
quit;

summation.gif

 

ChrisNZ
Tourmaline | Level 20

Or create a proper format that you reuse any time with any date variable.

data _F;
  retain TYPE 'N' FMTNAME 'month_ends_19th_';
  do Y=2010 to 2030;
    do M= 1 to 12;
      START=mdy(M,20,Y);
      if M=12 then END=mdy(1,19,Y+1);
      else         END=mdy(M+1,19,Y);
      LABEL=catt(put(START,monyy.),'-',put(END,monyy.));
      output;
    end; 
  end; 
  format START END date9.;
run;
proc format cntlin=_F;
run;
proc summary data =HAVE nway;
  class PERSON MONTH;
  var INSTANCES;
  format MONTH month_ends_19th_.;
  output out=SUM sum=;
run;

 

 

PaigeMiller
Diamond | Level 26

@ChrisNZ great idea, I didn't think of that, but your formats are off by one day. The "month" should start on the 19th and end on the 18th.

--
Paige Miller
ChrisNZ
Tourmaline | Level 20

@PaigeMiller oh no lol.

That will teach me to reply at midnight! 🙂

wendy0327
Fluorite | Level 6

all these suggestions have been awesome.  thank you all so much!

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 10 replies
  • 904 views
  • 4 likes
  • 8 in conversation