BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Solly7
Pyrite | Level 9

Hi , I need help in summing up leave_days monthly for each user, and if the user did not take for leave on a particular calender_month, 0 value should display from the leave_days column corresponding to the calender_month which they did not take leave

 

Note that if user did not take leave, there will be no record on the leave table and if for example user took 2 days  leave on the same calender_month, then 2 separate records for the same calender_month with will appear in the leave table. 

                                               

                                                                  Data Have

data users;
  input users $50.;
  datalines;
Andrew Lup
Mike Mora
Farren Seb
;

data leave;
  input calender_month leave_days username $50.;
  datalines;
202101 1  Andrew Lup
202101 1  Andrew Lup
202102 1  Andrew Lup
202107 1  Andrew Lup
202101 1  Mike Mora
202101 1  Mike Mora
202101 1  Mike Mora
202107 1  Mike Mora
202107 1  Mike Mora
;

 

                                         data want

 

calender_month           username            leave_days

202101                        Andrew Lup               2

202102                        Andrew Lup               1

202103                       Andrew Lup                0

202104                       Andrew Lup                0

202105                       Andrew Lup                0

202106                       Andrew Lup                0

202107                       Andrew Lup                1

202101                       Mike Mora                  3

202102                       Mike Mora                  0

202103                       Mike Mora                  0

202104                       Mike Mora                  0

202105                       Mike Mora                  0

202106                       Mike Mora                  0

202107                       Mike Mora                  2

202101                       Farren Seb                 0

202102                       Farren Seb                 0

202103                       Farren Seb                 0

202104                       Farren Seb                 0

202105                       Farren Seb                 0

202106                       Farren Seb                 0

202107                       Farren Seb                 0

 

1 ACCEPTED SOLUTION

Accepted Solutions
Kurt_Bremser
Super User

First of all, NEVER store a date value like this. NEVER.

Always store date and time related values as such in SAS (counts of days or seconds, starting at 1960-01-01, or midnight for pure times).

So your dataset leave has to look like this:

data leave;
  input calender_month :yymmn6. leave_days username &:$50.;
  format calender_month yymmn6.;
  datalines;
202101 1  Andrew Lup
202101 1  Andrew Lup
202102 1  Andrew Lup
202107 1  Andrew Lup
202101 1  Mike Mora
202101 1  Mike Mora
202101 1  Mike Mora
202107 1  Mike Mora
202107 1  Mike Mora
;

Next, create a dataset that has all months for all users:

%let start=%sysfunc(inputn(20210101,yymmdd8.));
%let end=%sysfunc(inputn(20210731,yymmdd8.));

data all_months;
set users;
format calender_month yymmn6.;
calender_month = &start.;
do while (calender_month le &end.);
  output;
  calender_month = intnx('month',calender_month,1,"b");
end;
run;

Then, join that, and use COALESCE:

proc sql;
create table want as
  select
    t1.users as username,
    t1.calender_month,
    coalesce(sum(t2.leave_days),0) as leave_days
  from all_months t1
  left join leave t2
  on t1.users = t2.username and t1.calender_month = t2.calender_month
  group by t1.users, t1.calender_month
 ;
 quit;

 

 

Edit: changed CASE to COALESCE

View solution in original post

6 REPLIES 6
andreas_lds
Jade | Level 19

What have you tried?

Is leave_days = 1 for all obs in your real data?

Why is calender_month not a formatted sas-date?

Which months do expect to see in want? Are minimum and maximum taken from the data?

Solly7
Pyrite | Level 9
Hi,
thanks for your reply, please see below answers to your questions

What have you tried? I have tried to use coalesce function with sum but did not get desired output as stated on my data want
Is leave_days = 1 for all obs in your real data? YES it is
Which months do expect to see in want? Are minimum and maximum taken from the data? Im working only on data from 202101 to 202112.
andreas_lds
Jade | Level 19

proc summary could be used to, all you need are two formats: one containing all users, the other containing all months you want to see in the result. Building the format for the names is easy:

data work.UserFmt;
   set users(rename=(users = start));
   
   length fmtname $ 32 label $ 50;
   retain fmtname '$users';

   label = start;
run;

proc format cntlin=work.UserFmt;
run;

For the month-dataset, i assume that calender_month has been fixed, so that it is a sas date:

data work.MonthFmt;   
   length 
      FmtName $ 32 
      start label 8
   ;

   retain FmtName 'MonthFmt';  
   format start label yymmn6.; /* just for better readability */

   start = '01Jan2021'd;

   do while (start <= '01Jul2021'd);
      label = Start;
      output;
      start = intnx('month', start, 1, 'b');
   end;
run;

proc format cntlin=work.MonthFmt;
run;

Counting days without using leave_days

proc summary data=work.leave completetypes nway;   
   class username calender_month / preloadfmt;
   format username $users. calender_month MonthFmt.;
   output out=work.want(drop=_type_ rename=(_freq_=leave_days));
run;

proc datasets library= work nolist;
   modify want;
      format calender_month yymmn6.;
quit;

Well, a lot more code than the solution suggested by @Kurt_Bremser, so most likely just a show room for proc summary and formats 😉

Kurt_Bremser
Super User

First of all, NEVER store a date value like this. NEVER.

Always store date and time related values as such in SAS (counts of days or seconds, starting at 1960-01-01, or midnight for pure times).

So your dataset leave has to look like this:

data leave;
  input calender_month :yymmn6. leave_days username &:$50.;
  format calender_month yymmn6.;
  datalines;
202101 1  Andrew Lup
202101 1  Andrew Lup
202102 1  Andrew Lup
202107 1  Andrew Lup
202101 1  Mike Mora
202101 1  Mike Mora
202101 1  Mike Mora
202107 1  Mike Mora
202107 1  Mike Mora
;

Next, create a dataset that has all months for all users:

%let start=%sysfunc(inputn(20210101,yymmdd8.));
%let end=%sysfunc(inputn(20210731,yymmdd8.));

data all_months;
set users;
format calender_month yymmn6.;
calender_month = &start.;
do while (calender_month le &end.);
  output;
  calender_month = intnx('month',calender_month,1,"b");
end;
run;

Then, join that, and use COALESCE:

proc sql;
create table want as
  select
    t1.users as username,
    t1.calender_month,
    coalesce(sum(t2.leave_days),0) as leave_days
  from all_months t1
  left join leave t2
  on t1.users = t2.username and t1.calender_month = t2.calender_month
  group by t1.users, t1.calender_month
 ;
 quit;

 

 

Edit: changed CASE to COALESCE

Solly7
Pyrite | Level 9
Thanks a lot Kurt..always learning a lot from you

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 6 replies
  • 523 views
  • 4 likes
  • 3 in conversation