nash_sas
Fluorite | Level 6

Hi,

 

I am trying to bucket the given number of days for each task into the starting month and the months that follow, counting from the given start date.

 

Source -

Task       start_date       number_of_days
Task1      01Feb2017        48
Task2      15Mar2017        60

 

For Task1, Feb2017 has 28 days and the given number_of_days is 48, so 48 - 28 = 20. Those 20 days should fall into the next month, Mar2017.

 

output needed -

Task    start_date    Feb2017    Mar2017    Apr2017
Task1   01Feb2017     28         20
Task2   15Mar2017     0          15         45

 

 

Any suggestions are greatly appreciated !

1 ACCEPTED SOLUTION

Accepted Solutions
collinelliot
Barite | Level 11

 

Below is what I'd do, but you'll probably get better solutions from others. I'm assuming that ending up with 45 days in April 2017 is an error, but the basic approach is to expand the observations to cover the necessary months and then transpose. You could probably do something with arrays, but I think this will be more dynamic. Depending on the real set of data, you might need to tweak it a bit, but it should get you started.

 

 

data have;
    input task $ start_date :date9. number_of_days;
    format start_date date9.;
datalines;
Task1 01Feb2017 48
Task2 15Mar2017 60
;

data want;
    set have;
    /* Calculate the end date based on the number of days. */
    end_date = start_date + number_of_days;
    /* Iterate over how many months the span covers. */
    do i = 1 to (intck('month', start_date, end_date) + 1);
        /* For every month, determine the first and last days. */
        fdom = intnx('month', start_date, i - 1, 'B');
        ldom = intnx('month', start_date, i - 1, 'E');
        /* Days in this month: the overlap of the task span, which runs from
           start_date up to but not including end_date, with fdom..ldom.
           This also handles a span that starts and ends in the same month. */
        days = min(ldom + 1, end_date) - max(fdom, start_date);
        /* Give the observation a name for transposition. */
        name = substr(put(fdom, date9.), 3);
        output;
    end; 
    format end_date fdom ldom date9.;
run;

proc transpose data = want out = want2;
    by task start_date number_of_days;
    var days;
    id name;
run;
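
A quick sanity check that could be run on the long dataset (a sketch, assuming it is still named want and still holds the days variable created in the data step above): the per-month values should add back up to number_of_days for every task.

```sas
/* For each task, compare the requested number of days with the sum of
   the per-month buckets; the two columns should match row by row. */
proc sql;
    select task,
           number_of_days,
           sum(days) as total_days
    from want
    group by task, number_of_days;
quit;
```

If total_days differs from number_of_days for any task, the month boundaries for that row are worth a closer look before trusting the transposed output.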

mkeintz
PROC Star

Here's a minor modification of the above, shortening the program a little:

 

data have;
  input task :$5. start_date :date9.  number_of_days;
  format start_date date9.;
datalines;
task1 01feb2017 48
task2 15mar2017 60
run;

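data need (keep=task start_date number_of_days monthname monthdays);
  set have;
  /* BY 0 adds nothing to d; d is advanced to the next month inside the loop. */
  do d=start_date by 0 until(d>=start_date+number_of_days);
    monthname=left(put(d,yymon7.));
    /* Stop at the end of the span or the first day of the next month, whichever comes first. */
    limit =min(start_date+number_of_days,intnx('month',d,0,'end')+1);
    monthdays=limit-d;
    d=intnx('month',d,1,'beg');
    output;
  end;
run;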
proc transpose data=need out=want ;
  by task start_date number_of_days notsorted;
  var monthdays;
  id monthname;
run;
--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
collinelliot
Barite | Level 11

The only concern I'd have with the alternate solution is excluding the year from the transposed variable name. It might not matter, but if the number of days is ever more than a year or crosses into a new year, you might end up with duplicates that won't transpose or with ambiguous column names.

mkeintz
PROC Star

 

 


@collinelliot wrote:

The only concern I'd have with the alternate solution is excluding the year from the transposed variable name. It might not matter, but if the number of days is ever more than a year or crosses into a new year, you might end up with duplicates that won't transpose or with ambiguous column names.


@collinelliot

 

Good point. To address that issue, I edited the date format used to build monthname, changing it from MONNAME8. to YYMON7.

 

THX
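
To see why the format matters, here is a small sketch (the dataset name fmt_demo, the variable names, and the two sample dates are made up for illustration) that formats the same calendar month in two different years both ways:

```sas
/* Compare the label each format produces for February of two years. */
data fmt_demo;
    do d = '01feb2017'd, '01feb2018'd;
        by_monname = left(put(d, monname8.));  /* month name only   */
        by_yymon   = left(put(d, yymon7.));    /* year plus month   */
        output;
    end;
    format d date9.;
run;

proc print data = fmt_demo noobs;
run;
```

MONNAME8. yields February for both rows, so a span longer than a year would hand PROC TRANSPOSE the same ID value twice within one BY group. YYMON7. yields 2017FEB and 2018FEB, which keeps the column names distinct.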

Ksharp
Super User
data have;
input Task  $     start_date   : date9.    number_of_days;
format start_date date7.;
cards;
Task1      01Feb2017         48
Task2      15Mar2017         60
;
run;
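/* Expand each task to one row per calendar day. */
data temp;
 set have;
 do i=0 to number_of_days-1;
  date=start_date+i;output;
 end;
 keep task start_date date;
run;
/* Count the days per task and month; the MONYY5. format groups the dates by month. */
proc freq data=temp noprint;
 table task*start_date*date/out=temp1 list nopercent nocum ;
 format date monyy5.;
run;
/* Turn the monthly counts into one column per month. */
proc transpose data=temp1 out=temp2(drop=_:);
by task start_date;
var count;
id date;
run;
/* Replace missing month counts with 0. */
proc stdize data=temp2 out=want missing=0 reponly;
run;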
nash_sas
Fluorite | Level 6

Thank you everyone for your prompt solution. 
