BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
amanda_cr
Calcite | Level 5

Hi,

 

I want to create a new variable "MonthPaid" that is dependent on two other date variables. Here is how I created the variable for one month (January 2016):

 

data new;
set old;
if (pwrkcode='D' and workcode='P' and ('01jan2016'd <= dtclpaid <='31jan2016'd) and ('01jan2016'd <= dtwork <='31jan2016'd)) 
then MonthPaid='Jan-16';

run;

 

How would I turn this data step into an array/macro to create the variable for 12 months (I want 'MonthPaid' to populate with 'Jan-16', 'Feb-16', 'Mar-16', ........'Dec-16')? Ideally, I'd like to factor in the year, as well, ('MonthPaid would also populate with 'Jan-17', 'Feb-17', etc.) but if that's too complicated, I'm fine having a data step for each year of interest.

 

If there's a better approach, I'm happy to adjust my plan.

 

AC

 

 

1 ACCEPTED SOLUTION

Accepted Solutions
Astounding
PROC Star

OK, using your added cases and using Tom's idea to switch to "2016-01" instead of "01-2016":

 

data new;
set old;
if (pwrkcode='D' and workcode='P') then do;

   if put(dtclpaid, yymmd7.) = put(dtwork, yymmd7.) then MonthPaid=put(dtwork, yymmd7.);

end;

if (wkstat='C' and workcode in ('P','D','PL','RS','PH','PT','DH') and workflag in ('A','G','N'))

the MonthPaid = put(dtclpaid, yymmd7.);

run;

 

There are many ways to format the logic ... just pick one that you find easy to interpret.

View solution in original post

9 REPLIES 9
Tom
Super User Tom
Super User

If you want to convert a date to the first of the month then just use the INTNX() function.  You can then attach a format that only displays the Year and Month.  Or use the PUT() function if you want to store the value as a character string instead, but using a DATE will probably give you more flexibility.

 

IF pwrkcode='D' and workcode='P' 
 and intnx('month',dtclpaid,0,'b') = intnx('month',dtwork ,0,'b') 
THEN MonthPaid = intnx('month',dtclpaid,0,'b')
;
format MonthPaid yymm7. ;
amanda_cr
Calcite | Level 5

 

Wow. This is a whole new way of thinking about the logic. So actually, the entire logic is this:

 

if (pwrkcode='D' and workcode='P' and ('01jan2016'd <= dtclpaid <='31jan2016'd) and ('01jan2016'd <= dtwork <='31jan2016'd))

or
(wkstat='C' and workcode in ('P','D','PL','RS','PH','PT','DH') and workflag in ('A','G','N') and ('01jan2016'd <= dtclpaid <='31jan2016'd))

then MonthPaid='Jan-16';

 

I don't think it would work to replace that second dtclpaid with the intnx format. Correct?

 

 

 

Tom
Super User Tom
Super User

If you want to replace the value with character strings then you could just use one of the available date formats that ignore the day of the month. Or strip out the day of the month from one that does.  If you do store it into a character variable I would recommend using one in YYYYMM format in some form so that it will sort properly with January coming before December.

 

Not sure why you need to test if the date is within the month.  You might want to test if the date is within reason.

amanda_cr
Calcite | Level 5

Ok. You've make me re-think my logic, and I need to discuss this with the business again. Thanks for the feedback and making me think more critically. 

 

Amanda

Astounding
PROC Star

I would suggest changing the outcome variable.  Use four-digit years, and a format that would sort in chronological order such as "01-2016".  At any rate, here is one way to get it all done in one step:

 

data new;
set old;
if (pwrkcode='D' and workcode='P' then do;

   if put(dtclpaid, mmyyd7.) = put(dtwork, mmyyd7.) then MonthPaid=put(dtwork, mmyyd7.);

end;

run;

 

Note that you still need to make decisions:

 

  • Which values to subset later
  • How to handle cases where dtclpaid and dtwork fall into different months
amanda_cr
Calcite | Level 5

Changing the outcome variable is a good idea. I wanted to simplify my code for this discussion, but in doing so I removed some relevant logic. Here is the entire logic I need to use:

 

if (pwrkcode='D' and workcode='P' and ('01jan2016'd <= dtclpaid <='31jan2016'd) and ('01jan2016'd <= dtwork <='31jan2016'd))

or
(wkstat='C' and workcode in ('P','D','PL','RS','PH','PT','DH') and workflag in ('A','G','N') and ('01jan2016'd <= dtclpaid <='31jan2016'd))

then MonthPaid='Jan-16';

 

Note:

I don't want to populate the the 'MonthPaid' variable unless both dtclpaid and dtwork fall in the same month, so this logic is ok.

Astounding
PROC Star

OK, using your added cases and using Tom's idea to switch to "2016-01" instead of "01-2016":

 

data new;
set old;
if (pwrkcode='D' and workcode='P') then do;

   if put(dtclpaid, yymmd7.) = put(dtwork, yymmd7.) then MonthPaid=put(dtwork, yymmd7.);

end;

if (wkstat='C' and workcode in ('P','D','PL','RS','PH','PT','DH') and workflag in ('A','G','N'))

the MonthPaid = put(dtclpaid, yymmd7.);

run;

 

There are many ways to format the logic ... just pick one that you find easy to interpret.

rogerjdeangelis
Barite | Level 11
Using logic in meta data to extract payment data

inspired by
https://goo.gl/orP6AA
https://communities.sas.com/t5/Base-SAS-Programming/How-do-a-I-use-an-array-or-macro-to-replace-my-repeating-code/m-p/343776

You need to cycle through logic involving four arrays
using  date ranges and constants for every record
in the paid dataset.

This can be solved using SQL quite easily but I don't think
that was the solution the op wanted.
HASH may also be possible, but the logic in the
mata data can change.

HAVE  Two datasets meta and paid datasets
====
     HAV1ST with valid data data combinations of DTC and DTW)
     HAV2ND data to be checked against meta data and

This defines the combinations(meta data) of valid  we want to check against

Up to 40 obs WORK.HAV1ST total obs=16

HAV1ST

Obs    PWRKCODE    WORKCODE      DTC        DTW

  1       D           P        JAN2016    JAN2016  * matches a range in data
  2       D           P        JAN2016    MAY2016
  3       D           P        JAN2016    JUL2016
  4       D           P        JAN2016    AUG2016

  5       D           P        MAY2016    JAN2016
  6       D           P        MAY2016    MAY2016  * matches
  7       D           P        MAY2016    JUL2016
  8       D           P        MAY2016    AUG2016

  9       D           P        JUL2016    JAN2016
 10       D           P        JUL2016    MAY2016
 11       D           P        JUL2016    JUL2016  *
 12       D           P        JUL2016    AUG2016

 13       D           P        AUG2016    JAN2016  * no match
 14       D           P        AUG2016    MAY2016  * no match
 15       D           P        AUG2016    JUL2016  * no match
 16       D           P        AUG2016    AUG2016  * no match

HAV22ND (This is the main dataset)

Obs    PWRKCODE    WORKCODE    DTCLPAID      DTWORK

 1        D           P        15JAN2016    15JAN2016  * this is valid  (in HAV1ST month January)
 2        D           P        15JAN2016    22JAN2016  * this is valid   (in HAV1ST)

 3        X           P        15MAR2016    15MAR2016  * invalid not in HAV1ST
 4        X           P        15MAR2016    22MAR2016  * invalid not in HAV1ST

 5        D           P        15MAY2016    15MAY2016  * this is valid  (in HAV1ST)
 6        D           P        15MAY2016    22MAY2016  * this is valid   (in HAV1ST)

 7        D           P        15JUL2016    15JUL2016
 8        D           P        15JUL2016    22JUL2016

WANT   (Months where customers paid)
====

Macro variable

  Month_paid = MONTHPAID=JAN2016 MAY2016 JUL2016

DETAILS

You need to cycle through logic involving four arrays
using  date ranges and constants for every record
in the paid dataset.
Note I only used 31 day months to simplify the problem.

    if     pwrkcode=pwr[mmyy]
       and workcode=wrk[mmyy]
       and (dtclower[mmyy] <= dtclpaid <=dtcupper[mmyy])
       and (dtwlower[mmyy] <= dtwork   <=dtwupper[mmyy])
            then Month=dtc[mmyy];

*                _                  _       _
 _ __ ___   __ _| | _____        __| | __ _| |_ __ _
| '_ ` _ \ / _` | |/ / _ \_____ / _` |/ _` | __/ _` |
| | | | | | (_| |   <  __/_____| (_| | (_| | || (_| |
|_| |_| |_|\__,_|_|\_\___|      \__,_|\__,_|\__\__,_|

;

proc datasets lib=work kill;
run;quit;

data hav1st;
  retain pwrkcode 'D'  workcode 'P';
  do  dtc = 'JAN2016','MAY2016','JUL2016','AUG2016';
    do  dtw = 'JAN2016','MAY2016','JUL2016','AUG2016';
     output;
    end;
  end;
run;quit;

data hav2nd;

  retain pwrkcode 'D'  workcode 'P' dtclpaid '01jan2016'd dtwork '31jan2016'd;
  format dtclpaid dtwork date9.;
  do  dtclpaid = '15jan2016'd,'15mar2016'd,'15may2016'd,'15jul2016'd;
      if dtclpaid = '15mar2016'd then pwrkcode='X';else pwrkcode='D';
      dtwork = dtclpaid;
      output;
      dtwork = dtwork + 7;
      output;
  end;

run;quit;

*          _       _   _
 ___  ___ | |_   _| |_(_) ___  _ __
/ __|/ _ \| | | | | __| |/ _ \| '_ \
\__ \ (_) | | |_| | |_| | (_) | | | |
|___/\___/|_|\__,_|\__|_|\___/|_| |_|

;
%symdel dtc dtw pwr wrk monthpaid;

data _null_;

   * note it is good idea to do the sql code
     here because macro variable sqlobs is very volatile;
   * create data for logic arrays;
   if _n_=0 then do;
      %sysfunc(dosubl('
          proc sql;
            select
               quote(min(dtc))
              ,quote(min(dtw))
              ,quote(min(pwrkcode))
              ,quote(min(workcode))
            into
              :dtc separated by ','
             ,:dtw separated by ','
             ,:pwr separated by ','
             ,:wrk separated by ','
            from
               hav1st
            group
               by  dtc
                  ,dtw
                  ,pwrkcode
                  ,workcode
          ;quit;
     '));
   end;

   length monthpaid $1000;
   retain monthpaid ' ';

   array dtc[&sqlobs] $7 (&dtc.) ;
   array dtw[&sqlobs] $7 (&dtw.)   ;
   array pwr[&sqlobs] $7 (&pwr.)   ;
   array wrk[&sqlobs] $7 (&wrk.)   ;

   set hav2nd end=dne;
   format _all_;

   do mmyy=1 to &sqlobs;

      dtc_1=input(cats("01",dtc[mmyy]),date9.);
      dtc_2=input(cats("31",dtc[mmyy]),date9.);
      dtw_1=input(cats("01",dtw[mmyy]),date9.);
      dtw_2=input(cats("31",dtw[mmyy]),date9.);

      if     pwrkcode=pwr[mmyy]
         and workcode=wrk[mmyy]
         and (dtc_1 <= dtclpaid <=dtc_2)
         and (dtw_1 <= dtwork   <=dtw_2)
              then Month=dtc[mmyy];

      if month ne '' then leave;

   end;

   if indexw(monthpaid,month)=0 then monthpaid=catx(' ',monthpaid,month);

   if dne then call symputx('monthpaid',monthpaid);

run;quit;

%put &=monthpaid;

MONTHPAID=JAN2016 MAY2016 JUL2016


SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 9 replies
  • 1717 views
  • 0 likes
  • 4 in conversation