DATA Step, Macro, Functions and more

Help to convert this hard coding to a macro

Accepted Solution Solved
Reply
Super Contributor
Posts: 338
Accepted Solution

Help to convert this hard coding to a macro

Hi SAS Forum,

Could you please help me to convert this hard code to a macro.

I am a macro beginner.

/*Below are the hard coded statements*/

data Cycle_FEB_TO_MAR_2010;

set RAW_DATA_SET_3_sorted;

   if '1FEB2010'd le current_date le '30MAR2010'd;

run;

/*This was how I tried to convert the above into a macro as I need to repeat the above same code 24 times*/

%let t_1='1FEB2010'd;

%let t_2='31MAR2010'd;

data Cycle_%substr(&t_1,2,4)_TO_%substr(&t_2,3,5)_%substr(&t_1,5,8);

set RAW_DATA_SET_3_sorted;

   if &t_1 le current_date le &t_2;

run;

Problem:

I warning comes and get stuck.

WARNING: Argument 3 to macro function %SUBSTR is out of range.

Could any one help?

Thanks

Mirisage


Accepted Solutions
Solution
‎09-30-2012 09:18 PM
Respected Advisor
Posts: 3,893

Re: Help to convert this hard coding to a macro

I wasn't sure of how to calculate "Balance %" (percent of which total?) but else below code should do what you asked for. You might want to give the datasets different names now that everything is done in one go.

data RAW_DATA_SET_3;
  inFormat Current_date date9.;
  input Bank_number Account_number $ 4-12 Current_date Product $ 24-36 Balance Arrears_Band $43-49;
  format Current_date date9.;
  cards;
10 1111111   31MAR2010 Personal Loan 555  30 - 60
10 1111111   30APR2010 Personal Loan 200  Current
10 1111111   31MAY2010 Personal Loan 108  1 - 30
30 1111111   31DEC2010 Res. Mortgage 800  Current
30 1111111   31JAN2011 Res. Mortgage 600  90 +
30 1111111   28FEB2011 Res. Mortgage 600  NPNA
30 1111111   17MAR2011 Res. Mortgage 600  writoff
20 333333333 15MAR2010 Personal OD   700  NPNA
20 333333333 30APR2010 Personal OD   30   NPNA
20 333333333 31MAY2010 Personal OD   2    NPNA
20 333333333 30JUN2010 Personal OD   1    NPNA
10 444444444 28FEB2010 Personal Loan 900  1 - 30
10 444444444 31MAR2010 Personal Loan 800  60 - 90
10 444444444 31MAY2010 Personal Loan 20   NPNA
10 444444444 01JUL2010 Personal Loan 25   NPNA
10 444444444 30AUG2010 Personal Loan 32   NPNA
70 666666666 07FEB2010 Personal OD   999  Current
70 666666666 15MAR2010 Personal OD   666  Current
40 777777    08FEB2010 Res. Mortgage 1877 Current
40 777777    28MAR2010 Res. Mortgage 905  Current
80 777777    01FEB2010 Personal Loan 3000 90 +
;
run;

/* create format for date cycle */

proc sql;

  create view v_cyclefmt as

    select distinct

      'Cycle2Month' as fmtname

      ,'n' as type

      ,current_date as start

      ,catx(' ',put(intnx('month',current_date,-1,'b'),monname.),put(intnx('month',current_date,-1,'b'),year4.),'to',put(current_date,monname.)) as label

    from RAW_DATA_SET_3

  ;

quit;

proc format cntlin=v_cyclefmt;
run;

/* create format for sort order of Arrears_Band  */
proc format ;
  invalue ArrearsCode
    'CURRENT' =1
    '1-30'    =2
    '30-60'   =3
    '60-90'   =4
    '90+'     =5
    'NPNA'    =6
    'WRITOFF' =7
    OTHER     =99
  ;
  value ArrearsText
    1='Current'
    2='1 - 30'
    3='30 - 60'
    4='60 - 90'
    5='90 +'
    6='NPNA'
    7='writoff'
    99='misscoded!'
  ;
run;

/*Sorting*/
proc sort data= RAW_DATA_SET_3 out=RAW_DATA_SET_3_sorted;
  by bank_number account_number current_date;
run;


data Cycle_FEB_TO_MAR_2010_iso_;
  set RAW_DATA_SET_3_sorted;
  by bank_number account_number current_date;
  Arrears_Band_CD=input(upcase(compress(Arrears_Band)),ArrearsCode.);
  prior_Arrears_Band_CD = lag( Arrears_Band_CD );
  if intck('month',lag(current_date),current_date) ne 1 then call missing(prior_Arrears_Band_CD);

  if first.account_number or missing(prior_Arrears_Band_CD) then delete;
run;

options missing=' ';
proc tabulate data= Cycle_FEB_TO_MAR_2010_iso_ order=formatted;
  format current_date Cycle2Month. Arrears_Band_CD prior_Arrears_Band_CD ArrearsText.;
  label current_date='Cycle';
  class current_date / order=unformatted ;
  class prior_Arrears_Band_CD / order=unformatted;
  class Arrears_Band_CD / order=unformatted;
  var balance;
  table  current_date, prior_Arrears_Band_CD='', Arrears_Band_CD*(n balance*sum)/ box=prior_Arrears_Band_CD printmiss;
  Title 'Succession of O/S Balance ($ and %) and Accounts (#)';
run;

View solution in original post


All Replies
Respected Advisor
Posts: 3,893

Re: Help to convert this hard coding to a macro

The reason for this error is that you the 3rd argument in a substr() function defines the length of the substring you want to read and not the end position.

substr(<source string>,<start position>,<length>).

%substr(&t_1,5,8); instructs SAS to read string  '1FEB2010'd starting from position 5 and then reading the next 8 characters. Your string is not long enough for this so you're getting an error.

I assume that's what you're actually after: Cycle_%substr(&t_1,3,3)_TO_%substr(&t_2,4,3)_%substr(&t_1,6,4)

In case you just want to split up a large data set into smaller datasets based on date ranges: I can think of more straightforward ways of doing this. Is this what you want to do and are your cycles just consecutive 2 month ranges?

I don't know why you need to split up your data into several smaller datasets. It is often better/easier to code if you leave everything in one big dataset and then subset the data later on as you need it or to use by group processing in procedures (instead of calling them x-times with different datasets).

You could for examle create a cycle date variable which you then use in the by group (or class statement) of procedures. Just to give you the general idea of how to create such a group variable below some sample code:

data RAW_DATA_SET_3_sorted;
  format current_date date9.;
  do current_date='01Jan2010'd to '3Sep2012'd by 27;
    output;
  end;
run;


data want;
  set RAW_DATA_SET_3_sorted;
  by current_date;
  format CycleStartDate CycleEndDate date9.;
  length CycleDSname $21;
  CycleStartDate=intnx('month2.2',current_date,0,'b');
  CycleEndDate=intnx('month2.2',current_date,0,'e');
  CycleDSname=
    cats('Cycle_',put(CycleStartDate,monname3.),'_To_'
          ,put(CycleEndDate,monname3.),'_',year(CycleStartDate)
        );
run;

PROC Star
Posts: 7,363

Re: Help to convert this hard coding to a macro

Mirisage,

Does the following do what you are trying to accomplish?:

data RAW_DATA_SET_3_sorted;

  input current_date date9.;

  cards;

1jan2010

10jan2010

15feb2010

31mar2010

1apr2010

;

%let t_1=1FEB2010;

%let t_2=31MAR2010;

data Cycle_%sysfunc(substr(&t_1,%sysfunc(anyalpha(&t_1)),3))_TO_%sysfunc(substr(&t_2,%sysfunc(anyalpha(&t_2)),3))_%sysfunc(substr(&t_1,%sysfunc(anyalpha(&t_1))+3,4));

  set RAW_DATA_SET_3_sorted;

  if "&t_1"d le current_date le "&t_2"d;

run;

Super Contributor
Posts: 338

Re: Help to convert this hard coding to a macro

Hi Patrick and Art,

Many thanks to both of you.

I think this is the good time to show how inefficiently I have achieved what I wanted to achieve. 

Using the data set (shown at the very bottom, can be run readily), I wanted to create 23 Markov Matrices/Tables for consecutive months, commencing Feb 2010 to Mar 2010 pair, culminating Dec 2011 to Jan 2012 pair.

/*This is the first Table I generated using my novice code provided further down*/

Succession of O/S Balance ($ and %) and Accounts (#) from Feb 2010 to Mar 2010

prior_arrears_band

Areeras_Band

Current

60 - 90

N

Balance

Balance

N

Balance

Balance

Sum

%

Sum

%

Current

2

1571

?

.

.

?

1 - 30

.

.

?

1

800

?

/*This is the second Table I generated*/

Succession of O/S Balance ($ and %)) and Accounts (#) from Mar 2010 to Apr 2010

prior_arrears_band

Areeras_Band

Current

60 - 90

N

Balance

Balance

N

Balance

Balance

Sum

%

Sum

%

30 - 60

1

200

?

.

.

?

NPNA

.

.

?

1

30

?

Next, ..... Apr 2010 to May 2010 table

May 2010 to Jun 2010 table

Jun  2010 to Jul  2010 table

……………………………………………..

…………………………………………………

……………………………………………….

Dec 2011 to Jan 2012 table (this is the last table)

This is the inefficient (but hopefully correct) approach I have taken. (Data set is provided at the very bottom that can be readily run).

/*Sorting*/

proc sort data= RAW_DATA_SET_3 out=RAW_DATA_SET_3_sorted;

   by bank_number account_number current_date;

run;

/*Isolating only accounts occur any day between 1Feb2010 and 31Mar2010 (consecutive pairs of 2 months). Note that some accounts may not occur in pairs*/

data Cycle_FEB_TO_MAR_2010;

set RAW_DATA_SET_3_sorted;

   if '1FEB2010'd le current_date le '31MAR2010'd;

run;

/*sorting again*/

proc sort data=Cycle_FEB_TO_MAR_2010 out= Cycle_FEB_TO_MAR_2010_;

by bank_number account_number current_date;

run;

/*Getting rid of single records. In other words, isolating only consecutive pairs */

data Cycle_FEB_TO_MAR_2010_iso;

set Cycle_FEB_TO_MAR_2010_;

by bank_number account_number current_date;

if not (first.account_number and last.account_number);

run;

/*getting the preceeding month's arrears_band into the succeeding month's row*/

data Cycle_FEB_TO_MAR_2010_iso;

set Cycle_FEB_TO_MAR_2010_iso;

     prior_arrears_band = lag( arrears_band );

run;

/*Removing Preceeding month rows, or in other words, retaining succeeding month's row*/

/*This retained row now has both preceeding month's arrers band and succeeding month's

arrears band*/

data Cycle_FEB_TO_MAR_2010_iso_;

   set Cycle_FEB_TO_MAR_2010_iso;

   if '1MAR2010'd le current_date le '31MAR2010'd;

run;

/*User defined format to control the column and row title orders*/

proc format ;

value $ fmt (multilabel notsorted)

  'Current'='Current'

  '1 - 30'='1 - 30'

  '30 - 60'='30 - 60'

  '60 - 90'='60 - 90'

  '90 +' = '90 +' 

  'NPNA'='NPNA'

  'writoff' = 'writoff'

;

run;

/*This proc tabulate works but row title order not work*/

proc tabulate data= Cycle_FEB_TO_MAR_2010_iso_ order=formatted;

   class prior_arrears_band /PRELOADFMT order=data; /*row CLASS statement*/

   class arrears_band /PRELOADFMT order=data; /*column CLASS statement*/

   var balance;

table prior_arrears_band='', arrears_band*(n balance*sum)/ box=prior_arrears_band;

format arrears_band $fmt.; /*applying the earlier defined format*/

Title 'Succession of O/S Balance ($ and %) and Accounts (#) from Feb 2010 to Mar 2010';

run;

This is the final first table I generated (except balance % columns) using my novice code.

Succession of O/S Balance ($ and %) and Accounts (#) from Feb 2010 to Mar 2010

prior_arrears_band

Arreras_Band

Current

60 - 90

N

Balance

Balance

N

Balance

Balance

Sum

%

Sum

%

Current

2

1571

?

.

.

?

1 - 30

.

.

?

1

800

?

Above markov table says:

  • 2 accounts have cycled from “Current” bucket in FEB2010 (preceeding month) to “Current” bucket in MAR2010 (succeeding month).

   

  • The corresponding $ amount is $1,571 (i.e. 666+905)

  • 1 account has cycled from “1-30” bucket in FEB2010 (preceeding month) to “60-90” bucket in MAR2010 (succeeding month).

   

  • The corresponding $ amount is $800 

Next I have to repeat this exercise for the Mar 2010 to Apr 2010 cycle and so on, culminating Dec 2011 to Jan 2012 cycle.

I know no body can read all these things, but appreciate if any help to make my code efficent, if time permits.

This is dataset used.

data RAW_DATA_SET_3;

inFormat Current_date date9.;

input Bank_number Account_number $ 4-12    Current_date    Product    $ 24-36 Balance    Arrears_Band $43-49;

format Current_date date9.;

cards;

10 1111111   31MAR2010 Personal Loan 555  30 - 60

10 1111111   30APR2010 Personal Loan 200  Current

10 1111111   31MAY2010 Personal Loan 108  1 - 30

30 1111111   31DEC2010 Res. Mortgage 800  Current

30 1111111   31JAN2011 Res. Mortgage 600  90 +

30 1111111   28FEB2011 Res. Mortgage 600  NPNA

30 1111111   17MAR2011 Res. Mortgage 600  writoff

20 333333333 15MAR2010 Personal OD   700  NPNA

20 333333333 30APR2010 Personal OD   30   NPNA

20 333333333 31MAY2010 Personal OD   2    NPNA

20 333333333 30JUN2010 Personal OD   1    NPNA

10 444444444 28FEB2010 Personal Loan 900  1 - 30

10 444444444 31MAR2010 Personal Loan 800  60 - 90

10 444444444 31MAY2010 Personal Loan 20   NPNA

10 444444444 01JUL2010 Personal Loan 25   NPNA

10 444444444 30AUG2010 Personal Loan 32   NPNA

70 666666666 07FEB2010 Personal OD   999  Current

70 666666666 15MAR2010 Personal OD   666  Current

40 777777    08FEB2010 Res. Mortgage 1877 Current

40 777777    28MAR2010 Res. Mortgage 905  Current

80 777777    01FEB2010 Personal Loan 3000 90 +

;

run;

 

Thanks

Mirisae

Solution
‎09-30-2012 09:18 PM
Respected Advisor
Posts: 3,893

Re: Help to convert this hard coding to a macro

I wasn't sure of how to calculate "Balance %" (percent of which total?) but else below code should do what you asked for. You might want to give the datasets different names now that everything is done in one go.

data RAW_DATA_SET_3;
  inFormat Current_date date9.;
  input Bank_number Account_number $ 4-12 Current_date Product $ 24-36 Balance Arrears_Band $43-49;
  format Current_date date9.;
  cards;
10 1111111   31MAR2010 Personal Loan 555  30 - 60
10 1111111   30APR2010 Personal Loan 200  Current
10 1111111   31MAY2010 Personal Loan 108  1 - 30
30 1111111   31DEC2010 Res. Mortgage 800  Current
30 1111111   31JAN2011 Res. Mortgage 600  90 +
30 1111111   28FEB2011 Res. Mortgage 600  NPNA
30 1111111   17MAR2011 Res. Mortgage 600  writoff
20 333333333 15MAR2010 Personal OD   700  NPNA
20 333333333 30APR2010 Personal OD   30   NPNA
20 333333333 31MAY2010 Personal OD   2    NPNA
20 333333333 30JUN2010 Personal OD   1    NPNA
10 444444444 28FEB2010 Personal Loan 900  1 - 30
10 444444444 31MAR2010 Personal Loan 800  60 - 90
10 444444444 31MAY2010 Personal Loan 20   NPNA
10 444444444 01JUL2010 Personal Loan 25   NPNA
10 444444444 30AUG2010 Personal Loan 32   NPNA
70 666666666 07FEB2010 Personal OD   999  Current
70 666666666 15MAR2010 Personal OD   666  Current
40 777777    08FEB2010 Res. Mortgage 1877 Current
40 777777    28MAR2010 Res. Mortgage 905  Current
80 777777    01FEB2010 Personal Loan 3000 90 +
;
run;

/* create format for date cycle */

proc sql;

  create view v_cyclefmt as

    select distinct

      'Cycle2Month' as fmtname

      ,'n' as type

      ,current_date as start

      ,catx(' ',put(intnx('month',current_date,-1,'b'),monname.),put(intnx('month',current_date,-1,'b'),year4.),'to',put(current_date,monname.)) as label

    from RAW_DATA_SET_3

  ;

quit;

proc format cntlin=v_cyclefmt;
run;

/* create format for sort order of Arrears_Band  */
proc format ;
  invalue ArrearsCode
    'CURRENT' =1
    '1-30'    =2
    '30-60'   =3
    '60-90'   =4
    '90+'     =5
    'NPNA'    =6
    'WRITOFF' =7
    OTHER     =99
  ;
  value ArrearsText
    1='Current'
    2='1 - 30'
    3='30 - 60'
    4='60 - 90'
    5='90 +'
    6='NPNA'
    7='writoff'
    99='misscoded!'
  ;
run;

/*Sorting*/
proc sort data= RAW_DATA_SET_3 out=RAW_DATA_SET_3_sorted;
  by bank_number account_number current_date;
run;


data Cycle_FEB_TO_MAR_2010_iso_;
  set RAW_DATA_SET_3_sorted;
  by bank_number account_number current_date;
  Arrears_Band_CD=input(upcase(compress(Arrears_Band)),ArrearsCode.);
  prior_Arrears_Band_CD = lag( Arrears_Band_CD );
  if intck('month',lag(current_date),current_date) ne 1 then call missing(prior_Arrears_Band_CD);

  if first.account_number or missing(prior_Arrears_Band_CD) then delete;
run;

options missing=' ';
proc tabulate data= Cycle_FEB_TO_MAR_2010_iso_ order=formatted;
  format current_date Cycle2Month. Arrears_Band_CD prior_Arrears_Band_CD ArrearsText.;
  label current_date='Cycle';
  class current_date / order=unformatted ;
  class prior_Arrears_Band_CD / order=unformatted;
  class Arrears_Band_CD / order=unformatted;
  var balance;
  table  current_date, prior_Arrears_Band_CD='', Arrears_Band_CD*(n balance*sum)/ box=prior_Arrears_Band_CD printmiss;
  Title 'Succession of O/S Balance ($ and %) and Accounts (#)';
run;

Super Contributor
Posts: 338

Re: Help to convert this hard coding to a macro

Hi Patrick,

This is a tremondous help as well as a very useful new knowledge for me.

Thank you very much again!.

Regards

Mirisage

☑ This topic is SOLVED.

Need further help from the community? Please ask a new question.

Discussion stats
  • 5 replies
  • 1336 views
  • 3 likes
  • 3 in conversation