BookmarkSubscribeRSS Feed
Aldo_Leal
Calcite | Level 5

Good afternoon.

 

I have the next 24 lentgh string like this:

 

hist_payment='012212201223456789999999'

 

Where every character represents the past due number for every single month, so the first number represents the most recent month and their past due number asociated to the number.

 

I have to define if a customer has been cured, and a cured customer is the one who has only 0, 1 or 2 in the first 12 characters (months) of the hist_payment string.

 

This is my work so far:

 

if input(first(compress('9876543210',substr(HIST_PAYMENT,1,12),'k')),15.) <3 then flag_cure=1;
else flag_cure=0;

But is not working at all.

 

What I exactly need is to obeserve if the first 12 substring contains ONLY 0, 1 or 2, if this true then I can create the flag_cured as 1, otherwise the flag_cured is zero.

 

Please help, I've beeing trying to solve this problem for a long time.

7 REPLIES 7
PaigeMiller
Diamond | Level 26
if lengthn(compress(substr(hist_payment, 1,12),'012'))=0 then ... 
--
Paige Miller
Aldo_Leal
Calcite | Level 5

Hello!!

 

Thanks for the effor but is not doing what I want, actually it checked everything in 1 even if there is no info in the database.

Let me put some examples:

 

ACCT                        ACCT_HIST_PAYMET               HIST_PAYMENT_201803

10000000523959      543232212100000000000000    

10000000654614      543211321654321000000000   999999999998765000000000

10000004621718      544322111118777000000000   

10000004650329      544433221221113000000000   321212221111111000000000

10000005537491      543211232122112211212221    100112112213221121321176

 

NOTE: the last customer is complete CURED.

NOTE: any historical payment it should be read from left to right meaning the first character represent the actual momento and from left to right the second represents the last month and so on.

 

The first coulmn represents the actual account number of the customer, the second column represent the historical behavior of payments which is 24 length and character and its view at 201603, the third column explains the historical behavior of payment at March201803 (I have every single historical payment for every month starting in March2016).

 

The third column can be missing meaning the customer is not longer (or at that time) alive or active in my database.

 

If I create a substring for hist_payment201803 I obtain: '321212221111' and I can conclude that this customer is not cured, because in the first number (which explains the behavior payment of March2018) is 3, so this customer is not cured.

 

The verify option, or the countc option only finds the exact chain of '012' but what I really want is to verify is of the substring chain contain any kind of combination of 0 or 1 or 2, e.g.:

 

'000000000000'----> This customer is cured

'000000000001'----> This customer is cured

'000021000110'----> This customer is cured

'111111111111'----> This customer is cured

'222222222222'----> This customer is cured

'012012012012'----> Thus customer is cured

'322122111111'----> THIS CUSTOMER IS NOT CURED because it contains a 3 and the substring can only contains 0 or 1 or 2, but with your example you're only looking for the exact chain of '012'.

 

May I please have any other idea?

 

If my explain is not complete please tell me.

 

I do appreciate your help!!!

novinosrin
Tourmaline | Level 20

Here goes my share of fun

 

data want;
set have;
if countc(substr(hist_payment,1,12),'012')=12 then flag_cure=1;
else flag_cure=0;
run;

/*Or*/

Using Boolean expression

data want;
set have;
flag_cure= countc(substr(hist_payment,1,12),'012')=12 ;
run;

 

Aldo_Leal
Calcite | Level 5

Hello!!

 

Thanks for the effor but is not doing what I want, actually it checked everything in 1 even if there is no info in the database.

Let me put some examples:

 

ACCT                        ACCT_HIST_PAYMET               HIST_PAYMENT_201803

10000000523959      543232212100000000000000    

10000000654614      543211321654321000000000   999999999998765000000000

10000004621718      544322111118777000000000   

10000004650329      544433221221113000000000   321212221111111000000000

10000005537491      543211232122112211212221    100112112213221121321176

 

NOTE: the last customer is complete CURED.

NOTE: any historical payment it should be read from left to right meaning the first character represent the actual momento and from left to right the second represents the last month and so on.

 

The first coulmn represents the actual account number of the customer, the second column represent the historical behavior of payments which is 24 length and character and its view at 201603, the third column explains the historical behavior of payment at March201803 (I have every single historical payment for every month starting in March2016).

 

The third column can be missing meaning the customer is not longer (or at that time) alive or active in my database.

 

If I create a substring for hist_payment201803 I obtain: '321212221111' and I can conclude that this customer is not cured, because in the first number (which explains the behavior payment of March2018) is 3, so this customer is not cured.

 

The verify option, or the countc option only finds the exact chain of '012' but what I really want is to verify is of the substring chain contain any kind of combination of 0 or 1 or 2, e.g.:

 

'000000000000'----> This customer is cured

'000000000001'----> This customer is cured

'000021000110'----> This customer is cured

'111111111111'----> This customer is cured

'222222222222'----> This customer is cured

'012012012012'----> Thus customer is cured

'322122111111'----> THIS CUSTOMER IS NOT CURED because it contains a 3 and the substring can only contains 0 or 1 or 2, but with your example you're only looking for the exact chain of '012'.

 

May I please have any other idea?

 

If my explain is not complete please tell me.

 

I do appreciate your help!!!

Patrick
Opal | Level 21

The verify() function does exactly what you're asking for.

data _null_;
  /* 3 on position 12, verify returns 12 */
  hist_payment='012212201223456789999999';
  check= verify(substrn(hist_payment,1,12),'012');
  put check=;
  /* only 0,1 or 2 in first 12 positions, verify returns 0 */
  hist_payment='012212201220034567899999';
  check= verify(substrn(hist_payment,1,12),'012');
  put check=;
  stop;
run;
novinosrin
Tourmaline | Level 20

Sir, Thank you for the great reminder of VERIFY(). Nice

hashman
Ammonite | Level 13

@Aldo_Leal:

This piece of code returns CURED=1 if the customer is cured and CURED=0 otherwise:

data _null_ ;                                    
  input hist_payment $24. ;                  
  cured = findc (hist_payment, "012", "k") > 12 ;
  put hist_payment cured= ;                      
  cards ;                                        
210120102010508172635401                         
012212201220908172635401                         
123456789012345678901234                         
120120070220123456789012                         
;                                                
run ; 

This will print in the log:

210120102010508172635401 cured=1
012212201220908172635401 cured=1
123456789012345678901234 cured=0
120120070220123456789012 cured=0

Alternatively, the expression for CURED can be coded as:

  cured = findc (hist_payment, "3456789") > 12 ;

Though the most concise and perhaps the cleanest is using the VERIFY function, as suggested by @Patrick:

  cured = verify (hist_payment, "012") > 12 ;

Of course, in this context VERIFY is equivalent to FINDC with the K modifier (minus the cornucopia of the bells and whistles of the latter). For some odd reason, I have always found VERIFY kind of counter-intuitive - perhaps due to its negative logic "find the first position of any character NOT present in the specified list". Logically, the same is true for FINDC with the K modifier; and yet, it feels less counter-intuitive to me - maybe because in this case that same NOT (in the form of the K modifier) must be specified explicitly, while with VERIFY it is implicit.

 

Kind regards

Paul D.

 

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 7 replies
  • 5360 views
  • 4 likes
  • 5 in conversation