- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Good afternoon.
I have the next 24 lentgh string like this:
hist_payment='012212201223456789999999'
Where every character represents the past due number for every single month, so the first number represents the most recent month and their past due number asociated to the number.
I have to define if a customer has been cured, and a cured customer is the one who has only 0, 1 or 2 in the first 12 characters (months) of the hist_payment string.
This is my work so far:
if input(first(compress('9876543210',substr(HIST_PAYMENT,1,12),'k')),15.) <3 then flag_cure=1; else flag_cure=0;
But is not working at all.
What I exactly need is to obeserve if the first 12 substring contains ONLY 0, 1 or 2, if this true then I can create the flag_cured as 1, otherwise the flag_cured is zero.
Please help, I've beeing trying to solve this problem for a long time.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
if lengthn(compress(substr(hist_payment, 1,12),'012'))=0 then ...
Paige Miller
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Hello!!
Thanks for the effor but is not doing what I want, actually it checked everything in 1 even if there is no info in the database.
Let me put some examples:
ACCT ACCT_HIST_PAYMET HIST_PAYMENT_201803
10000000523959 543232212100000000000000
10000000654614 543211321654321000000000 999999999998765000000000
10000004621718 544322111118777000000000
10000004650329 544433221221113000000000 321212221111111000000000
10000005537491 543211232122112211212221 100112112213221121321176
NOTE: the last customer is complete CURED.
NOTE: any historical payment it should be read from left to right meaning the first character represent the actual momento and from left to right the second represents the last month and so on.
The first coulmn represents the actual account number of the customer, the second column represent the historical behavior of payments which is 24 length and character and its view at 201603, the third column explains the historical behavior of payment at March201803 (I have every single historical payment for every month starting in March2016).
The third column can be missing meaning the customer is not longer (or at that time) alive or active in my database.
If I create a substring for hist_payment201803 I obtain: '321212221111' and I can conclude that this customer is not cured, because in the first number (which explains the behavior payment of March2018) is 3, so this customer is not cured.
The verify option, or the countc option only finds the exact chain of '012' but what I really want is to verify is of the substring chain contain any kind of combination of 0 or 1 or 2, e.g.:
'000000000000'----> This customer is cured
'000000000001'----> This customer is cured
'000021000110'----> This customer is cured
'111111111111'----> This customer is cured
'222222222222'----> This customer is cured
'012012012012'----> Thus customer is cured
'322122111111'----> THIS CUSTOMER IS NOT CURED because it contains a 3 and the substring can only contains 0 or 1 or 2, but with your example you're only looking for the exact chain of '012'.
May I please have any other idea?
If my explain is not complete please tell me.
I do appreciate your help!!!
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Here goes my share of fun
data want;
set have;
if countc(substr(hist_payment,1,12),'012')=12 then flag_cure=1;
else flag_cure=0;
run;
/*Or*/
Using Boolean expression
data want;
set have;
flag_cure= countc(substr(hist_payment,1,12),'012')=12 ;
run;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Hello!!
Thanks for the effor but is not doing what I want, actually it checked everything in 1 even if there is no info in the database.
Let me put some examples:
ACCT ACCT_HIST_PAYMET HIST_PAYMENT_201803
10000000523959 543232212100000000000000
10000000654614 543211321654321000000000 999999999998765000000000
10000004621718 544322111118777000000000
10000004650329 544433221221113000000000 321212221111111000000000
10000005537491 543211232122112211212221 100112112213221121321176
NOTE: the last customer is complete CURED.
NOTE: any historical payment it should be read from left to right meaning the first character represent the actual momento and from left to right the second represents the last month and so on.
The first coulmn represents the actual account number of the customer, the second column represent the historical behavior of payments which is 24 length and character and its view at 201603, the third column explains the historical behavior of payment at March201803 (I have every single historical payment for every month starting in March2016).
The third column can be missing meaning the customer is not longer (or at that time) alive or active in my database.
If I create a substring for hist_payment201803 I obtain: '321212221111' and I can conclude that this customer is not cured, because in the first number (which explains the behavior payment of March2018) is 3, so this customer is not cured.
The verify option, or the countc option only finds the exact chain of '012' but what I really want is to verify is of the substring chain contain any kind of combination of 0 or 1 or 2, e.g.:
'000000000000'----> This customer is cured
'000000000001'----> This customer is cured
'000021000110'----> This customer is cured
'111111111111'----> This customer is cured
'222222222222'----> This customer is cured
'012012012012'----> Thus customer is cured
'322122111111'----> THIS CUSTOMER IS NOT CURED because it contains a 3 and the substring can only contains 0 or 1 or 2, but with your example you're only looking for the exact chain of '012'.
May I please have any other idea?
If my explain is not complete please tell me.
I do appreciate your help!!!
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
The verify() function does exactly what you're asking for.
data _null_;
/* 3 on position 12, verify returns 12 */
hist_payment='012212201223456789999999';
check= verify(substrn(hist_payment,1,12),'012');
put check=;
/* only 0,1 or 2 in first 12 positions, verify returns 0 */
hist_payment='012212201220034567899999';
check= verify(substrn(hist_payment,1,12),'012');
put check=;
stop;
run;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Sir, Thank you for the great reminder of VERIFY(). Nice
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
This piece of code returns CURED=1 if the customer is cured and CURED=0 otherwise:
data _null_ ; input hist_payment $24. ; cured = findc (hist_payment, "012", "k") > 12 ; put hist_payment cured= ; cards ; 210120102010508172635401 012212201220908172635401 123456789012345678901234 120120070220123456789012 ; run ;
This will print in the log:
210120102010508172635401 cured=1 012212201220908172635401 cured=1 123456789012345678901234 cured=0 120120070220123456789012 cured=0
Alternatively, the expression for CURED can be coded as:
cured = findc (hist_payment, "3456789") > 12 ;
Though the most concise and perhaps the cleanest is using the VERIFY function, as suggested by @Patrick:
cured = verify (hist_payment, "012") > 12 ;
Of course, in this context VERIFY is equivalent to FINDC with the K modifier (minus the cornucopia of the bells and whistles of the latter). For some odd reason, I have always found VERIFY kind of counter-intuitive - perhaps due to its negative logic "find the first position of any character NOT present in the specified list". Logically, the same is true for FINDC with the K modifier; and yet, it feels less counter-intuitive to me - maybe because in this case that same NOT (in the form of the K modifier) must be specified explicitly, while with VERIFY it is implicit.
Kind regards
Paul D.