Hello,
I'm looking for guidance on writing code that generates modified versions of a date, specifically the customer's date of birth (dob). The purpose is to pinpoint potential fraud, since a dob should never change except through a data entry error or fraud.
Example variables: old_birthday: 01jan1980 | new_birthday: 03mar1982.
The birthday variables use the date9. format, and I'm on SAS 9.4. I need to flag records where the change in dob is low risk, such as those meeting either of the two conditions below.
Starting from the first variable (old_birthday):
A-Reverse the last two digits of the year. Example: have 05jan1967, want 05jan1976
B-Swap the day with the month. Example: have 05jan1980, want 01may1980
If want=new_birthday then do;High_Risk='N';end;
Any suggestions are highly appreciated, thanks in advance,
Test against dateA and dateB given by:
data _null_;
do date = '05jan1967'd,'05jan1980'd;
dateA = mdy(month(date), day(date),
100*int(year(date)/100) + 10*mod(year(date), 10) + mod(int(year(date)/10), 10));
dateB = mdy(day(date), month(date), year(date));
put (date datea dateb) (=yymmdd10.);
end;
run;
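As a rough sketch of how those two expressions could be applied to the variables from the question, the step below builds both candidate dates from old_birthday and sets High_Risk. The data set names have and flagged, the helper variables yr, wantA and wantB, and the choice to mark every other change as 'Y' are assumptions made for illustration only.

```sas
/* Sample rows: the two low-risk patterns plus an unrelated change */
data have;
  format old_birthday new_birthday date9.;
  old_birthday='05jan1967'd; new_birthday='05jan1976'd; output;
  old_birthday='05jan1980'd; new_birthday='01may1980'd; output;
  old_birthday='01jan1980'd; new_birthday='03mar1982'd; output;
run;

data flagged;
  set have;
  format wantA wantB date9.;
  yr = year(old_birthday);
  /* A: reverse the last two digits of the year.                 */
  /* MDY returns missing if the result is not a valid date,      */
  /* for example 29feb moved to a non-leap year.                 */
  wantA = mdy(month(old_birthday), day(old_birthday),
              100*int(yr/100) + 10*mod(yr, 10) + mod(int(yr/10), 10));
  /* B: swap day and month; only possible when the day is 1-12   */
  if day(old_birthday) <= 12 then
    wantB = mdy(day(old_birthday), month(old_birthday), yr);
  /* Assumption: any change that matches neither pattern is 'Y'  */
  if new_birthday = wantA or new_birthday = wantB then High_Risk = 'N';
  else High_Risk = 'Y';
  drop yr;
run;
```

The IN operator only accepts constants, which is why the comparison against wantA and wantB is written with OR.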
Convert the dates back to character values, and then it becomes a string-manipulation problem. This is just one of the possible solutions:
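data have;
  format old_dt new_dt date9.;
  old_dt='05jan1969'd;new_dt='5jan1996'd;output;
  old_dt='5jan1980'd;new_dt='1may1980'd;output;
run;
data want;
  set have;
  /* yymmdd8. writes yy-mm-dd, so only the last two digits of the year are compared */
  _old=put(old_dt,yymmdd8.);
  _new=put(new_dt,yymmdd8.);
  if _old ne _new then do;
    /* flag_a: swap the two leading (year) digits of the old date */
    if _new=prxchange('s/(^\d)(\d)/$2$1/oi',-1,_old) then flag_a=1;
    /* flag_b: swap the trailing -mm and -dd groups of the old date */
    if _new=prxchange('s/(-\d\d)(-\d\d$)/$2$1/oi',-1,_old) then flag_b=1;
  end;
run;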
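Because yymmdd8. keeps only two year digits, a change of century would go unnoticed by the comparison above. A possible variant, sketched here with plain SUBSTR calls on the four-digit yymmddn8. form, reuses the have data set from the step above; the names want4, _revyear and _swapdm are made up for this example.

```sas
data want4;
  set have;
  length _old _new _revyear _swapdm $8;
  /* yymmddn8. writes yyyymmdd with no separators, e.g. 19690105 */
  _old = put(old_dt, yymmddn8.);
  _new = put(new_dt, yymmddn8.);
  /* keep the century, swap the 3rd and 4th characters of the year */
  _revyear = cats(substr(_old,1,2), substr(_old,4,1),
                  substr(_old,3,1), substr(_old,5,4));
  /* keep the year, swap the mm and dd pairs */
  _swapdm = cats(substr(_old,1,4), substr(_old,7,2), substr(_old,5,2));
  /* flags are 1 when the pattern matches and the date changed, else 0 */
  flag_a = (_new ne _old and _new = _revyear);
  flag_b = (_new ne _old and _new = _swapdm);
run;
```

Unlike the regex version, the flags here are 0 rather than missing when a pattern does not match.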
Very interesting! Could you help me add 3 more flags to your code?
-One flag each for when the difference between the old date and the new date is either -1 or 1 in the day, the month, or the year. Here's the code I had created:
dd=day(new_birthday)-day(old_birthday);
mm=month(new_birthday)-month(old_birthday);
yy=year(new_birthday)-year(old_birthday);
if dd in (-1,1) then FLAG_D=1;
if mm in (-1,1) then FLAG_M=1;
if yy in (-1,1) then FLAG_Y=1;
I'm fairly new to SAS. Thank you.
I think this code checks whether the new and old days, months and years are within 1 of each other, i.e. it's checking for a different kind of typo than requested in the original post (e.g. typing 7 when you meant 8), but it is still very relevant as it's likely to be low risk.
Correct, it will be included.
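If you are looking for typos, you could use the edit distance with a small cutoff value:
match = complev(put(date1, yymmdd10.), put(date2, yymmdd10.), 2) <= 1;
Look at the description of the COMPLEV distance function in the documentation. Note that when the true distance exceeds the cutoff, COMPLEV returns the cutoff itself, so the cutoff has to be larger than the threshold you compare against. With a cutoff of 1, the test <= 1 would always be true.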
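To get a feel for the edit-distance idea, a small test step along the following lines prints the Levenshtein distance between one date and a few candidate changes. The dates and the variable name dist are illustrative only, and the optional cutoff argument is left out so the raw distance is visible.

```sas
data _null_;
  date1 = '05jan1980'd;
  /* one-digit change in day, one-digit change in year, unrelated date */
  do date2 = '06jan1980'd, '05jan1981'd, '03mar1982'd;
    dist  = complev(put(date1, yymmdd10.), put(date2, yymmdd10.));
    /* treat a single-character edit as a likely typo */
    match = (dist <= 1);
    put (date1 date2) (=yymmdd10.) dist= match=;
  end;
run;
```

A transposition of two digits counts as two edits under Levenshtein distance, so swapped digits would need a threshold of 2 or a separate check.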
*set everything that has different birthdays as high risk;
if old_birthday ne new_birthday then flag="highrisk";
else flag="lowrisk";
*set those with day and month swapped (same year) back to low risk;
if month(new_birthday)=day(old_birthday) and day(new_birthday)=month(old_birthday)
and year(new_birthday)=year(old_birthday) then flag="lowrisk";
*separate out the last and second last digits in year;
yearold=year(old_birthday);
yearnew=year(new_birthday);
yearlastdigitnew= yearnew-int(yearnew/10)*10;
yearlastdigitold=yearold-int(yearold/10)*10;
year2ndlastdigitnew=int(yearnew/10)-int(yearnew/100)*10;
year2ndlastdigitold=int(yearold/10)-int(yearold/100)*10;
*set those with the last two year digits reversed to low risk (day, month and century are not re-checked here);
if (yearlastdigitnew=year2ndlastdigitold) and (yearlastdigitold=year2ndlastdigitnew) then flag="lowrisk";