Hi Everyone,
I have a business problem where I wanted to perform patter check on the column.
I wanted to check if the values are in a particular format like decimal20.5. It should fail if the format is different from it.
data have;
input col1;
datalines;
0.01238788
61.15612
123.12
12.1244
12345.125665
;
run;
- > using below solution:
%let pattern=^[0-9]\d{1,20}(\.\d{1,10})?%?$;
data want;
set have;
pattern_val=pxparse("/&pattern./");
v1=prxmatch(%nrbquote(pattern_val),col1);
if prxmatch(%nrbquote(pattern),col1) gt 0 then pass=0;
else pass=1;
run;
0- means the value is in decimal 20.5 format
1- other than that
Note:
I am not much familiar with regex and took the expression from internet.
I have to use the snippet in my other macro codes so need to make it dynamic with pattern code.
Any help is very much appreciated.
Thanks,
No too sure why you use macro functions.
Also note that [0-9] is the same as \d
This is simpler:
%let pattern=^\d{1,20}(\.\d{1,10})?%?$;
data WANT;
set HAVE;
PASS = prxmatch("/&pattern/",strip(COL1)) > 0;
run;
What doesn't work?
Lastly, Use the appropriate icon to paste your SAS code.
Regular expressions are for strings, not numbers.
Did you mean to create your original COL1 variable as a character string?
data have;
input col1 $40. ;
datalines;
0.01238788
61.15612
123.12
12.1244
12345.125665
;
What do you think the regular expression you have means?
^[0-9]\d{1,20}(\.\d{1,10})?%?$
What does DECIMAL20.5 mean? Is that like the SAS format 20.5 which means 20 character long with the last 6 being the decimal point and 5 digits to the right of the decimal point?
@msr1 wrote:
Yes it is format 20.5 with 5 digits after decimal.
I am not sure about the regex, I took it from the internet.
My concern is, I need to perform validation on my data, and need to verify it all the values are in format 20.5, anything apart from that will fail.
Its not on one table, have to do it for multiple table
Are your fields numeric? Is the 20.5 the SAS format or the Teradata field definition. They are different. A DECIMAL(20,5) field in Teradata has space for 20 digits of which 5 are after the decimal place. A SAS format of 20.5 has space for just 19 digits or which 5 are after the decimal place. Also note that SAS stores all numbers as floating point numbers and so it cannot store 30 or even 20 digits precisely. If you actually had a field defined as DECIMAL(30,5) in Teradata you would either need to convert it to FLOAT and lose some of the decimal digit precision or convert it to character and treat it as such in SAS.
To test if a number you have it SAS will "fit" into a specific DECIMAL field in Teradata you probably want to use arithmetic and not regular expressions to check. So if the maximum number of decimal places is 5 then round the value to that many decimal places. If the maximum number of decimal places to the left of the decimal place is 15 then make sure the magnitude of the number is smaller than 10**15.
If all you are wanting to do is flag values with values upto 5 decimal places, then below should suffice?
Also, the DATA step with PRX functions have errors (example pattern_val is listed as pattern in the IF statements).
data have;
input col1;
decimal = length(scan(strip(put(col1,best20.)),2,"."));
if decimal >= 5 then
pass = 1;
else pass=0;
drop decimal;
datalines;
0.01238788
61.15612
123.12
12.1244
12345.125665
;
run;
/* Do not need the below */
%let pattern=^[0-9]\d{1,20}(\.\d{1,10})?%?$;
data want;
set have;
pattern_val=prxparse("/&pattern./");
v1=prxmatch(%nrbquote(pattern_val),col1);
if prxmatch(%nrbquote(pattern_val),col1) gt 0 then
pass=0;
else pass=1;
run;
Okay see below that flags any value that is ^ length 20 and ^decimal 5:
data have;
input col1;
col = length(put(col1,best20.));
decimal = length(scan(strip(put(col1,best20.)),2,"."));
if col = 20 and decimal = 5 then
pass = 1;
else pass=0;
drop decimal col;
datalines;
0.01238788
61.15612
123.12
12.1244
12345.125665
;
run;
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.