My approach would be:
data example; input characterfield $; isnumber = not missing(input(characterfield,?? 32.)); datalines; 3.22 4..2 N/A 1.23e7 ;
Which gets 1 for is a number or 0 for not. I really dislike character Y/N as pretty much requires an if/then to assign and is much harder to count for reporting. Assign a custom format that will show Y for 1 and N for 0 if needed for boss.
The ?? in the INPUT function call suppresses invalid data messages that would otherwise appear in the log for the N/A and 4..2 and similar.
Note that I included a scientific notation to show that SAS will accept such as a number. If you have things that might look like that and should not be a number then the problem needs some expansion in description.
@DavidPhillips2 wrote:
I'm working on a data integrity check to validate if a character field only contains values that are numbers. The goal is to populate have.isnumber with values Y or N depending on if the value is a number or not. I caught instances where there are two decimals in the character field rather than one. E.g. 3..14 instead of 3.14. Is there a way to flag this in a data step?proc sql;create table have(characterField varchar2(10));insert into have (characterField) values ('3.22');insert into have (characterField) values ('4..22');insert into have (characterField) values ('N/A');quit;data want; set have;if characterField = '3.22' then IsNumber = 'Y';if characterField = '4..22' then IsNumber = 'N';if characterField = 'N/A' then IsNumber = 'N';run;
My approach would be:
data example; input characterfield $; isnumber = not missing(input(characterfield,?? 32.)); datalines; 3.22 4..2 N/A 1.23e7 ;
Which gets 1 for is a number or 0 for not. I really dislike character Y/N as pretty much requires an if/then to assign and is much harder to count for reporting. Assign a custom format that will show Y for 1 and N for 0 if needed for boss.
The ?? in the INPUT function call suppresses invalid data messages that would otherwise appear in the log for the N/A and 4..2 and similar.
Note that I included a scientific notation to show that SAS will accept such as a number. If you have things that might look like that and should not be a number then the problem needs some expansion in description.
@DavidPhillips2 wrote:
I'm working on a data integrity check to validate if a character field only contains values that are numbers. The goal is to populate have.isnumber with values Y or N depending on if the value is a number or not. I caught instances where there are two decimals in the character field rather than one. E.g. 3..14 instead of 3.14. Is there a way to flag this in a data step?proc sql;create table have(characterField varchar2(10));insert into have (characterField) values ('3.22');insert into have (characterField) values ('4..22');insert into have (characterField) values ('N/A');quit;data want; set have;if characterField = '3.22' then IsNumber = 'Y';if characterField = '4..22' then IsNumber = 'N';if characterField = 'N/A' then IsNumber = 'N';run;
Thanks for the one-line solution.
If some of your suspect values might be currency, have thousands separators (12,345,678) or percentages you might want to use the COMMA32 informat.
Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.