My approach would be:
data example;
   input characterfield $;
   isnumber = not missing(input(characterfield, ?? 32.));
datalines;
3.22
4..2
N/A
1.23e7
;
This gives 1 when the value is a number and 0 when it is not. I really dislike character Y/N flags, as they pretty much require an IF/THEN to assign and are much harder to count for reporting. If the boss needs to see Y and N, assign a custom format that displays Y for 1 and N for 0.
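A minimal sketch of that custom format, assuming the EXAMPLE data set from the step above. The format name YN is just one I made up for illustration; pick whatever name suits you.

```sas
/* Hypothetical format name: display the numeric flag as Y or N */
proc format;
   value yn 1 = 'Y'
            0 = 'N';
run;

/* The stored value stays 0/1; only the displayed text changes */
proc print data=example;
   format isnumber yn.;
run;

/* Because the flag is numeric, SUM is the count of valid values
   and MEAN is the proportion that are valid */
proc means data=example n sum mean;
   var isnumber;
run;
```

This is why I prefer the numeric flag: the report can show Y/N while the counting needs no recoding at all.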
The ?? in the INPUT function call suppresses the invalid data messages that would otherwise appear in the log for values such as N/A and 4..2, and it also keeps the automatic variable _ERROR_ from being set to 1.
Note that I included a value in scientific notation (1.23e7) to show that SAS will accept it as a number. If you have values that look like that but should not count as numbers, then the problem description needs to be expanded.
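If it turns out that scientific notation should be rejected, one possible sketch is to add a character check with the VERIFY function on top of the INPUT test. This assumes that only digits, a sign and a decimal point are acceptable, which is my guess at a rule and not something stated in the question. The names EXAMPLE2 and ISPLAIN are made up for illustration.

```sas
data example2;
   set example;
   /* Assumption: a "plain" number may contain only digits, + or -, and a decimal point. */
   /* VERIFY returns 0 when every character of the first argument is in the list.       */
   isplain = (not missing(input(characterfield, ?? 32.)))
             and (verify(strip(characterfield), '0123456789.+-') = 0);
run;
```

The INPUT test still catches things like 4..2 that pass the character check, and the VERIFY test catches the letter in 1.23e7, so both conditions are needed.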
@DavidPhillips2 wrote:
I'm working on a data integrity check to validate if a character field only contains values that are numbers. The goal is to populate have.isnumber with values Y or N depending on if the value is a number or not. I caught instances where there are two decimals in the character field rather than one. E.g. 3..14 instead of 3.14. Is there a way to flag this in a data step?

proc sql;
create table have(characterField varchar2(10));
insert into have (characterField) values ('3.22');
insert into have (characterField) values ('4..22');
insert into have (characterField) values ('N/A');
quit;

data want;
set have;
if characterField = '3.22' then IsNumber = 'Y';
if characterField = '4..22' then IsNumber = 'N';
if characterField = 'N/A' then IsNumber = 'N';
run;
Thanks for the one-line solution.
If some of your suspect values might be currency amounts, contain thousands separators (12,345,678), or be percentages, you might want to use the COMMA32. informat instead.
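A sketch of the same check with that informat swapped in. The data values and the data set name EXAMPLE3 are invented for illustration, and the longer $20. length is only there so the values are not truncated.

```sas
data example3;
   input characterfield :$20.;
   /* Same test as before, but COMMA32. tolerates commas, dollar signs and percent signs */
   isnumber = not missing(input(characterfield, ?? comma32.));
datalines;
12,345,678
$1,234.50
45%
4..2
N/A
;
```

As far as I know from the documentation, COMMAw.d also strips embedded blanks and hyphens, so it is more permissive than the plain 32. informat. Whether that is acceptable depends on what should count as a number in your data.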