BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
DavidPhillips2
Rhodochrosite | Level 12
I'm working on a data integrity check to validate if a character field only contains values that are numbers.  The goal is to populate have.isnumber with values Y or N depending on if the value is a number or not.  I caught instances where there are two decimals in the character field rather than one.  E.g. 3..14 instead of 3.14.  Is there a way to flag this in a data step?  
 
proc sql;
create table have 
(
characterField varchar2(10)
);
insert into have (characterField) values ('3.22');
insert into have (characterField) values ('4..22');
insert into have (characterField) values ('N/A');
quit;
 
data want; set have;
if characterField = '3.22' then IsNumber = 'Y';
if characterField = '4..22' then IsNumber = 'N';
if characterField = 'N/A' then IsNumber = 'N';
run;
1 ACCEPTED SOLUTION

Accepted Solutions
ballardw
Super User

My approach would be:

data example;
  input characterfield $;
  isnumber = not missing(input(characterfield,?? 32.));
datalines;
3.22
4..2
N/A
1.23e7
;

Which gets 1 for is a number or 0 for not. I really dislike character Y/N as pretty much requires an if/then to assign and is much harder to count for reporting. Assign a custom format that will show Y for 1 and N for 0 if needed for boss.

 

The ?? in the INPUT function call suppresses invalid data messages that would otherwise appear in the log for the N/A and 4..2 and similar.

Note that I included a scientific notation to show that SAS will accept such as a number. If you have things that might look like that and should not be a number then the problem needs some expansion in description.

 


@DavidPhillips2 wrote:
I'm working on a data integrity check to validate if a character field only contains values that are numbers.  The goal is to populate have.isnumber with values Y or N depending on if the value is a number or not.  I caught instances where there are two decimals in the character field rather than one.  E.g. 3..14 instead of 3.14.  Is there a way to flag this in a data step?  
 
proc sql;
create table have 
(
characterField varchar2(10)
);
insert into have (characterField) values ('3.22');
insert into have (characterField) values ('4..22');
insert into have (characterField) values ('N/A');
quit;
 
data want; set have;
if characterField = '3.22' then IsNumber = 'Y';
if characterField = '4..22' then IsNumber = 'N';
if characterField = 'N/A' then IsNumber = 'N';
run;

 

View solution in original post

3 REPLIES 3
ballardw
Super User

My approach would be:

data example;
  input characterfield $;
  isnumber = not missing(input(characterfield,?? 32.));
datalines;
3.22
4..2
N/A
1.23e7
;

Which gets 1 for is a number or 0 for not. I really dislike character Y/N as pretty much requires an if/then to assign and is much harder to count for reporting. Assign a custom format that will show Y for 1 and N for 0 if needed for boss.

 

The ?? in the INPUT function call suppresses invalid data messages that would otherwise appear in the log for the N/A and 4..2 and similar.

Note that I included a scientific notation to show that SAS will accept such as a number. If you have things that might look like that and should not be a number then the problem needs some expansion in description.

 


@DavidPhillips2 wrote:
I'm working on a data integrity check to validate if a character field only contains values that are numbers.  The goal is to populate have.isnumber with values Y or N depending on if the value is a number or not.  I caught instances where there are two decimals in the character field rather than one.  E.g. 3..14 instead of 3.14.  Is there a way to flag this in a data step?  
 
proc sql;
create table have 
(
characterField varchar2(10)
);
insert into have (characterField) values ('3.22');
insert into have (characterField) values ('4..22');
insert into have (characterField) values ('N/A');
quit;
 
data want; set have;
if characterField = '3.22' then IsNumber = 'Y';
if characterField = '4..22' then IsNumber = 'N';
if characterField = 'N/A' then IsNumber = 'N';
run;

 

DavidPhillips2
Rhodochrosite | Level 12

Thanks for the one-line solution.

ballardw
Super User

If some of your suspect values might be currency, have thousands separators (12,345,678) or percentages you might want to use the COMMA32 informat.

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 3 replies
  • 556 views
  • 0 likes
  • 2 in conversation