BookmarkSubscribeRSS Feed
seohyeonjeong
Obsidian | Level 7

Hi everyone,

 

I want to find an easy way to remove rows with non-numeric values in a column.

I only have to write the code in PROC SQL and there are non-English characters in the values.

 

For example,

 

[Current data] - real codes have a lot of rows..

custNo, rate

1000    0.6            

1002  0.6    

1004  0.8

1005  確認要

1007  案内通り

 

[Result Data]

custNo, rate

1000    0.6            

1002  0.6    

1004  0.8

 

I want to delete rows with non-numeric values in the rate column,

cuz I have to convert values in the "rate" to numeric values for making a formula.

 

If I get an answer, I will appreciate it.

 

Thanks

 

4 REPLIES 4
PaigeMiller
Diamond | Level 26
proc sql;
    create table want as select * from have
    where not missing(input(rate,best10.));
quit;
--
Paige Miller
Amir
PROC Star

Hi @seohyeonjeong,

 

The below code uses the kcompress() function to remove any "-", "." and digits ("d") in the variable rate, and if there is nothing left then the record is output. This obviously assumes you don't have any invalid rates values such as "---", or ".....", or "-.-.-", etc.:

 

proc sql noprint;
	create table want as
	select *
	from have
	where kcompress(rate, '-.','d') is missing
	;
quit;

 

If you choose to use the input() function (as per @PaigeMiller's post) then I recommend you use the "?" modifier as per the documentation, for example:

 

proc sql;
    create table want as select * from have
    where not missing(input(c_height,? best10.));
quit;

 

 

 

Kind regards,

Amir.

 

Edit: Replaced references to compress() with kcompress(), as per post from @Patrick.

Patrick
Opal | Level 21

@Amir You need to use kcompress() and not compress() with MBCS data or the results can become "unexpected".

Tom
Super User Tom
Super User

Why remove them? Why not just convert the values into a number and that then values that are invalid as a number will have missing values and be ignored when calculating means and other statistics.

The ?? will suppress the errors when invalid strings are found. 32 is the maximum width the normal numeric informat supports. The INPUT() function does not care if the string being converted is shorter than the width used in the INFORMAT being used to convert it.

data want;
  set have;
  rate_n = input(rate,??32.);
run;

SAS Innovate 2025: Call for Content

Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!

Submit your idea!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 4 replies
  • 4078 views
  • 8 likes
  • 5 in conversation