BookmarkSubscribeRSS Feed
Phil_NZ
Barite | Level 11

Hi SAS Users,

 

Today I ran into a problem when trying to filter my data regarding keeping missing observation

 

I want to keep observation where:

 s22 >=0 

and (s21>=0 or s21 is a missing observation)

So my code is as below, can you please have a look and tell me if the code is acceptable

 

data want;
   set have;
   if (s21 ge 0 or s21 =".") and s22 ge 0;
run;

Many thanks!

 

 
Thank you for your help, have a fabulous and productive day! I am a novice today, but someday when I accumulate enough knowledge, I can help others in my capacity.
7 REPLIES 7
PeterClemmensen
Tourmaline | Level 20

If s21 is numeric, then drop the quotations around the dot.

 

Otherwise, all good 🙂

Phil_NZ
Barite | Level 11

Hi @PeterClemmensen !

 

Thank you for your answer, because I am not sure if "." is numeric or not because it is set up as the code below

data want;
      set have;
	  if input(s21, ?? 32.) = . then s21=.;
      else s21=input(s&i., 32.);
   run;

The code 

if input(s21, ?? 32.) = . then s21=.;

set all character values to missing  ".", I am not sure in that case, "." is numeric or character value?

 

Thank you!

 
Thank you for your help, have a fabulous and productive day! I am a novice today, but someday when I accumulate enough knowledge, I can help others in my capacity.
PeterClemmensen
Tourmaline | Level 20

In the latest code you post, I assume that s21 is character because the Input Function takes character input. 

Phil_NZ
Barite | Level 11

Hi @PeterClemmensen 

 

I think the code

if input(s21, ?? 32.) = .

is to check if this observation of s21 is character

but 

then s21=.

is the one that assign the value for such character value.

So from my understanding, s21 at such character value are character or numeric depending on 

then s21=.

Please correct me if I explain improperly!

 

Thanks in advance.

 

 
Thank you for your help, have a fabulous and productive day! I am a novice today, but someday when I accumulate enough knowledge, I can help others in my capacity.
PeterClemmensen
Tourmaline | Level 20

@Phil_NZ, you may be correct. Though, the easy way to find out is to run

 

proc contents data=have;
run;

, check the output and find out the type of s21. Then create your code accordingly 🙂

Phil_NZ
Barite | Level 11

Hi @PeterClemmensen 

 

I know a useful code proc contents from you, thank you very much, now the format of s21 is at below

My97_0-1611650472654.png

So, it is quite clear now, so I have a novice question that what is the type of a variable if some observations are numeric but some are characters?

 

Thanks

 
Thank you for your help, have a fabulous and productive day! I am a novice today, but someday when I accumulate enough knowledge, I can help others in my capacity.
ballardw
Super User

Best is to check variable properties as in Proc Contents. If the variable is "num" it is numeric.

 

BTW, the actual format is important. Formats can display what appear to be characters for numeric values.

Consider this code:

 

data example;
   x=1;
   format x  words.;
run;

If you open the data set in a viewer or display value with a procedure like Proc Print you will see a value of "one" displayed. The actual numeric value is still 1 but it is displayed as word because of the format.

 

 

For a comparison or filter you should use: if x=1 then <do something>. Not: if x='one' then <do something>;

The second will generate a "character to numeric conversion" note in the log as SAS tries to turn 'one' into a numeric value to compare to the numeric X, generate an Invalid data when it fails to make that conversion and find that the conversion does not match the numeric value of 1 so is false as seen here:

9    data want;
10      set example;
11      if x='one' then put 'Found it';
12   run;

NOTE: Character values have been converted to numeric values at the places given by:
      (Line):(Column).
      11:9
NOTE: Invalid numeric data, 'one' , at line 11 column 9.
x=one _ERROR_=1 _N_=1
NOTE: There were 1 observations read from the data set WORK.EXAMPLE.
NOTE: The data set WORK.WANT has 1 observations and 1 variables.
NOTE: DATA statement used (Total process time):
      real time           0.01 seconds
      cpu time            0.00 seconds

There are also "special missing" values that can be assigned. The values are not used in calculations because the underlying value is missing numerical observation. The values that can be used are .A to .Z and ._  . They can be used to indicate that you know why the value is missing so could branch code based on that value other than just "missing". Also the specific test for =.   does not return true for these:

data example2;
   x=1;
   y= .N;
   z1 = x+y;
   z2 = sum(x,y);
run;

data test;
   set example2;
   if y= .N then put 'Y has special missing N';
   if Y= . then put 'Y is missing';
run;

In the above code the special missing .N is assigned to Y. The X+Y yields a missing value per normal rules.

The sum function ignores the missing value(s) and returns the sum of the non-missing ones, in this case just X.

The test in the second data step will print one message in the log, not both.

 

 

 

 

 

SAS Innovate 2025: Call for Content

Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!

Submit your idea!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 7 replies
  • 1252 views
  • 3 likes
  • 3 in conversation