BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
newsasuser
Calcite | Level 5

I'm trying to select observations by using a range of values that contain decimals. It's important to use the exact numbers and not round.

This is what I tried:

 

data want; set have;

if variable in (800.0:801.9,803.0:804.9,850.0:854.1,950.1:950.3,995.55,959.01)  then output;run;

 

When I did proc print for an observation that had a value of 850.5 for my variable, it didn't show up.

I also tried using "-" instead of ":" and same thing. Short of writing out every single value, is there any way to make sure SAS will include every value in between each of those ranges?

 

Thanks,

Laura

1 ACCEPTED SOLUTION

Accepted Solutions
Tom
Super User Tom
Super User

The START:END shortcut syntax is only for integers.

Either type them out, or use a loop to generate a macro variable with the list.

Using a data step DO loop would be easier since you can list all of the groups in one DO statement.

data _null_;
  length list $5000 ;
  do x=800.0 to 801.9 by 0.1
      ,803.0 to 804.9 by 0.1
      ,850.0 to 854.1 by 0.1
      ,950.1 to 950.3 by 0.1
      ,995.55
      ,959.01
  ;
    list=catx(' ',list,put(x,6.2));
  end;
  call symputx('list',list);
run;

Then you can use the macro variable in your later code.

data want;
  set have;
  if variable in (&list);
run;

Note there is no need to use commas to separate the list of values given the IN() operator. Space delimiters are much easier to use with SAS code since you can pass them as macro parameter values without having to use macro quoting.

View solution in original post

10 REPLIES 10
mkeintz
PROC Star

Of course you didn't get an 850.5.  There is no literal 850.5 in your value list.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
OliviaWright
SAS Employee

Hi, @newsasuser!

 

The IN operator is meant to be used in character comparisons, as you specify lists of discrete values. This is why you cannot specify numeric ranges. The easiest way to accomplish your goal is to use the >, <, <=, >= operators. 

 

For example, 

data want;
set have;
if variable >= 800.0 and variable <= 800.9 then output; run;

and so on. Let me know if that helps!

 

 

Tom
Super User Tom
Super User

The START:END shortcut syntax is only for integers.

Either type them out, or use a loop to generate a macro variable with the list.

Using a data step DO loop would be easier since you can list all of the groups in one DO statement.

data _null_;
  length list $5000 ;
  do x=800.0 to 801.9 by 0.1
      ,803.0 to 804.9 by 0.1
      ,850.0 to 854.1 by 0.1
      ,950.1 to 950.3 by 0.1
      ,995.55
      ,959.01
  ;
    list=catx(' ',list,put(x,6.2));
  end;
  call symputx('list',list);
run;

Then you can use the macro variable in your later code.

data want;
  set have;
  if variable in (&list);
run;

Note there is no need to use commas to separate the list of values given the IN() operator. Space delimiters are much easier to use with SAS code since you can pass them as macro parameter values without having to use macro quoting.

newsasuser
Calcite | Level 5
This worked perfectly, thanks!
newsasuser
Calcite | Level 5

After thinking it worked initially, I've realized that it isn't capturing all the ranges. It's only capturing 850-853.19.

This is the code I used (modified slightly because include hundredth decimal place):

 

data _null_;
length list $5000 ;
do x=800.0 to 801.9 by 0.01
,803.0 to 804.9 by 0.01
,850.0 to 854.1 by 0.01
,950.1 to 950.3 by 0.01
,959.01
,995.55
;
list=catx(' ',list,put(x,6.3));
end;
call symput('list',list);
run;

 

Any thoughts?

Thanks in advance

Tom
Super User Tom
Super User

5 thousand characters is not long enough for such a large list.

Change the length of the character variable, the maximum length is 32767.

Also why are you use 6.3 format instead of 6.2?

ballardw
Super User

Did you notice the ERROR in the log?

 

From the documentation:

The IN Operator in Numeric Comparisons

You can use a shorthand notation to specify a range of sequential integers to search.
 
I am afraid you are going to have to write something like:
 
if   (800 le variable le 801.9) or (803 le variable le 804.9) or () or ().
 
OR another option is to create a custom format
proc format library=work;
value mykeep
800  - 801.9 ='Keep'
803.0 - 804.9 ='Keep'
850.0 - 854.1 ='Keep'
950.1 - 950.3 ='Keep'
995.55 ='Keep'
959.01='Keep'
other='Do not keep'
;
run;
and use:
If put(variable,Mykeep.)='Keep' then output;
 
The format approach does allow using the options that will keep the endpoints or not and could also allow use of High and LOW ranges values.
Astounding
PROC Star

If your variable never extends beyond 2 decimal places, you could fudge the result easily enough:

 

data want;

set have;

if 100 * variable in (80000:80190, 80300:80490, 85000:85410, 95010:95030, 99555, 95901)  then output;

run;

mkeintz
PROC Star

As per @Astounding,s nifty suggestion.   if the decimals do go beyond two digits, you can use the FLOOR and CEIL functions with his technique:

 

 

data have;

  do variable=801.00,801.1,801.15,801.155,801.90006;

    output;

  end;

run;

data want;

  set have;

  if floor(100 * variable) in

    (80000:80190, 80300:80490, 85000:85410,

 

     95010:95030, 99555      , 95901)

  and ceil(100 * variable) in

    (80000:80190, 80300:80490, 85000:85410,

     95010:95030, 99555      , 95901)

  then output;

run;

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
Yavuz
Quartz | Level 8
If your criteria is not long you can
use case statement with proc sql :

Case when variable=800 then "1"
when variable between 801.9 and 803 then "1"
.
.
Else "0"
End

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 10 replies
  • 4699 views
  • 3 likes
  • 7 in conversation