altadata1
Calcite | Level 5

Hello, 

I have a variable in my dataset. The categories for this variable are alphanumeric and in character format. For example, they are: A00 to A99, B60 to B89, and so on.

Now, I want to list some of them in my SAS syntax:

 

 

Data want; 

       set have; 

       If myvariable in A20 to A45 then myfavourite = 1;

       else myfavourite = 0;

run; 

Any help would be much appreciated. 

Thanks. 

1 ACCEPTED SOLUTION

FreelanceReinh
Jade | Level 19

Hello @altadata1,

 

A simple IF condition such as

'A20'<=myvariable<='A45'

would incorrectly include values like 'A2b' or 'A234' if such values occurred in myvariable.
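To see the pitfall described above, here is a minimal sketch. The test values and the variable name in_range are made up for illustration, and the length of myvariable is assumed to be 4 so that 'A234' fits.

```sas
data _null_;
   length myvariable $4;
   input myvariable $;
   /* Plain character comparison: strings are compared character by character */
   /* using the session's collating sequence, not as numbers                   */
   in_range = ('A20' <= myvariable <= 'A45');
   put myvariable= in_range=;
datalines;
A20
A2b
A234
A45
A46
;
```

On an ASCII system I would expect in_range=1 for A20, A2b, A234 and A45, and in_range=0 for A46. On a platform with a different collating sequence (e.g. EBCDIC) the result for A2b may differ.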

 

So you may want to use a very restrictive condition, e.g.

length(myvariable)=3 & myvariable=:'A' & 20<=input(substr(myvariable,2),?? 2.)<=45

where you can omit the length check if the defined length of variable myvariable is 3 anyway.

 

By using a Perl regular expression you can obtain the same result with shorter code:

data want;
set have;
myfavourite=prxmatch('/^(A[23]\d|A4[0-5])$/',trim(myvariable));
run;
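If you want to convince yourself that the two approaches agree, something like the following sketch could be used. The test values and the names by_condition, by_regex and agree are hypothetical, and the length of myvariable is assumed to be 4.

```sas
data have;
   length myvariable $4;
   input myvariable $;
datalines;
A19
A20
A2b
A234
A45
A46
B30
;

data compare;
   set have;
   /* Restrictive condition: exactly 3 characters, starts with A, digits in 20-45 */
   by_condition = (length(myvariable)=3 & myvariable=:'A'
                   & 20<=input(substr(myvariable,2),?? 2.)<=45);
   /* PRXMATCH returns the match position; because of the ^ anchor that is 1 or 0 */
   by_regex = prxmatch('/^(A[23]\d|A4[0-5])$/', trim(myvariable));
   /* 1 if both methods give the same answer for this row */
   agree = (by_condition = by_regex);
run;

proc print data=compare;
run;
```

For these test values only A20 and A45 should be flagged by either method, so agree should be 1 on every row.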


4 REPLIES 4
ballardw
Super User

First, the IN operator requires ( ) around the values. Second, the only "list" behavior it understands is for integer numeric values, where 5:15 means the integers from 5 to 15 inclusive.
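A small sketch of that numeric range behavior, with made-up test values and variable names (x, flag):

```sas
data _null_;
   /* Loop over a few test values on either side of the range */
   do x = 4, 5, 10, 15, 16;
      /* 5:15 in an IN list stands for the integers 5, 6, ..., 15 */
      flag = (x in (5:15));
      put x= flag=;
   end;
run;
```

The log should show flag=0 for 4 and 16 and flag=1 for 5, 10 and 15. There is no equivalent shorthand for character values such as 'A20':'A45'.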

 

Character values need to be in quotes so you need to provide something that looks like:

if myvariable in ( 'A20' 'A21' 'A22' ... 'A40') then <whatever>

For an example like yours a macro to build the list may be practical:

%macro stemseq (Stem,start, end);
   %let result=;
   %do i=&start %to &end;
      %let i = %sysfunc(putn(&i,z2.));
      %let result = &result "&Stem.&i";
   %end;
   &result
%mend;

data have;
   input var $;
datalines;
A10
A20
A34
B16
;

data example;
   set have;
   myfavorite = (var in (%stemseq(A,15,40) ) );
run;

A few important things about this approach:

The macro executes while the data step that uses it is being compiled, before any data is read. That means the macro cannot see variable values in your data set.

You can provide two sequences, but each requires a separate call to the macro, for example:

data example;
   set have;
   myfavorite = (var in (%stemseq(A,15,40) %stemseq(B,10,18) ) );
run;

You can also include other individual values in the IN list, as long as they are quoted and placed properly in the code next to the macro calls.
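The example that went with this point does not appear in the post, so the following is only a sketch of what mixing literal values with a macro call might look like. The extra values 'B16' and 'Z99' and the test data are made up, and the macro is repeated so the code runs on its own.

```sas
/* STEMSEQ repeated from earlier in this post so the sketch is self-contained */
%macro stemseq (Stem,start, end);
   %let result=;
   %do i=&start %to &end;
      %let i = %sysfunc(putn(&i,z2.));
      %let result = &result "&Stem.&i";
   %end;
   &result
%mend;

data have;
   input var $;
datalines;
A10
A20
B16
Z99
;

data example;
   set have;
   /* Quoted literals and macro-generated values can share one IN list */
   myfavorite = (var in ('B16' %stemseq(A,15,40) 'Z99'));
run;
```

With this test data, myfavorite should be 0 for A10 and 1 for A20, B16 and Z99.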

 

The comparison when you use IN is case sensitive. If the value is "A10", make sure to use an uppercase A in the macro call.

The last line of the macro body (&result) does NOT and should not end with a semicolon. If you code in SAS long enough this may feel wrong, but the macro's job is to emit the result as code text, and if you include a ; it will be placed in the generated code as well.

The MACRO code from the %macro through %mend; has to be executed once to compile the code. Then it can be used multiple times during that SAS session. When you restart SAS you need to recompile the macro to use it.

If you want to just test what the macro creates you can use code like:

%put %stemseq(A,15,40);

The % sign is what tells SAS that the following word is macro language code.

The macro also uses the %sysfunc function (with PUTN and the Z2. format) to force leading zeroes into the values. It is limited to 2 digits, per your data description.
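If the numeric part ever needs more than 2 digits, one possible variation is to make the width a parameter. This is only a sketch: the macro name stemseqw and the width= parameter are my own invention for illustration and are not part of the macro above.

```sas
/* Hypothetical variant of STEMSEQ with a WIDTH parameter for the zero padding */
%macro stemseqw(stem, start, end, width=2);
   /* Keep the working variables local to this macro */
   %local i result;
   %do i = &start %to &end;
      /* The Zw. format pads the counter with leading zeros to WIDTH digits */
      %let result = &result "&stem.%sysfunc(putn(&i,z&width..))";
   %end;
   &result
%mend stemseqw;

/* Write the generated list to the log to check it */
%put %stemseqw(C, 8, 11, width=3);
```

The %put line should write "C008" "C009" "C010" "C011" to the log. The double period in z&width.. is intentional: the first one ends the macro variable reference and the second is the period of the format name.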

 

 

altadata1
Calcite | Level 5

Thank you ballardw. Sorry, I should have explained it better. I know how to specify character values (in quotes) and I can write it like this:

if myvariable in ( 'A20' 'A21' 'A22' ... 'A40') then <whatever>

I want to avoid writing all the values because there are many of them, for example from X00 to X99.  

 

Thanks again. 

altadata1
Calcite | Level 5

It works well. Thank you so much. 
