BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Yogesh2601
Calcite | Level 5

Hi,

This is my first post/question on this site and I am a beginner.

I want to have a flag variable corresponding to a data set such that it changes value as the value of a certain variable changes.

For ex: if variable Age has the values a, b, c, d then variable flag takes the value 1,2,3,4 respectively for each occurence of values a,b,c,d

Please help.

 

Regards,

Yogesh Singh

1 ACCEPTED SOLUTION

Accepted Solutions
ballardw
Super User

There are many ways to conditionally assign values to another variable. This will usually be done in a datastep.

The basic approach is IF THEN ELSE.

 

Data want;

    set have;

    if Age='a' then flag=1;

    else if Age='b' then flag=2;

    else if Age='c' then flag=3;

    else if Age='d' then flag=4;

run;

However this construct can get lengthy if you have many values. If you have a range of values such as a, b, c, d should have a flag of 1 and e, f, g, h should have a flag of 2 you can use an IN operator with a list:

if Age in ('a' 'b' 'c' 'd') then flag=1;

else if Age in ('e' 'f' 'g' 'h') then flag=2;

If your comparison is a range of numeric values such as 1 to 10

if Age in (1:10) then flag=1;

 

Longer or more complex approaches might be a Select statement.

Custom formats are also often used especially if multiple variables may have a similar coding need. In some cases formats are perferred because by assigning a format to the existing variable you may not need to add another variable at all for many analysis tasks.

View solution in original post

10 REPLIES 10
Reeza
Super User

This is called recoding a variable. 

 

You can use IF/THEN statements

http://www.ats.ucla.edu/stat/sas/modules/vars.htm

 

If this will be done multiple times or is really long then you can consider using a format instead.

http://www2.sas.com/proceedings/sugi30/001-30.pdf

Yogesh2601
Calcite | Level 5
I did it with IF/THEN/ELSE but is there any easier way to do it?
Astounding
PROC Star

Here are some issues that are important to finding the right solution.

 

First, are other variables involved?  For example, are there multiple records for each patient and the pattern (1, 2, 3, 4) needs to start over again when there is a different patient?

 

Second, what about repeats?  For example, what should the output look like if the variable in question follows the pattern a, b, a, c so that the variable is changing but one of the values repeats an earlier value?

Yogesh2601
Calcite | Level 5
No other variables are involved. And for repeat cases the flage variable should have the same value.

For ex: Age Flag
11 1
12 2
13 3
14 4
11 1
15 5

I did it with IF/THEN/ELSE. Please suggest an alternative solution.
Reeza
Super User

If you only have 4 values your unlikely to find a solution that's actually less code. 

The only one I can think of is an array look up,,

 

 

Array age_fmt(*) _temporary_ (11:15);

 

Flag = whichn(age, of age_fmt(*));

 

 

Also, since you seem to have a pattern in your age/flag

 

Flag = age - 10;

 

ballardw
Super User

There are many ways to conditionally assign values to another variable. This will usually be done in a datastep.

The basic approach is IF THEN ELSE.

 

Data want;

    set have;

    if Age='a' then flag=1;

    else if Age='b' then flag=2;

    else if Age='c' then flag=3;

    else if Age='d' then flag=4;

run;

However this construct can get lengthy if you have many values. If you have a range of values such as a, b, c, d should have a flag of 1 and e, f, g, h should have a flag of 2 you can use an IN operator with a list:

if Age in ('a' 'b' 'c' 'd') then flag=1;

else if Age in ('e' 'f' 'g' 'h') then flag=2;

If your comparison is a range of numeric values such as 1 to 10

if Age in (1:10) then flag=1;

 

Longer or more complex approaches might be a Select statement.

Custom formats are also often used especially if multiple variables may have a similar coding need. In some cases formats are perferred because by assigning a format to the existing variable you may not need to add another variable at all for many analysis tasks.

Yogesh2601
Calcite | Level 5
I did it with IF/ELSE. Is there any easier way to do it?
Astounding
PROC Star

When you have just a few possible values, there's no easier way to do it. 

 

When you have large numbers of values, there are still many ways to do it.  But it's difficult to find a short, simple way.  Here's one possible approach:

 

proc sql;

select distinct age into : agelist separated by ' ' from have where age > .;

quit;

 

data want;

array ages {&sqlobs} _temporary_ (&agelist);

set have;

flag = whichn(age, of ages{*});

run;

 

Note that this will not assign a value to FLAG when age is missing.  And it requires AGE to be numeric.  (There are slightly more complex ways that this approach could be adapted to character variables.)

ballardw
Super User

You may want to define "easier". Things to consider in choosing between methods for a task revolve around how easy it may be to understand the code, how robust to modify and sometimes use of resources such as memory or cpu cycles.

 

IF-then-else usually satisfies the first for single variables, not so much if you are using multiple varaibles and multiple "ors" and "ands" in the comparisons. It may also be easy to modify when adding on or two more values.

 

A similar structure is the SELECT-when which evaluates an expression a the top and then branches to the specific value:

 

select (age);

   when ('a') flag=1;

   when ('b') flag=2;

   when ('c') flag=3;

   when ('d') flag=4;

   otherwise; /* this what you would do when none of the other values is encountered*/

end;

 

Select has a big advantage if the the number of values gets moderately large in that is usually easier to identify the condition and the result quicker than if/then/else. Also the part after the condition can be a do/end code block and execute multiple statements which gets much messier with if/then/else.

You can also provide lists of values in the when such as when ('a','f','g').

 

Yogesh2601
Calcite | Level 5

Thank you guys for your help.

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 10 replies
  • 1227 views
  • 1 like
  • 4 in conversation