- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
It would be great if you can help simplify this very long if else statements. I'm only showing 3 statements here as a demonstration. Thanks in advance.
if M0 = "A" then M0 = 1;
else if M0 = "B" then M0 = 2;
else if M0 = "C" then M0 = 3;
else M0 = M0;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Use a data step SELECT statement:
select (M0);
when ("A") M0 = "1";
when ("B") M0 = "2";
when ("C") M0 = "3";
otherwise; /* M0 = M0, so no change */
end;
Or create a format:
proc format;
value $m0_convert
"A" = "1"
"B" = "2"
"C" = "3"
;
run;
which you apply later as
M0 = put(M0,$m0_convert.);
Note that
if M0 = "A" then M0 = 1;
will cause an automatic conversion of numeric to character, with a NOTE in the log. It should be
if M0 = "A" then M0 = "1";
If you need to create a numeric variable, use the INPUT function, and a new variable.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
thanks for your response. see my full code below. i want to do the "do loop" if possible.
data flagging;
set dataset;
keep M0 - M57;
if M0 = "A" then M0 = 1;
else if M0 = "B" then M0 = 2;
else if M0 = "C" then M0 = 3;
else M0 = M0;
run;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
data flagging;
set dataset;
array m {*} m0-m57;
do i = 1 to dim(m):
m{i} = put(m{i},$m0_convert.);
end;
drop i;
run;
You should also ask yourself about your dataset layout; transposing to long and omitting entries with invalid or missing values might make future coding easier.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Thanks Kurt. I'll take that code as a solution.
Below is what I'm trying to solve. And I have upto column M57. So I need the do loop.
From Table 1 to Table 2:
Table 1
ID | M0 | M1 | M2 | M3 | M4 | M5 |
1 | A | 0 | C | 0 | 0 | 0 |
1 | 0 | 0 | 0 | 0 | 0 | 0 |
1 | 0 | 0 | 0 | 0 | 0 | D |
2 | 0 | 0 | D | 0 | 0 | C |
2 | B | 0 | 0 | 0 | 0 | 0 |
3 | C | A | 0 | 0 | 0 | 0 |
3 | 0 | 0 | 0 | 0 | 0 | A |
4 | D | 0 | 0 | 0 | 0 | 0 |
4 | 0 | 0 | 0 | 0 | 0 | D |
4 | 0 | 0 | 0 | 0 | 0 | 0 |
Table 2
ID | M0 | M1 | M2 | M3 | M4 | M5 |
1 | A | 0 | C | 0 | 0 | D |
1 | A | 0 | C | 0 | 0 | D |
1 | A | 0 | C | 0 | 0 | D |
2 | B | 0 | D | 0 | 0 | C |
2 | B | 0 | D | 0 | 0 | C |
3 | C | A | 0 | 0 | 0 | A |
3 | C | A | 0 | 0 | 0 | A |
4 | D | 0 | 0 | 0 | 0 | D |
4 | D | 0 | 0 | 0 | 0 | D |
4 | D | 0 | 0 | 0 | 0 | D |
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
@saweheh wrote:
Thanks Kurt. I'll take that code as a solution.
Below is what I'm trying to solve. And I have upto column M57. So I need the do loop.
From Table 1 to Table 2:
Table 1
ID M0 M1 M2 M3 M4 M5 1 A 0 C 0 0 0 1 0 0 0 0 0 0 1 0 0 0 0 0 D 2 0 0 D 0 0 C 2 B 0 0 0 0 0 3 C A 0 0 0 0 3 0 0 0 0 0 A 4 D 0 0 0 0 0 4 0 0 0 0 0 D 4 0 0 0 0 0 0
Table 2
ID M0 M1 M2 M3 M4 M5 1 A 0 C 0 0 D 1 A 0 C 0 0 D 1 A 0 C 0 0 D 2 B 0 D 0 0 C 2 B 0 D 0 0 C 3 C A 0 0 0 A 3 C A 0 0 0 A 4 D 0 0 0 0 D 4 D 0 0 0 0 D 4 D 0 0 0 0 D
This bares absolutely no resemblance to the original data. You are CONDITIONALLY assigning values and you are not assigning 1, 2, 3 for A B C. If I were to guess I would say that the "rule" is to assign the same non-"0" value to all observations within an ID if any of the variable values are not 0. Much different than your code that assigns other values based on the current value.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
@saweheh wrote:
It would be great if you can help simplify this very long if else statements. I'm only showing 3 statements here as a demonstration. Thanks in advance.
if M0 = "A" then M0 = 1;
else if M0 = "B" then M0 = 2;
else if M0 = "C" then M0 = 3;else M0 = M0;
Are you attempting to create a numeric value for arithmetic? That won't happen because the variable is already character if your ="A" is correct.
BTW your else M0 = M0; is not needed. If the value isn't changed it remains the same.
If your "if/then/else" only involves one variable it might be worth investigating a custom format. That would not require changing your data at all, just associate the variable with the format.
Example:
Proc format; value $mo "A="1" "B"="2" "C"="3" ; run;
Formats can't end in a digit. This makes sense when you realize that any digit(s) at the end of a format control the number of displayed characters. That behavior is preserved for custom formats as well.
Any values not explicitly set to change in the format will display the current value.
You can use custom formats to create groups for analysis, graphing and reporting procedures.
You could use the format such as:
Proc print data=have; format M0 $mo.; run;
The format gains even more usability if have multiple variables that require the same display. ONE format can be applied to many variables, drastically reducing the number of if/then/else statements. A very nice feature of Formats is that you can have many and apply the one you need at a given time without having to back through a data step and create new variables or multiple reassignments. I have formats to turn age values into 5-year intervals, 10-year intervals, specific age-groups of interest to my clients such as 18-24, 25-40, 40+. So when I write a report I just use the desired formats to create the report groupings.
Another approach that won't gain much with the example is a SELECT block. Which would look like:
data want; set have; select (M0); when ("A") M0="1"; when ("B") M0="2"; when ("C") M0="3"; otherwise; end; run;
The value in the parentheses of the Select is compared to the values in the WHEN clauses and branches to that statement. Note the Otherwise which is what to do when none of the other When values are encountered. For simply coding like yours this is mostly just slightly easier reading. The advantages come when multiple values get mapped to 1 as you just provide a comma separated list: (Note: the format can do the same with multiple values to a single target).
when ("A","P","D","Q") M0="1";
A second advantage is that you can have a DO /End block as the target to execute multiple statements. If/Then/Else will handle that but the code can get ugly to read and hard to follow pretty quickly. Nicely indented Select is pretty easy to follow.