Hi,
I have multiple if then else statements
if educ = 0 then neweduc="< 3 yrs old";
else if educ=1 then neweduc="no school";
else if educ=2 then neweduc="nursery school";
else if educ in (15,16) then neweduc="Profess. degree";
else if educ in (15,17) then neweduc="Doctorate degree";
If suppose two or more conditions are true I want to display all of them by each Id how do I do that? In this way the with multiple if then else my values are overwritten if the first condition is true like 15 for 'Profess. degree' and also the second one 15 for Doctorate so I want to display both the degrees in the output. Thanks for the help.
The desired output would be like:
ID EDU NEWEDUC
1 1 no school
2 15 Profess. degree
2 15 Doctorate degree
Write them out as separate categories. Right now you have:
else if educ in (15,16) then neweduc="Profess. degree";
else if educ in (15,17) then neweduc="Doctorate degree";
Change that to:
else if educ = 15 then neweduc="Profess. degree, Doctorate";
else if educ = 16 then neweduc="Profess. degree";
else if educ = 17 then neweduc="Doctorate degree";
Be sure to assign NEWEDUC a long enough length before the IF/THEN statements.
@Aidaan_10 wrote:
Hi,
I have multiple if then else statements
if educ = 0 then neweduc="< 3 yrs old";
else if educ=1 then neweduc="no school";
else if educ=2 then neweduc="nursery school";
else if educ in (15,16) then neweduc="Profess. degree";
else if educ in (15,17) then neweduc="Doctorate degree";
If suppose two or more conditions are true I want to display all of them by each Id how do I do that? In this way the with multiple if then else my values are overwritten if the first condition is true like 15 for 'Profess. degree' and also the second one 15 for Doctorate so I want to display both the degrees in the output. Thanks for the help.
The desired output would be like:
ID EDU NEWEDUC
1 1 no school
2 15 Profess. degree
2 15 Doctorate degree
You should provide example of the input data.
It appears that you want may want to create a new record for an id when educ= 15 (Only?). Is that correct.
Also, what do you want when educ is something other thatn 0,1,2,15, 16 and 17?
Typically I might use a custom format to assign values as needed but ince you are doing something different with one specific value the format approach likely would not work.
I think a SELECT when block works for this.
data want; set have; length neweduc $ 20; select (educ); when (0) neweduc="<3 years old"; when (1) neweduc="no school"; when (2) neweduc="nursery school"; when (15) do; neweduc='Profess. degree';output; neweduc='Doctorate degree'; end; when (16) neweduc='Profess. degree'; when (17) neweduc='Doctorate degree'; otherwise;/* what ever you may want to do when none of the above values are encountered*/ end; /* this output is needed because of the explicit output in the when(15)for the first value*/ output; run;
SELECT evaluates the value in the () then branches to the When statement with that value. You can have multiple values in a when such as when( 4,5,6). Notice that "then" is not used to do the assignment. You can have a block of instructions inside a do/end block of code as for the when (15). A value may only occur in a single WHEN. The OTHERWISE is to execute code for any values not explicitly listed.
Note that you also need to state the length of the variable before use in this case. Your code would set the length of neweduc to 10 characters because the first use is in the neweduc="< 3 yrs old"; The length would be long enough to hold all of the values you use.
There are other ways to use select but this should work for what I think you are attempting.
Hi ballard,
I tried using select when also.. it did not work. I got the results using a lot of if then else statements and I am using 3 variables here but I have several variables in my data..so I was looking for an alternate effective method. Please have a look at what I have used. As I have used different combinations of variables to check the condition.
if ( (var1<70 and var1<40) and var1> .) and ( (var2<70 and var2<40) and var2> .) and ( (var3<70 and var3<40) and var3> .)
then
Total = ('Score1" || " (" || strip(var1) || ")" )||
( "Score2" || " (" || strip(var2) || ")" ||
( "Score3" || " (" || strip(var3) || ")"
;
/***here I concatenated the score's in one single variable and I got the result like total=Score1(45) Score2(20) Score3(38)
but I want to see all 3 score's if they exist for a ID in different rows as below
OBS ID TOTAL
1) Subject1 SCORE1(45)
2) Subject1 SCORE2(20)
3) Subject1 SCORE3(38) ***/
Else
if ( (var1<70 and var1<40) and var1> .) and ( (var2<70 and var2<40) and var2> .) then
total = ("Score1" || " (" || strip(var1) || ")")||
("Score2" || " (" || strip(var2) || ")" ;
Else
if ( (var1<70 and var1<40) and var1> .) and ( (var3<70 and var3<40) and var3> .)
then
Total = ('Score1" || " (" || strip(var1) || ")" )||
(' Score3" || " (" || strip(var3) || ")" )
;
Else
if ( (var2<70 and var2<40) and var2> .) and ( (var3<70 and var3<40) and var3> .)
then
Total = ( "Score2" || " (" || strip(var2) || ")") ||
(" Score3" || " (" || strip(var3) || ")" )
;
Else
if ( (var1<70 and var1<40) and var1> .) then
Total = " Score1" || " (" || strip(var1) || ")" ;
Else if ( (var2<70 and var2<40) and var2> .) then
Total = "Score2" || " (" || strip(var2) || ")" ;
Else if ( (var3<70 and var3<40) and var3> .) then
Total = " Score3" || " (" || strip(var3) || ")" ;
@Aidaan_10 wrote:
Hi ballard,
I tried using select when also.. it did not work. I got the results using a lot of if then else statements and I am using 3 variables here but I have several variables in my data..so I was looking for an alternate effective method. Please have a look at what I have used. As I have used different combinations of variables to check the condition.
Please show the code that used when you tried SELECT.
Doesn't work is awful vague.
Are there errors in the log?: Post the code and log in a code box opened with the {i} to maintain formatting of error messages.
No output? Post any log in a code box.
Unexpected output? Provide input data in the form of data step code pasted into a code box, the actual results and the expected results. Instructions here: https://communities.sas.com/t5/SAS-Communities-Library/How-to-create-a-data-step-version-of-your-dat... will show how to turn an existing SAS data set into data step code that can be pasted into a forum code box using the {i} icon or attached as text to show exactly what you have and that we can test code against.
if ( (var1<70 and var1<40) and var1> .) and ( (var2<70 and var2<40) and var2> .) and ( (var3<70 and var3<40) and var3> .)
then
Total = ('Score1" || " (" || strip(var1) || ")" )||
( "Score2" || " (" || strip(var2) || ")" ||
( "Score3" || " (" || strip(var3) || ")"
;/***here I concatenated the score's in one single variable and I got the result like total=Score1(45) Score2(20) Score3(38)
but I want to see all 3 score's if they exist for a ID in different rows as below
OBS ID TOTAL
1) Subject1 SCORE1(45)
2) Subject1 SCORE2(20)
3) Subject1 SCORE3(38) ***/
Else
if ( (var1<70 and var1<40) and var1> .) and ( (var2<70 and var2<40) and var2> .) then
total = ("Score1" || " (" || strip(var1) || ")")||
("Score2" || " (" || strip(var2) || ")" ;
Else
if ( (var1<70 and var1<40) and var1> .) and ( (var3<70 and var3<40) and var3> .)
then
Total = ('Score1" || " (" || strip(var1) || ")" )||
(' Score3" || " (" || strip(var3) || ")" )
;
Else
if ( (var2<70 and var2<40) and var2> .) and ( (var3<70 and var3<40) and var3> .)
then
Total = ( "Score2" || " (" || strip(var2) || ")") ||
(" Score3" || " (" || strip(var3) || ")" )
;
Else
if ( (var1<70 and var1<40) and var1> .) then
Total = " Score1" || " (" || strip(var1) || ")" ;
Else if ( (var2<70 and var2<40) and var2> .) thenTotal = "Score2" || " (" || strip(var2) || ")" ;
Else if ( (var3<70 and var3<40) and var3> .) then
Total = " Score3" || " (" || strip(var3) || ")" ;
A general comment on the above code:
if ( (var1<70 and var1<40) and var1> .)
The <70 part is not contributing to your results as anything <40 is also less than 70. In effect you are only selecting values less than 40. Was that the intent? If so (. < var1 <40) would be more appropriate.
You may want to look at the concatenation functions CATX, CATS and CATT instead of using the || operator.
Total = ('Score1" || " (" || strip(var1) || ")" )|| ( "Score2" || " (" || strip(var2) || ")" || ( "Score3" || " (" || strip(var3) || ")" ;
by using one of the concatenation functions. Also the first quote as a ' , single quote, followed by a bunch of " means that there are unbalanced quotes in that statement.
I think
Total= catx(' ',"Score1",cats('(',var1,')'),"Score2",cats('(',var2,')'),"Score3",cats('(',var3,')'));
which should also not generate messages in the log about conversion to character values.
Also the code you post actually bares little resemblance to the if/then/else you started with as you are now involving multiple different variables.
Also please describe what you mean by "I want to see all 3 score's if they exist for a ID in different rows as below". Do want a single value that will, when printed attempt to add lines like the (ever hateful to data users) Excel Alt-Enter between values in a cell?
That would require adding something between values and depends greatly on what procedures you are going to use to display the value.
If this were my code I would create 3 temporary total variables and then concatenate them together something like:
I don't really know what you meant using 70 above and none of the code you showed would have a result for values between 70 and 40 so don't really have much range for the values other than one greater than 70 that gets excluded as your code would have.
data example; input var1 var2 var3; if .<var1<40 then t1= catx(' ',"Score1",cats('(',var1,')')); if .<var2<40 then t2= catx(' ',"Score2",cats('(',var2,')')); if .<var3<40 then t3= catx(' ',"Score3",cats('(',var3,')')); total = catx(' ',t1,t2,t3); drop t1 t2 t3; datalines; 20 30 35 . 25 . 15 . 99 ; run;
Hi ballard,
Sorry, I guess without an example that code was not of much use. Please have a look at this. I provided a screenshot of Ex_2 dataset .
data example;
input id visit$ visit_num var1 var2 var3;
datalines;
1 Visit1 1 20 30 35
1 Visit2 2 75 35 45
2 Visit1 3 99 15 34
2 Visit2 4 55 38 48
;
run;
data ex_2;
set example;
by visit_num visit ;
if not(40<var1<70) then t1= catx(' ',"Score1",cats('(',var1,')'));
if not(40<var2<70) then t2= catx(' ',"Score2",cats('(',var2,')'));
if not(40<var3<70) then t3= catx(' ',"Score3",cats('(',var3,')'));
total = catx(' ',t1,t2,t3);
drop t1 t2 t3;
run;
So now my question is, as you can see in the screenshot the highlighted row met both the conditions for Score1 (75) Score2 (35) at Visit2 for ID 1 so how can I code so that I get the output as below. Something like different score's that meet the conditions into different observation for a subject at that visit.Thanks in advance for the help.
ID VISIT VISIT_NUM TOTAL
1 Visit2 2 Score1(75)
1 Visit2 2 Score2(35)
data example;
input id visit$ visit_num var1 var2 var3;
datalines;
1 Visit1 1 20 30 35
1 Visit2 2 75 35 45
2 Visit1 3 99 15 34
2 Visit2 4 55 38 48
;
run;
data ex_2;
set example;
array my_arr[3] var1-var3;
do i=1 to 3;
if not(40<my_arr[i]<70) then
do;
TOTAL=cats("Score",i,'(',my_arr[i],')');
output;
end;
end;
drop var1-var3 i;
run;
proc print;
Okay now that we have some clarity of output. Two possibly approaches depending on range of values:
Quite often moderately complex logic involving a single variable can be solved with a custom format.
Proc format library=work; picture score (default=11) 0 -< 40 = "09)" (prefix="Score (") 70 <- 100 = "009)" (prefix="Score (") other=" " ; run; data example; input id visit$ visit_num var1 var2 var3; /* string manipulation approach*/ length t1 t2 t3 $ 11; /* if (.<var1<40) or var1>70 then t1= catx(' ',"Score1",cats('(',var1,')'));*/ /* if (.<var2<40) or var2>70 then t2= catx(' ',"Score2",cats('(',var2,')'));*/ /* if (.<var3<40) or var3>70 then t3= catx(' ',"Score3",cats('(',var3,')'));*/ /* use of a format*/ t1=put(var1,score.); t2=put(var2,score.); t3=put(var3,score.); tt = catx('|',t1,t2,t3); drop t1 t2 t3 tt i; do i= 1 to countw(tt,'|'); total=scan(tt,i,'|'); output; end; datalines; 1 Visit1 1 20 30 35 1 Visit2 2 75 35 45 2 Visit1 3 99 15 34 2 Visit2 4 55 38 48 3 Visit1 5 . . . 3 Visit1 6 100 . . ; run;
I commented out an approach using direct sting manipulation to show the correct logic for your range selection.
The Proc format values list used by Picture, value and invalue statements can use a <- b, a-b, a-<b or a<-<b which correspond to intervals of (a,b] [a,b] [a,b) and (a,b) where [ or ] indicate the boundary value is included and ( or ) indicate the boundary value is excluded.
You did not indicate the maximum value a score might have. The format I defined only goes to 100. If you need larger values then increase 100 to that value. If you need more than 3 digits then you will need to add additional 0 before the 009 to match that length and you should increase the (default= ) value to match the increased characters. The default indicates the default length of a result if a length is not assigned to a target variable. The prefix tells the format to place that text in front of the resolved format value, the 09 and 009 are digit selectors, the 0 are optional and the 9(or any digit except 0) indicates a required value. If you have additional 9 in the format and the value won't fill the picture such as a value of 2 with a '99' digit selector you get a leading 0 displayed: 02. The ) after the 9 is the last character to display.
If you have many categories your code will be hard to maintain if you use the if-else approach.
* my solution;
data degree;
length degree $20.;
infile datalines dsd;
input degree $ educ $;
datalines;
"< 3 yrs old",0
"no school",1
"nursery school",2
"Profess. degree",15 16
"Doctorate degree",15 17
;
run;
data person;
input id educ;
datalines;
1 1
2 15
3 17
4 15
;
run;
data degree;
set degree(rename=(educ=educ_old));
drop nwords i educ_old;
nwords=countw(educ_old);
do i=1 to nwords;
educ=input(scan(educ_old, i), 2.);
output;
end;
run;
proc sql;
create table sol_1 as
select person.*, degree.degree
from person natural join degree;
quit;
proc print; title 'sol 1'; * (sol 1);
* if you don't want to use sql/join;
proc sort data=degree; by educ;
data degree;
length degree $40.;
retain degree;
drop degree_old;
set degree(rename=(degree=degree_old));
by educ;
if first.educ then degree=degree_old;
else degree=catx('; ', degree, degree_old);
if last.educ then output;
run;
proc sort data=person; by educ;
data sol_2;
merge person(in=in_person) degree;
by educ; if in_person;
run;
proc print; title 'sol 2'; * which could serve your purposes (sol2);
data sol_3;
merge person(in=in_person) degree;
by educ; if in_person;
retain degrees;
drop degrees nword i;
if first.educ then degrees=degree;
nword=countw(degrees, ';');
do i=1 to nword;
degree=scan(degrees, i, ';');
output;
end;
run;
proc print; title 'sol 3';
which produces this output:
Notice solution 3 might fail if you have many records per person; in that case, it would need additional tuning;
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.