Hello team,
I have:
data mydata; set mydatafromyesterday; if id in ("a", "b", "c") and payer = "d" then item="iphone" if id in ("a", "b","c","r") and payer in ( "d", "f") then item="washing machine" run;
Does SAS have database engine? When it processes the first "if", if the record satisfies the first if, does sas ingnore second if?
Or
Does SAS process the second if as well?
I appreciate your response.
BittenApple
Neither of your IF statements will execute without a semi-colon on the end.
Fixing the semi-colon issue:
Both IF statements are executed, regardless of the result of the first IF statement.
If you want the IF second statement to execute only if the first IF statement is false, then use
if id in ("a", "b", "c") and payer = "d" then item="iphone";
else if id in ("a", "b","c","r") and payer in ( "d", "f") then item="washing machine";
But the logic seems flawed. It seems to be implying that id='a' and payer='d' could be item="washing machine", but that will never happen because id='a' and payer='d' will always be assigned to item='iphone'.
So in my mind, the problem is that your logic is flawed, and we can't advise further about the right way to program this until your logic is corrected.
Really, spend some time and figure out the proper logic in words before you try to write SAS code.
This still doesn't make sense to me, especially given the code you wrote where iphone could be program a b c or d
In reality the iphone should be program A and washing machine is Program B. Program A is a subset of Program B
Hello,
Are you saying if we have a bunch of if and we have consecutive if statements then the first record that satisfies the criteria will be back in the result set. That is what I want to know. Which record will be in the result set?
I need to mention Program A is a subset of Program B, so they have common letters.
Regards,
Blueinthesky
I hate to repeat myself, but it is obviously necessary:
None of your IF statements are subsetting IFs, so all observations will make it to the output dataset.
Post the complete log from your DATA step.
Hello team,
I changed this code:
if id in ("a", "b", "c") and payer = "d" then item="iphone"; else if id in ("a", "b","c","r") and payer in ( "d", "f") then item="washing machine"; to this: if id in ("a", "b", "c") and payer = "d" then HealthPlan ="HMO"; else if id in ("a", "b","c","r") and payer = "f" then Healthplan="covered california";
Is the logic correct now?
Please advise me.
Thanks,
BlueBlue
@GN0001 wrote:
Hello team,
I changed this code:
if id in ("a", "b", "c") and payer = "d" then item="iphone"; else if id in ("a", "b","c","r") and payer in ( "d", "f") then item="washing machine"; to this: if id in ("a", "b", "c") and payer = "d" then HealthPlan ="HMO"; else if id in ("a", "b","c","r") and payer = "f" then Healthplan="covered california";Is the logic correct now?
Please advise me.
Thanks,
BlueBlue
You will need to answer a couple of questions first about what is "correct".
Does the variable HealthPlan already exist in your data set? If so, what is the defined length of the variable?
If HealthPlan is not already in your data set what is the length of the longest value you ever expect to assign to the HealthPlan variable? This is important because if you do not explicitly set a length for a new variable then the first use will set the length. Without a length the above code would assign a length of 3 to HealthPlan if this is a new variable. Which means the second result would actually be "cov".
So if HealthPlan is a new variable you want to assign a length first:
Example setting the length so that "covered california" will fit. If you have a longer value then replace 18 with a number of characters long enough to hold that longer value. (Hint: I quit often make my lengths 5 or 10 characters longer because someone will come along with a new value that might fit within that guessed length so I don't have to deal with multiple length values which can cause some issues later).
Length HealthPlan $ 18 ; if id in ("a", "b", "c") and payer = "d" then HealthPlan ="HMO"; else if id in ("a", "b","c","r") and payer = "f" then Healthplan="covered california";
This has a much better chance of being correct as your previous code was attempting to assign multiple values for the same ID and Payer (Id='a' and payer='d' for example were in both of your If, which means when the first assigned item='iphone' you would not get "washing machine" as well).
Note that with character comparisons with the IN operator that the values must be identical to be true. So if you have an ID='A', upper case, it will not match the 'a' in the list of values the IN uses. If your actual values may have case differences that is easy to fix by using one of the LOWCASE or UPCASE functions on the ID to force the value of 'A' into to correct case to match the list. However if you have minor spelling differences then you might want to fix them to a standard spelling before doing this comparison.
Hello,
Please kindly view my responses below:
Does the variable HealthPlan already exist in your data set? If so, what is the defined length of the variable?
No, it doesn't exist:
data mydata;
format HealthPlan $65;
set fromthisdata;
run;
All ids are lower case, so we don't need to use any uppercase or lowercase functions.
This is my code:
data mydata; format HealthPlan $65; set fromthisdata; if id in ("a", "b", "c") and payer = "d" then HealthPlan ="HMO"; else if id in ("a", "b","c","r") and payer = "f" then Healthplan="covered california"; run;
Respectfully,
Blueblue
PROC FREQ is a great way of seeing if you are getting the desired outcome with your logic. Try this:
proc freq data = mydata;
table HealthPlan * payer * id / list missing;
run;
Nothing is processed, as this code is syntactically invalid and will throw ERRORs.
Multiple successive IF statements are just that, successive statements which are executed in succession, once they are written in a correct manner. Both of your IF statements are "normal", not "subsetting" IFs, they cannot break the order of execution of the data step (both statements have a THEN!).
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.