Hi,
I think the code below doesn't work properly, because the last AND is not blue but grey in the editor. When I run the code, I don't get any errors, though. Is there anything wrong with the syntax?
data data2;
set data1;
where (( (year(datepart(m_date))= 2022) AND year (datepart(c_date)) >= 2022 )
OR ( (year(datepart(c_date))=2022) AND ( missing (year(datepart(m_date))))));
run;
@znhnm wrote:
When I run the code, I don't get any errors though. Is there anything wrong with the syntax?
Are there errors in the log? If so, show us the log (the ENTIRE log for this data step, not selected parts of it).
If there are no errors in the log, then there is nothing wrong with the syntax, so there must be something wrong with the logic, but I have no idea what logic you want. Can you explain the logic in words?
Besides the points by @PaigeMiller, I think you have a few parentheses you don't need. This is simpler and probably easier to debug.
data data2;
set data1;
where ( year(datepart(m_date)) = 2022 AND year(datepart(c_date)) >= 2022 )
OR ( year(datepart(c_date)) = 2022 AND missing(year(datepart(m_date))) )
;
run;
The editor (IDE) does the colorizing of your code, and it tries to do it well. But the IDE just has simple coloring rules. It doesn't have all the complexity of the SAS compiler. Sometimes it colors code wrong. But this doesn't affect how the code compiles and executes. So when code is colored wrong, it's helpful to take a second look at it to make sure it's correct, but you don't need to worry about the color.
AND and OR are not statements, they are logical operators.
DATA, SET, WHERE and RUN in your example are statements.
Your multiple unnecessary parentheses cause the limited intelligence of the Enhanced Editor to run out of steam. What's even worse, they make your code hard to understand, and must therefore be avoided.
where
year(datepart(m_date)) = 2022 and year(datepart(c_date)) >= 2022
or
year(datepart(m_date)) = . and year(datepart(c_date)) = 2022
;
is logically equivalent and much easier to read, as the logical structure is expressed by using separate lines for parts of the condition. Also note the consistent use of whitespace, and that the evaluation of m_date comes before that of c_date in both parts. Shuffling variables around between code parts can (and will) cause mistakes in the future when code is maintained.
@Kurt_Bremser I like the multi-line approach, but in order to understand this, you need to know the precedence of AND and OR:
where
year(datepart(m_date)) = 2022 and year(datepart(c_date)) >= 2022
or
year(datepart(m_date)) = . and year(datepart(c_date)) = 2022
;
I often use extra parentheses, just to make the order of operations explicit, e.g.:
where
(year(datepart(m_date)) = 2022 and year(datepart(c_date)) >= 2022)
or
(year(datepart(m_date)) = . and year(datepart(c_date)) = 2022)
;
or even:
where
(
year(datepart(m_date)) = 2022 and year(datepart(c_date)) >= 2022
)
or
(
year(datepart(m_date)) = . and year(datepart(c_date)) = 2022
)
;
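If you want to convince yourself of the precedence rule (AND is evaluated before OR), a small test step like the following could help. It is only a sketch: the data set name check and the variables a, b, c, d, without_parens, with_parens and same are made up for illustration and have nothing to do with data1.

```sas
/* Sketch: compare the condition with and without explicit parentheses */
/* for every combination of four 0/1 flags (all names are hypothetical). */
data check;
  do a = 0, 1;
    do b = 0, 1;
      do c = 0, 1;
        do d = 0, 1;
          without_parens = a and b or c and d;
          with_parens    = (a and b) or (c and d);
          /* same is 1 when both forms agree, 0 otherwise */
          same = (without_parens = with_parens);
          output;
        end;
      end;
    end;
  end;
run;

proc print data=check;
run;
```

Since AND has higher precedence than OR, same should be 1 on all 16 rows, which is why the extra parentheses are a readability choice rather than a change in logic.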
Please do not rely on syntax highlighting to determine valid code. It is helpful, and odd coloring may sometimes indicate the presence of characters off screen to the right. But the highlighter is sometimes wrong. FWIW, none of the "AND" operators appear in blue when pasted into my SAS session.
Not "wrong", but the only way Year(datepart(somevariable)) can be missing is if Somevariable is missing (or is not a valid datetime to begin with, resulting in a "year" that is out of range for the DATEPART function), so you could simplify to
missing(m_date)
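Put into the full step, that simplification might look like the sketch below. It assumes m_date is a numeric datetime variable whose non-missing values are all valid datetimes; if that does not hold for your data, the two versions may not select the same observations.

```sas
data data2;
  set data1;
  where
    (year(datepart(m_date)) = 2022 and year(datepart(c_date)) >= 2022)
    or
    /* assumption: m_date is either missing or a valid datetime */
    (missing(m_date) and year(datepart(c_date)) = 2022)
  ;
run;
```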
General advice: pick a solution you feel comfortable with. Don't agree with a solution that you don't understand. If you really want to simplify the code, you could use:
data data2;
set data1;
if year(datepart(m_date))=2022 AND year (datepart(c_date)) >= 2022 then output;
if year(datepart(c_date))=2022 AND missing(year(datepart(m_date))) then output;
run;
It's clumsy, since any programming statements you add after the OUTPUT statements won't affect the observations that have already been written. But if it makes the code readable in your eyes, that's an important feature.
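If that limitation matters, one possible alternative (not suggested elsewhere in this thread) is a subsetting IF, which keeps only the observations that meet the condition and lets later statements in the same step work on them. This is a sketch; it uses missing(m_date) on the assumption that m_date is either missing or a valid datetime.

```sas
data data2;
  set data1;
  /* Subsetting IF: observations failing the condition are dropped here */
  if (year(datepart(m_date)) = 2022 and year(datepart(c_date)) >= 2022)
     or (missing(m_date) and year(datepart(c_date)) = 2022);
  /* statements placed here run only for the observations that were kept */
run;
```

Unlike WHERE, the subsetting IF is applied after each observation has been read into the step, so on large tables the WHERE versions above may be the more efficient choice.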