- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Hello,
I am comparing a subset of data and want to create an error message depending on certain criteria. I want to ignore missing values however, and currently SAS is reading the missing values in my if then statement. Is there a do-loop or array that can ignore these missing values?
For example,
I have a dataset containing:
DOB DTP1 DTP2 DTP3 DTP4
2/24/2016 2/24/2016 4/25/2016 8/24/2016 6/26/2017
1/13/2016 01/16/2016 4/15/2016 . .
and I want to create variable ERROR = "DTP_ERROR" when any DTP1-DTP4 are less than 6 weeks after DOB.
Currently, below is my coding. However, it seems like SAS is ignoring my first "IF/THEN/DO" statement and counting missing variables as less than my , creating false errors.
DATA CLEANING.DATECOMPARE;
SET CLEANING.ALLVAX;
DATE6WK = HS_DOB + 42;
FORMAT DATE6WK DATE9.;
RUN;
DATA CLEANING.DATE_DTP;
SET CLEANING.DATECOMPARE;
IF DTP1-DTP4 NE . THEN DO;
IF DTP1 < DATE6WK THEN ERROR = 'DTP_DOSE';
IF DTP2 < DATE6WK THEN ERROR = 'DTP_DOSE';
IF DTP3 < DATE6WK THEN ERROR = 'DTP_DOSE';
IF DTP4 < DATE6WK THEN ERROR = 'DTP_DOSE';
END;
IF ERROR NE " " THEN OUTPUT;
RUN;
Accepted Solutions
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
A numeric missing value is smaller than any non-missing number, so expand your condition:
IF . < DTP1 < DATE6WK THEN ERROR = 'DTP_DOSE';
PS you might contemplate switching to lower-case coding, it's easier to read.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
A numeric missing value is smaller than any non-missing number, so expand your condition:
IF . < DTP1 < DATE6WK THEN ERROR = 'DTP_DOSE';
PS you might contemplate switching to lower-case coding, it's easier to read.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Thank you, that worked!
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Sounds like you are confused.
First SAS always treats missing numeric values as less any actual number. Run this little program to see.
data test;
do x=-1,0,1,.,.a,.z,._ ;
output;
end;
run;
proc sort; by x; run;
data _null_;
set test;
put _n_= x= ;
run;
So fix your conditions to account for that fact.
IF .Z < DTP1 < DATE6WK THEN ERROR = 'DTP_DOSE';
Or using the MISSING() function might be clearer.
IF DTP1 < DATE6WK and not missing(DTP1) THEN ERROR = 'DTP_DOSE';
Also why are you testing whether the difference between DPT1 and DPT4 is missing? That will be true when either value is missing. Did you instead want to test if they are all missing?
N(of DPT1-DPT4)=0
Or any of them is missing?
NMISS(of DPT1-DPT4)>0
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
and I want to create variable ERROR = "DTP_ERROR" when any DTP1-DTP4 are less than 6 weeks after DOB.
You can simplify this logic to take the minimum value and see if the minimum is less than 6 weeks after the DOB. Unless you need to know which date is less, which is still relatively easy.
DATA CLEANING.DATECOMPARE;
SET CLEANING.ALLVAX;
MIN_DATE = min(of dtp1-dtp4);
if min_date < (HS_DOB+42) then error = 'DTP_DOSE';
RUN;
@TPayne wrote:
Hello,
I am comparing a subset of data and want to create an error message depending on certain criteria. I want to ignore missing values however, and currently SAS is reading the missing values in my if then statement. Is there a do-loop or array that can ignore these missing values?
For example,
I have a dataset containing:
DOB DTP1 DTP2 DTP3 DTP4
2/24/2016 2/24/2016 4/25/2016 8/24/2016 6/26/2017
1/13/2016 01/16/2016 4/15/2016 . .
and I want to create variable ERROR = "DTP_ERROR" when any DTP1-DTP4 are less than 6 weeks after DOB.
Currently, below is my coding. However, it seems like SAS is ignoring my first "IF/THEN/DO" statement and counting missing variables as less than my , creating false errors.
DATA CLEANING.DATECOMPARE;
SET CLEANING.ALLVAX;
DATE6WK = HS_DOB + 42;
FORMAT DATE6WK DATE9.;
RUN;
DATA CLEANING.DATE_DTP;
SET CLEANING.DATECOMPARE;
IF DTP1-DTP4 NE . THEN DO;
IF DTP1 < DATE6WK THEN ERROR = 'DTP_DOSE';IF DTP2 < DATE6WK THEN ERROR = 'DTP_DOSE';
IF DTP3 < DATE6WK THEN ERROR = 'DTP_DOSE';
IF DTP4 < DATE6WK THEN ERROR = 'DTP_DOSE';
END;
IF ERROR NE " " THEN OUTPUT;
RUN;