Please tell me what is wrong with this code;
Data Labtests.July1719_Test1a;
set Labtests.July1719_Tests;
if Specimen_1="Urethral" OR "Endocervical" AND TestType_1="GC nucleic acid amplification test (NAAT)" AND Result_1="GC Negative" then NAATG="Neg";
if Specimen_1="Urethral" OR "Endocervical" AND TestType_1="GC nucleic acid amplification test (NAAT)" AND Result_1="GC Positive" then NAATG="Pos";
if Specimen_1="Urethral" OR "Endocervical" AND TestType_1="Culture" AND Result_1="GC Negative" then CULG="Neg";
if Specimen_1="Urethral" OR "Endocervical" AND TestType_1="Culture" AND Result_1="GC Positive" then CULG="Pos";
if Specimen_1="Urethral" OR "Endocervical" AND TestType_1="Culture" AND Result_1="Not Applicable (e.g. Culture positive for N. Meningitides)" then NONGCG="Pos";
run;
the log says this.
1107 set Labtests.July1719_Tests;
1108 if Specimen_1="Urethral" OR "Endocervical" AND TestType_1="GC nucleic acid amplification
1108! test (NAAT)" AND Result_1="GC Negative" then NAATG="Neg";
1109 if Specimen_1="Urethral" OR "Endocervical" AND TestType_1="GC nucleic acid amplification
1109! test (NAAT)" AND Result_1="GC Positive" then NAATG="Pos";
1110 if Specimen_1="Urethral" OR "Endocervical" AND TestType_1="Culture" AND Result_1="GC
1110! Negative" then CULG="Neg";
1111 if Specimen_1="Urethral" OR "Endocervical" AND TestType_1="Culture" AND Result_1="GC
1111! Positive" then CULG="Pos";
1112 if Specimen_1="Urethral" OR "Endocervical" AND TestType_1="Culture" AND Result_1="Not
1112! Applicable (e.g. Culture positive for N. Meningitides)" then NONGCG="Pos";
1113 run;
NOTE: Character values have been converted to numeric values at the places given by:
(Line):(Column).
1108:29 1109:29 1110:29 1111:29 1112:29
NOTE: Invalid numeric data, 'Endocervical' , at line 1108 column 29.
NOTE: Invalid numeric data, 'Endocervical' , at line 1109 column 29.
IncidentID=3313025 CollDate_1=11/10/2017 Specimen_1=Pharyngeal
TestType_1=GC nucleic acid amplification test (NAAT) Result_1=GC Negative F= G= H= NAATG=
CULG= NONGCG= _ERROR_=1 _N_=1
NOTE: Invalid numeric data, 'Endocervical' , at line 1110 column 29.
NOTE: Invalid numeric data, 'Endocervical' , at line 1111 column 29.
NOTE: Invalid numeric data, 'Endocervical' , at line 1112 column 29.
IncidentID=3325056 CollDate_1=04/13/2018 Specimen_1=Pharyngeal TestType_1=Culture
Result_1=GC Negative F= G= H= NAATG= CULG= NONGCG= _ERROR_=1 _N_=2
NOTE: Invalid numeric data, 'Endocervical' , at line 1110 column 29.
NOTE: Invalid numeric data, 'Endocervical' , at line 1111 column 29.
NOTE: Invalid numeric data, 'Endocervical' , at line 1112 column 29.
IncidentID=3443578 CollDate_1=10/02/2018 Specimen_1=Rectal TestType_1=Culture Result_1=GC Negative
F= G= H= NAATG= CULG= NONGCG= _ERROR_=1 _N_=3
NOTE: Invalid numeric data, 'Endocervical' , at line 1108 column 29.
NOTE: Invalid numeric data, 'Endocervical' , at line 1109 column 29.
IncidentID=3325011 CollDate_1=04/23/2018 Specimen_1=Urine
TestType_1=GC nucleic acid amplification test (NAAT) Result_1=GC Negative F= G= H= NAATG=
CULG= NONGCG= _ERROR_=1 _N_=4
NOTE: Invalid numeric data, 'Endocervical' , at line 1108 column 29.
NOTE: Invalid numeric data, 'Endocervical' , at line 1109 column 29.
IncidentID=3312054 CollDate_1=10/27/2017 Specimen_1=Urine
TestType_1=GC nucleic acid amplification test (NAAT) Result_1=GC Negative F= G= H= NAATG=
CULG= NONGCG= _ERROR_=1 _N_=6
NOTE: Invalid numeric data, 'Endocervical' , at line 1110 column 29.
NOTE: Invalid numeric data, 'Endocervical' , at line 1111 column 29.
NOTE: Invalid numeric data, 'Endocervical' , at line 1112 column 29.
IncidentID=2719247 CollDate_1=05/17/2017 Specimen_1=Pharyngeal TestType_1=Culture
Result_1=Not Applicable (e.g. Culture positive for N. Meningitides) F= G= H= NAATG= CULG=
NONGCG= _ERROR_=1 _N_=8
NOTE: Invalid numeric data, 'Endocervical' , at line 1110 column 29.
NOTE: Invalid numeric data, 'Endocervical' , at line 1111 column 29.
NOTE: Invalid numeric data, 'Endocervical' , at line 1112 column 29.
IncidentID=2990718 CollDate_1=12/06/2017 Specimen_1=Pharyngeal TestType_1=Culture
Result_1=GC Negative F= G= H= NAATG= CULG= NONGCG= _ERROR_=1 _N_=9
NOTE: Invalid numeric data, 'Endocervical' , at line 1108 column 29.
NOTE: Invalid numeric data, 'Endocervical' , at line 1109 column 29.
IncidentID=2935129 CollDate_1=10/12/2017 Specimen_1=Pharyngeal
TestType_1=GC nucleic acid amplification test (NAAT) Result_1=GC Negative F= G= H= NAATG=
CULG= NONGCG= _ERROR_=1 _N_=13
NOTE: Invalid numeric data, 'Endocervical' , at line 1108 column 29.
NOTE: Invalid numeric data, 'Endocervical' , at line 1109 column 29.
IncidentID=3100853 CollDate_1=02/19/2018 Specimen_1=Urine
TestType_1=GC nucleic acid amplification test (NAAT) Result_1=GC Positive F= G= H= NAATG=
CULG= NONGCG= _ERROR_=1 _N_=14
NOTE: Invalid numeric data, 'Endocervical' , at line 1108 column 29.
NOTE: Invalid numeric data, 'Endocervical' , at line 1109 column 29.
IncidentID=3576407 CollDate_1=03/28/2019 Specimen_1=Urine
TestType_1=GC nucleic acid amplification test (NAAT) Result_1=GC Positive F= G= H= NAATG=
CULG= NONGCG= _ERROR_=1 _N_=15
NOTE: Invalid numeric data, 'Endocervical' , at line 1108 column 29.
NOTE: Invalid numeric data, 'Endocervical' , at line 1109 column 29.
IncidentID=3325029 CollDate_1=04/13/2018 Specimen_1=Urine
TestType_1=GC nucleic acid amplification test (NAAT) Result_1=GC Negative F= G= H= NAATG=
CULG= NONGCG= _ERROR_=1 _N_=17
NOTE: Invalid numeric data, 'Endocervical' , at line 1110 column 29.
NOTE: Invalid numeric data, 'Endocervical' , at line 1111 column 29.
NOTE: Invalid numeric data, 'Endocervical' , at line 1112 column 29.
IncidentID=3497934 CollDate_1=01/15/2019 Specimen_1=Endocervical TestType_1=Culture
Result_1=GC Positive F= G= H= NAATG= CULG= NONGCG= _ERROR_=1 _N_=18
NOTE: Invalid numeric data, 'Endocervical' , at line 1108 column 29.
NOTE: Invalid numeric data, 'Endocervical' , at line 1109 column 29.
IncidentID=2842778 CollDate_1=08/22/2017 Specimen_1=Urine
TestType_1=GC nucleic acid amplification test (NAAT) Result_1=GC Negative F= G= H= NAATG=
CULG= NONGCG= _ERROR_=1 _N_=19
NOTE: Invalid numeric data, 'Endocervical' , at line 1110 column 29.
NOTE: Invalid numeric data, 'Endocervical' , at line 1111 column 29.
NOTE: Invalid numeric data, 'Endocervical' , at line 1112 column 29.
IncidentID=3328919 CollDate_1=03/26/2018 Specimen_1=Pharyngeal TestType_1=Culture
Result_1=GC Negative F= G= H= NAATG= CULG= NONGCG= _ERROR_=1 _N_=20
NOTE: Invalid numeric data, 'Endocervical' , at line 1110 column 29.
NOTE: Invalid numeric data, 'Endocervical' , at line 1111 column 29.
NOTE: Invalid numeric data, 'Endocervical' , at line 1112 column 29.
IncidentID=3283626 CollDate_1=05/01/2017 Specimen_1=Pharyngeal TestType_1=Culture
Result_1=GC Negative F= G= H= NAATG= CULG= NONGCG= _ERROR_=1 _N_=21
NOTE: Invalid numeric data, 'Endocervical' , at line 1108 column 29.
NOTE: Invalid numeric data, 'Endocervical' , at line 1109 column 29.
IncidentID=3325010 CollDate_1=04/05/2018 Specimen_1=Urine
TestType_1=GC nucleic acid amplification test (NAAT) Result_1=GC Negative F= G= H= NAATG=
CULG= NONGCG= _ERROR_=1 _N_=23
NOTE: Invalid numeric data, 'Endocervical' , at line 1108 column 29.
NOTE: Invalid numeric data, 'Endocervical' , at line 1109 column 29.
IncidentID=3157807 CollDate_1=04/12/2018 Specimen_1=Urine
TestType_1=GC nucleic acid amplification test (NAAT) Result_1=GC Positive F= G= H= NAATG=
CULG= NONGCG= _ERROR_=1 _N_=24
NOTE: Invalid numeric data, 'Endocervical' , at line 1108 column 29.
NOTE: Invalid numeric data, 'Endocervical' , at line 1109 column 29.
IncidentID=3326691 CollDate_1=05/25/2018 Specimen_1=Urine
TestType_1=GC nucleic acid amplification test (NAAT) Result_1=GC Negative F= G= H= NAATG=
CULG= NONGCG= _ERROR_=1 _N_=25
NOTE: Invalid numeric data, 'Endocervical' , at line 1108 column 29.
NOTE: Invalid numeric data, 'Endocervical' , at line 1109 column 29.
IncidentID=3328907 CollDate_1=03/27/2018 Specimen_1=Urine
TestType_1=GC nucleic acid amplification test (NAAT) Result_1=GC Negative F= G= H= NAATG=
CULG= NONGCG= _ERROR_=1 _N_=26
NOTE: Invalid numeric data, 'Endocervical' , at line 1108 column 29.
NOTE: Invalid numeric data, 'Endocervical' , at line 1109 column 29.
WARNING: Limit set by ERRORS= option reached. Further errors of this type will not be printed.
IncidentID=3327668 CollDate_1=01/19/2018 Specimen_1=Urine
TestType_1=GC nucleic acid amplification test (NAAT) Result_1=GC Negative F= G= H= NAATG=
CULG= NONGCG= _ERROR_=1 _N_=28
NOTE: There were 5395 observations read from the data set LABTESTS.JULY1719_TESTS.
NOTE: The data set LABTESTS.JULY1719_TEST1A has 5395 observations and 11 variables.
NOTE: DATA statement used (Total process time):
real time 2.12 seconds
cpu time 0.14 seconds
1114 proc freq data=Labtests.July1719_Test1a;
1115 table NAATG CULG NONGCG;
1116 run;
NOTE: There were 5395 observations read from the data set LABTESTS.JULY1719_TEST1A.
NOTE: PROCEDURE FREQ used (Total process time):
real time 0.42 seconds
cpu time 0.01 seconds
proc freq data=Labtests.July1719_Test1a;
table NAATG CULG NONGCG;
run;
if Specimen_1 in("Urethral", "Endocervical)
not
if Specimen_1="Urethral" OR "Endocervical
if Specimen_1 in("Urethral", "Endocervical)
not
if Specimen_1="Urethral" OR "Endocervical
Also if you ever need to use AND and OR in the same statement, using parentheses is compulsory to avoid confusion.
This
if Specimen_1="Urethral" OR Specimen_1="Endocervical" AND TestType_1="Culture" AND Result_1="GC Negative" then CULG="Neg";
is not the same as
if (Specimen_1="Urethral" OR Specimen_1="Endocervical") AND TestType_1="Culture" AND Result_1="GC Negative" then CULG="Neg";
or
if Specimen_1="Urethral" OR (Specimen_1="Endocervical" AND TestType_1="Culture") AND Result_1="GC Negative" then CULG="Neg";
and so on.
Make your code easier to read and understand by grouping the conditions, and applying visual formatting:
data Labtests.July1719_Test1a;
set Labtests.July1719_Tests;
if Specimen_1="Urethral" OR "Endocervical"
then do;
if TestType_1="GC nucleic acid amplification test (NAAT)"
then do;
if Result_1="GC Negative" then NAATG="Neg";
if Result_1="GC Positive" then NAATG="Pos";
end;
if TestType_1="Culture"
then do;
if Result_1="GC Negative" then CULG="Neg";
if Result_1="GC Positive" then CULG="Pos";
if Result_1="Not Applicable (e.g. Culture positive for N. Meningitides)" then NONGCG="Pos";
end;
end;
run;
Apart from the obvious (OR connects logical statements, not strings) a good way to restructure your code could be with SELECT statements, e.g.:
data Labtests.July1719_Test1a;
set Labtests.July1719_Tests;
select(Specimen_1);
when("Urethral","Endocervical") select(TestType_1);
when("GC nucleic acid amplification test (NAAT)") select(Result_1);
when("GC Negative") NAATG="Neg";
when("GC Positive") NAATG="Pos";
otherwise;
end;
when("Culture") select(Result_1);
when("GC Negative") CULG="Neg";
when("GC Positive") CULG="Pos";
when("Not Applicable (e.g. Culture positive for N. Meningitides)") NONGCG="Pos";
otherwise;
end;
otherwise;
end;
otherwise;
end;
run;
One of the nice things about the SELECT statements (apart from the fact that you can write a list of possible values like shown) is that you implicitly get the ELSE statements that are missing from your code - you will get the right results with some of the other suggestions (e.g. the one from @Kurt_Bremser), but things will run faster with large data if you use ELSE and not just repeated IF statements.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.