Please tell me what is wrong with this code;
Data Labtests.July1719_Test1a;
set Labtests.July1719_Tests;
if Specimen_1="Urethral" OR "Endocervical" AND TestType_1="GC nucleic acid amplification test (NAAT)" AND Result_1="GC Negative" then NAATG="Neg";
if Specimen_1="Urethral" OR "Endocervical" AND TestType_1="GC nucleic acid amplification test (NAAT)" AND Result_1="GC Positive" then NAATG="Pos";
if Specimen_1="Urethral" OR "Endocervical" AND TestType_1="Culture" AND Result_1="GC Negative" then CULG="Neg";
if Specimen_1="Urethral" OR "Endocervical" AND TestType_1="Culture" AND Result_1="GC Positive" then CULG="Pos";
if Specimen_1="Urethral" OR "Endocervical" AND TestType_1="Culture" AND Result_1="Not Applicable (e.g. Culture positive for N. Meningitides)" then NONGCG="Pos";
run;
the log says this.
1107 set Labtests.July1719_Tests;
1108 if Specimen_1="Urethral" OR "Endocervical" AND TestType_1="GC nucleic acid amplification
1108! test (NAAT)" AND Result_1="GC Negative" then NAATG="Neg";
1109 if Specimen_1="Urethral" OR "Endocervical" AND TestType_1="GC nucleic acid amplification
1109! test (NAAT)" AND Result_1="GC Positive" then NAATG="Pos";
1110 if Specimen_1="Urethral" OR "Endocervical" AND TestType_1="Culture" AND Result_1="GC
1110! Negative" then CULG="Neg";
1111 if Specimen_1="Urethral" OR "Endocervical" AND TestType_1="Culture" AND Result_1="GC
1111! Positive" then CULG="Pos";
1112 if Specimen_1="Urethral" OR "Endocervical" AND TestType_1="Culture" AND Result_1="Not
1112! Applicable (e.g. Culture positive for N. Meningitides)" then NONGCG="Pos";
1113 run;
NOTE: Character values have been converted to numeric values at the places given by:
(Line):(Column).
1108:29 1109:29 1110:29 1111:29 1112:29
NOTE: Invalid numeric data, 'Endocervical' , at line 1108 column 29.
NOTE: Invalid numeric data, 'Endocervical' , at line 1109 column 29.
IncidentID=3313025 CollDate_1=11/10/2017 Specimen_1=Pharyngeal
TestType_1=GC nucleic acid amplification test (NAAT) Result_1=GC Negative F= G= H= NAATG=
CULG= NONGCG= _ERROR_=1 _N_=1
NOTE: Invalid numeric data, 'Endocervical' , at line 1110 column 29.
NOTE: Invalid numeric data, 'Endocervical' , at line 1111 column 29.
NOTE: Invalid numeric data, 'Endocervical' , at line 1112 column 29.
IncidentID=3325056 CollDate_1=04/13/2018 Specimen_1=Pharyngeal TestType_1=Culture
Result_1=GC Negative F= G= H= NAATG= CULG= NONGCG= _ERROR_=1 _N_=2
NOTE: Invalid numeric data, 'Endocervical' , at line 1110 column 29.
NOTE: Invalid numeric data, 'Endocervical' , at line 1111 column 29.
NOTE: Invalid numeric data, 'Endocervical' , at line 1112 column 29.
IncidentID=3443578 CollDate_1=10/02/2018 Specimen_1=Rectal TestType_1=Culture Result_1=GC Negative
F= G= H= NAATG= CULG= NONGCG= _ERROR_=1 _N_=3
NOTE: Invalid numeric data, 'Endocervical' , at line 1108 column 29.
NOTE: Invalid numeric data, 'Endocervical' , at line 1109 column 29.
IncidentID=3325011 CollDate_1=04/23/2018 Specimen_1=Urine
TestType_1=GC nucleic acid amplification test (NAAT) Result_1=GC Negative F= G= H= NAATG=
CULG= NONGCG= _ERROR_=1 _N_=4
NOTE: Invalid numeric data, 'Endocervical' , at line 1108 column 29.
NOTE: Invalid numeric data, 'Endocervical' , at line 1109 column 29.
IncidentID=3312054 CollDate_1=10/27/2017 Specimen_1=Urine
TestType_1=GC nucleic acid amplification test (NAAT) Result_1=GC Negative F= G= H= NAATG=
CULG= NONGCG= _ERROR_=1 _N_=6
NOTE: Invalid numeric data, 'Endocervical' , at line 1110 column 29.
NOTE: Invalid numeric data, 'Endocervical' , at line 1111 column 29.
NOTE: Invalid numeric data, 'Endocervical' , at line 1112 column 29.
IncidentID=2719247 CollDate_1=05/17/2017 Specimen_1=Pharyngeal TestType_1=Culture
Result_1=Not Applicable (e.g. Culture positive for N. Meningitides) F= G= H= NAATG= CULG=
NONGCG= _ERROR_=1 _N_=8
NOTE: Invalid numeric data, 'Endocervical' , at line 1110 column 29.
NOTE: Invalid numeric data, 'Endocervical' , at line 1111 column 29.
NOTE: Invalid numeric data, 'Endocervical' , at line 1112 column 29.
IncidentID=2990718 CollDate_1=12/06/2017 Specimen_1=Pharyngeal TestType_1=Culture
Result_1=GC Negative F= G= H= NAATG= CULG= NONGCG= _ERROR_=1 _N_=9
NOTE: Invalid numeric data, 'Endocervical' , at line 1108 column 29.
NOTE: Invalid numeric data, 'Endocervical' , at line 1109 column 29.
IncidentID=2935129 CollDate_1=10/12/2017 Specimen_1=Pharyngeal
TestType_1=GC nucleic acid amplification test (NAAT) Result_1=GC Negative F= G= H= NAATG=
CULG= NONGCG= _ERROR_=1 _N_=13
NOTE: Invalid numeric data, 'Endocervical' , at line 1108 column 29.
NOTE: Invalid numeric data, 'Endocervical' , at line 1109 column 29.
IncidentID=3100853 CollDate_1=02/19/2018 Specimen_1=Urine
TestType_1=GC nucleic acid amplification test (NAAT) Result_1=GC Positive F= G= H= NAATG=
CULG= NONGCG= _ERROR_=1 _N_=14
NOTE: Invalid numeric data, 'Endocervical' , at line 1108 column 29.
NOTE: Invalid numeric data, 'Endocervical' , at line 1109 column 29.
IncidentID=3576407 CollDate_1=03/28/2019 Specimen_1=Urine
TestType_1=GC nucleic acid amplification test (NAAT) Result_1=GC Positive F= G= H= NAATG=
CULG= NONGCG= _ERROR_=1 _N_=15
NOTE: Invalid numeric data, 'Endocervical' , at line 1108 column 29.
NOTE: Invalid numeric data, 'Endocervical' , at line 1109 column 29.
IncidentID=3325029 CollDate_1=04/13/2018 Specimen_1=Urine
TestType_1=GC nucleic acid amplification test (NAAT) Result_1=GC Negative F= G= H= NAATG=
CULG= NONGCG= _ERROR_=1 _N_=17
NOTE: Invalid numeric data, 'Endocervical' , at line 1110 column 29.
NOTE: Invalid numeric data, 'Endocervical' , at line 1111 column 29.
NOTE: Invalid numeric data, 'Endocervical' , at line 1112 column 29.
IncidentID=3497934 CollDate_1=01/15/2019 Specimen_1=Endocervical TestType_1=Culture
Result_1=GC Positive F= G= H= NAATG= CULG= NONGCG= _ERROR_=1 _N_=18
NOTE: Invalid numeric data, 'Endocervical' , at line 1108 column 29.
NOTE: Invalid numeric data, 'Endocervical' , at line 1109 column 29.
IncidentID=2842778 CollDate_1=08/22/2017 Specimen_1=Urine
TestType_1=GC nucleic acid amplification test (NAAT) Result_1=GC Negative F= G= H= NAATG=
CULG= NONGCG= _ERROR_=1 _N_=19
NOTE: Invalid numeric data, 'Endocervical' , at line 1110 column 29.
NOTE: Invalid numeric data, 'Endocervical' , at line 1111 column 29.
NOTE: Invalid numeric data, 'Endocervical' , at line 1112 column 29.
IncidentID=3328919 CollDate_1=03/26/2018 Specimen_1=Pharyngeal TestType_1=Culture
Result_1=GC Negative F= G= H= NAATG= CULG= NONGCG= _ERROR_=1 _N_=20
NOTE: Invalid numeric data, 'Endocervical' , at line 1110 column 29.
NOTE: Invalid numeric data, 'Endocervical' , at line 1111 column 29.
NOTE: Invalid numeric data, 'Endocervical' , at line 1112 column 29.
IncidentID=3283626 CollDate_1=05/01/2017 Specimen_1=Pharyngeal TestType_1=Culture
Result_1=GC Negative F= G= H= NAATG= CULG= NONGCG= _ERROR_=1 _N_=21
NOTE: Invalid numeric data, 'Endocervical' , at line 1108 column 29.
NOTE: Invalid numeric data, 'Endocervical' , at line 1109 column 29.
IncidentID=3325010 CollDate_1=04/05/2018 Specimen_1=Urine
TestType_1=GC nucleic acid amplification test (NAAT) Result_1=GC Negative F= G= H= NAATG=
CULG= NONGCG= _ERROR_=1 _N_=23
NOTE: Invalid numeric data, 'Endocervical' , at line 1108 column 29.
NOTE: Invalid numeric data, 'Endocervical' , at line 1109 column 29.
IncidentID=3157807 CollDate_1=04/12/2018 Specimen_1=Urine
TestType_1=GC nucleic acid amplification test (NAAT) Result_1=GC Positive F= G= H= NAATG=
CULG= NONGCG= _ERROR_=1 _N_=24
NOTE: Invalid numeric data, 'Endocervical' , at line 1108 column 29.
NOTE: Invalid numeric data, 'Endocervical' , at line 1109 column 29.
IncidentID=3326691 CollDate_1=05/25/2018 Specimen_1=Urine
TestType_1=GC nucleic acid amplification test (NAAT) Result_1=GC Negative F= G= H= NAATG=
CULG= NONGCG= _ERROR_=1 _N_=25
NOTE: Invalid numeric data, 'Endocervical' , at line 1108 column 29.
NOTE: Invalid numeric data, 'Endocervical' , at line 1109 column 29.
IncidentID=3328907 CollDate_1=03/27/2018 Specimen_1=Urine
TestType_1=GC nucleic acid amplification test (NAAT) Result_1=GC Negative F= G= H= NAATG=
CULG= NONGCG= _ERROR_=1 _N_=26
NOTE: Invalid numeric data, 'Endocervical' , at line 1108 column 29.
NOTE: Invalid numeric data, 'Endocervical' , at line 1109 column 29.
WARNING: Limit set by ERRORS= option reached. Further errors of this type will not be printed.
IncidentID=3327668 CollDate_1=01/19/2018 Specimen_1=Urine
TestType_1=GC nucleic acid amplification test (NAAT) Result_1=GC Negative F= G= H= NAATG=
CULG= NONGCG= _ERROR_=1 _N_=28
NOTE: There were 5395 observations read from the data set LABTESTS.JULY1719_TESTS.
NOTE: The data set LABTESTS.JULY1719_TEST1A has 5395 observations and 11 variables.
NOTE: DATA statement used (Total process time):
real time 2.12 seconds
cpu time 0.14 seconds
1114 proc freq data=Labtests.July1719_Test1a;
1115 table NAATG CULG NONGCG;
1116 run;
NOTE: There were 5395 observations read from the data set LABTESTS.JULY1719_TEST1A.
NOTE: PROCEDURE FREQ used (Total process time):
real time 0.42 seconds
cpu time 0.01 seconds
proc freq data=Labtests.July1719_Test1a;
table NAATG CULG NONGCG;
run;
if Specimen_1 in("Urethral", "Endocervical)
not
if Specimen_1="Urethral" OR "Endocervical
if Specimen_1 in("Urethral", "Endocervical)
not
if Specimen_1="Urethral" OR "Endocervical
Also if you ever need to use AND and OR in the same statement, using parentheses is compulsory to avoid confusion.
This
if Specimen_1="Urethral" OR Specimen_1="Endocervical" AND TestType_1="Culture" AND Result_1="GC Negative" then CULG="Neg";
is not the same as
if (Specimen_1="Urethral" OR Specimen_1="Endocervical") AND TestType_1="Culture" AND Result_1="GC Negative" then CULG="Neg";
or
if Specimen_1="Urethral" OR (Specimen_1="Endocervical" AND TestType_1="Culture") AND Result_1="GC Negative" then CULG="Neg";
and so on.
Make your code easier to read and understand by grouping the conditions, and applying visual formatting:
data Labtests.July1719_Test1a;
set Labtests.July1719_Tests;
if Specimen_1="Urethral" OR "Endocervical"
then do;
if TestType_1="GC nucleic acid amplification test (NAAT)"
then do;
if Result_1="GC Negative" then NAATG="Neg";
if Result_1="GC Positive" then NAATG="Pos";
end;
if TestType_1="Culture"
then do;
if Result_1="GC Negative" then CULG="Neg";
if Result_1="GC Positive" then CULG="Pos";
if Result_1="Not Applicable (e.g. Culture positive for N. Meningitides)" then NONGCG="Pos";
end;
end;
run;
Apart from the obvious (OR connects logical statements, not strings) a good way to restructure your code could be with SELECT statements, e.g.:
data Labtests.July1719_Test1a;
set Labtests.July1719_Tests;
select(Specimen_1);
when("Urethral","Endocervical") select(TestType_1);
when("GC nucleic acid amplification test (NAAT)") select(Result_1);
when("GC Negative") NAATG="Neg";
when("GC Positive") NAATG="Pos";
otherwise;
end;
when("Culture") select(Result_1);
when("GC Negative") CULG="Neg";
when("GC Positive") CULG="Pos";
when("Not Applicable (e.g. Culture positive for N. Meningitides)") NONGCG="Pos";
otherwise;
end;
otherwise;
end;
otherwise;
end;
run;
One of the nice things about the SELECT statements (apart from the fact that you can write a list of possible values like shown) is that you implicitly get the ELSE statements that are missing from your code - you will get the right results with some of the other suggestions (e.g. the one from @Kurt_Bremser), but things will run faster with large data if you use ELSE and not just repeated IF statements.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.