Dear,
I need help in my code in data step three.
First I sorted data and ran a data step to get last record for 'id test date'.
Then I need find if 'test' in lab1 and lab2 and at least one obs where value= 'neg' then want='neg';
else if 'test' in lab1 and lab2 and at least one obs has the vaiable 'value' other than neg or blank then want='pos';
if 'test' in lab1 and lab2 and both values of variable 'value' are blank then want='un';
Thank you
output needed;
id test value date want
a lab1 neg 2015-10-10 neg
a lab2 neg 2015-10-10 neg
b lab1 neg 2015-10-01 neg
b lab2 2015-10-01 neg
c lab1 pos 2015-10-01 pos
c lab2 2015-10-01 pos
d lab1 eqi 2015-10-01 pos
d lab2 2015-10-01 pos
e lab1 2015-10-01 un
e lab2 2015-10-01 un
input
data one;
input id $1 test $3-6 value $8-10 date$11-21;
datalines;
a lab1 neg 2015-10-01
a lab2 neg 2015-10-01
a lab1 neg 2015-10-10
a lab2 neg 2015-10-10
b lab1 neg 2015-10-01
b lab2 2015-10-01
c lab1 pos 2015-10-01
c lab2 2015-10-01
d lab1 eqi 2015-10-01
d lab2 2015-10-01
e lab1 2015-10-01
e lab2 2015-10-01
;
proc sort data=one;
by id test date;
run;
data two;
set one;
by id test date;
if last.test;
run;
proc sort data=two;
by id date test;
run;
data three;
set two;
if test and lag(test) in ('lab1' 'lab2') and value and lag(value) in (('neg' 'neg') or ('neg' '')) then want='neg';
else if test and lag(test) in ('lab1' 'lab2') and value and lag(value) not in (('neg' 'neg') or ('' '')) then want='pos';
else if test and lag(test) in ('lab1' 'lab2') and value and lag(value) not in (('' '')) then want='un';
run;
It is not recommended to use lag function in a conditional statement.
Try next code (the 3rd step only):
data three;
set two;
by id;
retain lag_test lag_value;
lag_test = lag(test);
lag_value = lag(value);
if test ne lag_test then do;
if (value='neg' or lag_value='neg') then want='neg'; else
if (value=' ' and lag_value=' ') then want='un'; else want='pos';
end;
drop lag_test lag_value;
run;
Not sure if I follow your rules, but the following achieves your desired result for your example:
data one;
informat date yymmdd10.;
input id $ test $ value $ date;
format date date9.;
datalines;
a lab1 neg 2015-10-01
a lab2 neg 2015-10-01
a lab1 neg 2015-10-10
a lab2 neg 2015-10-10
b lab1 neg 2015-10-01
b lab2 . 2015-10-01
c lab1 pos 2015-10-01
c lab2 . 2015-10-01
d lab1 eqi 2015-10-01
d lab2 . 2015-10-01
e lab1 . 2015-10-01
e lab2 . 2015-10-01
;
data two;
set one (where=(test in ('lab1','lab2')));
if value eq 'eqi' then wanta='pos';
else if missing(value) then wanta='un';
else wanta=value;
run;
proc sort data=two;
by id descending test wanta descending date;
run;
proc sort data=two nodupkey;
by id descending test wanta;
run;
data want;
do until (last.id);
set two;
by id;
if wanta eq 'pos' then pos=1;
else if wanta eq 'neg' then neg=1;
else if wanta eq 'un' then un=1;
end;
do until (last.id);
set one;
by id;
if neg then do;
if not pos and not un then do;
if first.id then output;
end;
else do;
wanta='neg';
output;
end;
end;
else if pos then do;
wanta='pos';
output;
end;
else if not pos and not neg then output;
end;
run;
Art, CEO, AnalystFinder.com
Every call of the lag() function retrieves a value from the FIFO chain and feeds a new value into it, so you must call lag() only once per data step iteration for a given variable.
And
if test and
means that test is converted to numeric and used as a boolean value, which is not what you intended IMO.
Maxim 2: Read the log! It will alert you to such unwanted effects.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.