DATA Step, Macro, Functions and more

how to create a new variable by comparing two obs

Reply
Super Contributor
Posts: 306

how to create a new variable by comparing two obs

Dear,

 

I need help in my code in data step three.

 

First I sorted data and ran a data step to get last record for 'id test date'.

 

Then I need find if 'test' in lab1 and lab2 and at least one obs where value= 'neg' then want='neg';

else if 'test' in lab1 and lab2 and at least one obs has the vaiable 'value'  other than neg or blank then want='pos';

if 'test' in lab1 and lab2 and both values of variable 'value' are blank then want='un';

 

Thank you

 

output needed;

id test value date           want
a lab1 neg 2015-10-10 neg a lab2 neg 2015-10-10 neg b lab1 neg 2015-10-01 neg b lab2 2015-10-01 neg c lab1 pos 2015-10-01 pos c lab2 2015-10-01 pos d lab1 eqi 2015-10-01 pos d lab2 2015-10-01 pos e lab1 2015-10-01 un e lab2 2015-10-01 un

 

input

data one;
input id $1 test $3-6 value $8-10 date$11-21;
datalines;
a lab1 neg 2015-10-01
a lab2 neg 2015-10-01
a lab1 neg 2015-10-10
a lab2 neg 2015-10-10
b lab1 neg 2015-10-01
b lab2     2015-10-01       
c lab1 pos 2015-10-01
c lab2     2015-10-01
d lab1 eqi 2015-10-01
d lab2     2015-10-01
e lab1     2015-10-01
e lab2     2015-10-01
;
proc sort data=one;
by id test date;
run;

data two;
set one;
by id test date;
if last.test;
run;
proc sort data=two;
by id date test;
run;

data three;
set two;
if test and lag(test) in ('lab1' 'lab2') and value and lag(value) in (('neg' 'neg') or ('neg' '')) then want='neg';
else if test and lag(test) in ('lab1' 'lab2') and value and lag(value) not in (('neg' 'neg') or ('' '')) then want='pos';
else if test and lag(test) in ('lab1' 'lab2') and value and lag(value) not in (('' '')) then want='un';
run;

 

Trusted Advisor
Posts: 1,822

Re: how to create a new variable by comparing two obs

[ Edited ]
Posted in reply to knveraraju91

It is not recommended to use lag function in a conditional statement.

Try next code (the 3rd step only):

data three;
 set two;
  by id;
     retain lag_test lag_value;
     lag_test = lag(test);
     lag_value = lag(value);

     if test ne lag_test then do; 
        if (value='neg' or lag_value='neg') then want='neg'; else
        if (value=' ' and lag_value=' ')  then want='un'; else want='pos';
     end;
     drop lag_test lag_value;
run;

 

PROC Star
Posts: 8,096

Re: how to create a new variable by comparing two obs

Posted in reply to knveraraju91

Not sure if I follow your rules, but the following achieves your desired result for your example:

data one;
  informat date yymmdd10.;
  input id $ test $ value $ date;
  format date date9.;
  datalines;
a lab1 neg 2015-10-01
a lab2 neg 2015-10-01
a lab1 neg 2015-10-10
a lab2 neg 2015-10-10
b lab1 neg 2015-10-01
b lab2 .   2015-10-01       
c lab1 pos 2015-10-01
c lab2 .   2015-10-01
d lab1 eqi 2015-10-01
d lab2 .   2015-10-01
e lab1 .   2015-10-01
e lab2 .   2015-10-01
;

data two;
  set one (where=(test in ('lab1','lab2')));
  if value eq 'eqi' then wanta='pos';
  else if missing(value) then wanta='un';
  else wanta=value;
run;

proc sort data=two;
  by id descending test wanta descending date;
run;

proc sort data=two nodupkey;
  by id descending test wanta;
run;

data want;
  do until (last.id);
    set two;
    by id;
    if wanta eq 'pos' then pos=1;
    else if wanta eq 'neg' then neg=1;
    else if wanta eq 'un' then un=1;
  end;
  do until (last.id);
    set one;
    by id;
    if neg then do;
      if not pos and not un then do;
        if first.id then output;
      end;
      else do;
        wanta='neg';
        output;
      end;
    end;
    else if pos then do;
      wanta='pos';
      output;
    end;
    else if not pos and not neg then output;
  end;
run;

Art, CEO, AnalystFinder.com

 

Super User
Posts: 9,550

Re: how to create a new variable by comparing two obs

Posted in reply to knveraraju91

Every call of the lag() function retrieves a value from the FIFO chain and feeds a new value into it, so you must call lag() only once per data step iteration for a given variable.

And

if test and

means that test is converted to numeric and used as a boolean value, which is not what you intended IMO.

Maxim 2: Read the log! It will alert you to such unwanted effects.

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
How to post code
Ask a Question
Discussion stats
  • 3 replies
  • 126 views
  • 3 likes
  • 4 in conversation