Comparing data in first row with nth row.

Reply
New Contributor
Posts: 4

Comparing data in first row with nth row.

Here is example what I am trying to dd0

S- Satisfactory

US -Unsatisfactory

Data

Household#     Staff                       Status     Attempt Date

111                    Aj                         S               24/09/2014

111                    Aj                         US             22/09/2014

111                    Pj                         US              20/09/2014

111                    Tj                         US               18/09/2014

112                    Kj                         S                    24/09/2014

112                    Kj                         US                   22/09/2014

112                    Lj                           US               30/08/2014

112                    Lj                         US                   28/08/2014

112                    Lj                         US                    20/08/2014

I have mix of these data. I want to compare data from first row with nth row and separate those staff (Pj and Tj) had US for particular household. As I want to count their attempts separately (ignore Aj from data set completely). Any idea would be much appreciated.

Super Contributor
Posts: 490

Re: Comparing data in first row with nth row.

1 ) For the comparison could you give example of the desired output?

2 ) For separate those staff (Pj and Tj) had US :

You can use Where statement including the desired condition which is:

  1. staff in ('Pj','Tj')
  2. status ='US'

For example

data test;

input Household  $   Staff   $     Status  $  ;

datalines;

111                    Aj                         S          

111                    Aj                         US        

111                    Pj                         US         

111                    Tj                         US          

112                    Kj                         S              

112                    Kj                         US            

112                    Lj                         US        

112                    Lj                         US              

;

data newtest;

set test;

where staff in ('Pj','Tj') and status ='US';

run;

3) If you want to create two data set for those who belong to (Pj, Tj) and one for the others, you can use do that with explicit output statement as follow:

data test1 test2;

set test;

if staff in ('Pj','Tj') and status ='US' then output test1;

else output test2;

run;

New Contributor
Posts: 4

Re: Comparing data in first row with nth row.

I want to compare first.staff with other set and output those cases where first.staff doesn't equal second and so on.

Super Contributor
Posts: 490

Re: Comparing data in first row with nth row.

Check this, it will print all observations which does not totally match with the first observation.

data ok;

retain var1-var3;

set test;

if _n_=1 then do;

var1=household;

var2=Staff;

var3=status;

end;

else if not (household = var1 and staff = var2 and status = var3)then output;

run;

New Contributor
Posts: 4

Re: Comparing data in first row with nth row.

Looks promising.

I will try this. Thanks a lot.

Esteemed Advisor
Esteemed Advisor
Posts: 7,203

Re: Comparing data in first row with nth row.

Sorry, please clarify.  You have one or how many datasets?  If its just one dataset then what do you want out?  You talk about "other set" but do not explain what this means.  In relation to a check on row 1, then lag() with the previous row would avoid the whole retaining extra variables.

If lag(household) ne household and lag(staff) ne staff and lag(status) ne status then output;

New Contributor
Posts: 4

Re: Comparing data in first row with nth row.

I have one data set. And I would like to compare set of rows (households) to other rows. So I want to have output where staff name is different or status is US but leave those who had S status in the end. So data set I need to separate

Household #          Staff               Status          Attempt Date

111                         Pj                         US               20/09/2014

111                         Tj                         US                 18/09/2014

112                         Lj                         US                  30/08/2014

112                         Lj                         US                    28/08/2014

112                         Lj                         US                     20/08/2014

So my problem  is taking out rows of staff with S status. Is there any way that I can compare first row data with other row until staff >= first.staff.

Esteemed Advisor
Posts: 7,293

Re: Comparing data in first row with nth row.

If you always have one (and only one) Status for each household, then you might be able to use something like:

data want (drop=_Smiley Happy ;

  set have (where=(_status eq 'S') keep=status staff

    rename=(status=_status staff=_staff));

  do until (last.Household); 

    set have (where=(status ne 'S'));

    by Household; 

    if staff gt _staff then output;

  end; 

run;

Esteemed Advisor
Esteemed Advisor
Posts: 7,203

Re: Comparing data in first row with nth row.

Still not sure what you need to be "comparing".  It seems to me from your description that all you need to do is:

proc sort data=have;

     by household staff descending attempt_date;

     where status ne "S";

run;

data want;

     set have;

     by household staff;

     if first.staff then output;

run;

This will give you no records with S, and only one record with the last date.

Ask a Question
Discussion stats
  • 8 replies
  • 383 views
  • 1 like
  • 4 in conversation