Compare values within a variable

Occasional Contributor
Posts: 10

Hi, all. I need to compare date values within a variable. An example is the best way to describe this. Sample data is below. I need to be able to flag AABB as the same and BBCC as different (based on the collection date). Any suggestions?

Unique_ID Collection_ID Collection_Date
AABB 1122 3-Jun
AABB 1123 3-Jun
BBCC 1133 15-Jan
BBCC 1144 3-Jul

Grand Advisor
Posts: 17,428

Re: Compare values within a variable

What do you want the output to be?

You can use either FIRST./LAST. processing with BY groups or the LAG function.

Occasional Contributor
Posts: 10

Re: Compare values within a variable

Either a new variable or a new output data set that flags those that are the same.

Grand Advisor
Posts: 10,239

Re: Compare values within a variable

Are there ever more than 2 to compare? If so, what is the rule for "sameness": all the same, 2 of 3, more than half, or something else?

Grand Advisor
Posts: 17,428

Re: Compare values within a variable

This will create a unique group per identical unique_id, collection_id, collection_date. If you need more than this you'll need to expand on your question.

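data want;
    /* HAVE must already be sorted by the BY variables */
    set have;
    by unique_id collection_id collection_date;
    /* Start the counter at 0; an uninitialized RETAIN would stay missing */
    retain group 0;
    if first.collection_date then group=group+1;
run;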

Grand Advisor
Posts: 9,593

Re: Compare values within a variable

So you don't care about the variable Collection_ID? The only grouping variable is Unique_ID?

data have;
input Unique_ID     $ Collection_ID     Collection_Date     $;
cards;
AABB     1122     3-Jun
AABB     1123     3-Jun
BBCC     1133     15-Jan
BBCC     1144     3-Jul
;
run;
proc sql;
create table want as
 select *,case when(count(distinct Collection_Date)=1) then 'Same' else 'Diff' end as flag
  from have
   group by Unique_ID;
quit;

Xia Keshan

Occasional Contributor
Posts: 10

Re: Compare values within a variable

Thanks, all. The LAG function worked. Here is what I ended up doing:

proc sort data=have;
     by unique_id collection_id collection_date;
run;

data want;
     set have;
     x=lag1(Collection_Date);
     if x=Collection_Date then flag='yes';
     else flag='no';
run;

Grand Advisor
Posts: 17,428

Re: Compare values within a variable

Would that work if your data was as below?

Amanda Brunton wrote:

Unique_ID Collection_ID Collection_Date
AABB 1122 3-Jun
AABB 1123 3-Jun
BBCC 1133 3-Jun
BBCC 1144 3-Jul
Occasional Contributor
Posts: 10

Re: Compare values within a variable

I thought of that too. It will be rare in my data set, but regardless I am working on a statement that will subsequently compare the Unique_ID.
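
A minimal sketch of what that extra comparison could look like, assuming the data set is named have and is sorted by unique_id and collection_date. The names prev_id and prev_date are illustrative only and do not come from the thread:

```sas
proc sort data=have;
    by unique_id collection_date;
run;

data want;
    set have;
    length flag $3;
    /* Call LAG unconditionally on every row so each queue stays in step */
    prev_id   = lag(unique_id);
    prev_date = lag(collection_date);
    /* Count a match only when the previous row belongs to the same ID */
    if unique_id = prev_id and collection_date = prev_date then flag = 'yes';
    else flag = 'no';
    drop prev_id prev_date;
run;
```

With this check, a BBCC row dated 3-Jun that directly follows an AABB row dated 3-Jun is no longer flagged, because the IDs differ.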

Grand Advisor
Posts: 17,428

Re: Compare values within a variable

You'll probably get the correct result using that method, but using FIRST./LAST. is still easier:

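/* Sort the input data set that the DATA step reads */
proc sort data=have;
    by unique_id collection_date;
run;

data want;
    set have;
    by unique_id collection_date;
    /* Any row after the first in a unique_id/collection_date group repeats a date */
    if not first.collection_date then flag="yes";
    else flag="no";
run;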
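
If every row in a matching group should carry the flag, rather than only the second and later rows, one possible variation is to check FIRST. and LAST. together. This is a sketch, assuming have is already sorted by unique_id and collection_date:

```sas
data want;
    set have;
    by unique_id collection_date;
    length flag $3;
    /* A row that is both first and last in its BY group has no matching partner */
    if first.collection_date and last.collection_date then flag = 'no';
    else flag = 'yes';
run;
```

On the sample data from the original question, both AABB rows would get flag='yes' and both BBCC rows would get flag='no'.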
