BookmarkSubscribeRSS Feed
amanmurba
Calcite | Level 5

Hi everyone, I have a dataset with over 15 million entries. I'm looking to compare the row below with the one above to look for all unique ID's that have a different value next to them. For example:

 

AB1234567, 145633431
AB1234567, 145633431
AB1234568, 145633431
AB1234568, 145633432
AB1234566, 145631513
AB1234565, 145635315

 

In this case, I am creating a counter or loop to look for all unique ID's that appear twice, but have a different value. In this case the only one is AB1234568. I want to ignore AB1234567 because it has the same value next to it both times. 

 

I tried using a counter already and was able to find all values that came up twice but it was very inefficient and did not solve my problem:

Proc summary DATA=AUTO NWAY MISSING;

  class  NIV nopol;

  VAR   xyz;

  OUTPUT OUT=DOUBLES(DROP=_TYPE_ _FREQ_) sum= ;

RUN;

 

proc sort data=DOUBLES;

        by NIV NOPOL;

run;

DATA AUTO2;

SET DOUBLES;

IF XYZ=2;

RUN;

 

I have not used loops in SAS before but I am setting them up by using other people's code as an example. If anyone can help it's appreciated. Thanks

 

 

3 REPLIES 3
RW9
Diamond | Level 26 RW9
Diamond | Level 26

You can use either lag or retain functions too look at previous values:

data want;
  set have;
  if lag(id)=id and lag(value) ne value then new_flag=1;
run;
DanielSantos
Barite | Level 11

Hi.

 

Assuming your data is grouped by ID but not sorted by group or by VALUE, here's one way using the lag function:

 

data want (rename=(_ID=ID));
        set have end=_EOF;
        drop ID VALUE _VALUE;
        _ID=lag1(ID);
        _VALUE=lag1(VALUE);
        if _ID eq ID and _VALUE ne VALUE then COUNT+1;
        if _EOF or _ID ne ID then do;
           if COUNT gt 1 then output;
           COUNT=1;
        end;
run;

 

This will output the ID and the count of distinct values for VALUE if > 1

 

More on LAG function here: http://support.sas.com/documentation/cdl/en/lrdict/64316/HTML/default/viewer.htm#a000212547.htm


Hope it helps.

 

Daniel Santos @ www.cgd.pt

Astounding
PROC Star

Unless I'm missing something, you're asking for something that SAS can do relatively easily:

 

data want;

set have;

by id value notsorted;

if first.value=0 or last.value=0;

run;

 

For the IDs you are looking for, it outputs multiple observations (not just one).

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 3 replies
  • 4385 views
  • 0 likes
  • 4 in conversation