Hi everyone, I have a dataset with over 15 million entries. I'm looking to compare the row below with the one above to look for all unique ID's that have a different value next to them. For example:
AB1234567, 145633431
AB1234567, 145633431
AB1234568, 145633431
AB1234568, 145633432
AB1234566, 145631513
AB1234565, 145635315
In this case, I am creating a counter or loop to look for all unique ID's that appear twice, but have a different value. In this case the only one is AB1234568. I want to ignore AB1234567 because it has the same value next to it both times.
I tried using a counter already and was able to find all values that came up twice but it was very inefficient and did not solve my problem:
Proc summary DATA=AUTO NWAY MISSING;
class NIV nopol;
VAR xyz;
OUTPUT OUT=DOUBLES(DROP=_TYPE_ _FREQ_) sum= ;
RUN;
proc sort data=DOUBLES;
by NIV NOPOL;
run;
DATA AUTO2;
SET DOUBLES;
IF XYZ=2;
RUN;
I have not used loops in SAS before but I am setting them up by using other people's code as an example. If anyone can help it's appreciated. Thanks
You can use either lag or retain functions too look at previous values:
data want; set have; if lag(id)=id and lag(value) ne value then new_flag=1; run;
Hi.
Assuming your data is grouped by ID but not sorted by group or by VALUE, here's one way using the lag function:
data want (rename=(_ID=ID));
set have end=_EOF;
drop ID VALUE _VALUE;
_ID=lag1(ID);
_VALUE=lag1(VALUE);
if _ID eq ID and _VALUE ne VALUE then COUNT+1;
if _EOF or _ID ne ID then do;
if COUNT gt 1 then output;
COUNT=1;
end;
run;
This will output the ID and the count of distinct values for VALUE if > 1
More on LAG function here: http://support.sas.com/documentation/cdl/en/lrdict/64316/HTML/default/viewer.htm#a000212547.htm
Hope it helps.
Daniel Santos @ www.cgd.pt
Unless I'm missing something, you're asking for something that SAS can do relatively easily:
data want;
set have;
by id value notsorted;
if first.value=0 or last.value=0;
run;
For the IDs you are looking for, it outputs multiple observations (not just one).
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.