Solved: How can I flag multiple variable having same value for a particular subject?

nirsan · Posted 07-28-2020 07:53 PM

Astounding · Posted 07-28-2020 10:51 PM

Here's a more complete version. It assumes your variables are character. If they are numeric, you would need to switch from SORTC to SORTN.

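data want (keep=id);
  set have;
  array hlth {20};            /* hlth1-hlth20 */
  call sortc (of hlth{*});    /* sort the values within the row; blanks sort first */
  /* compare each value with its neighbor and stop at the first match */
  do _n_=1 to 19 until (flag=1);
    if not missing (hlth{_n_}) and hlth{_n_} = hlth{_n_+1} then flag=1;
  end;
  if flag=1;                  /* keep only IDs that have a repeated value */
run;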

This version keeps just the ID. You would have to go back to the original data to check why these IDs are flagged.
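
If going back to the original data is inconvenient, one possible variant is to sort a scratch copy of the values, so the original HLTH variables stay untouched and every ID is kept with a 0/1 flag. This is only a sketch: the scratch variables `_s1-_s20`, the counter `_i`, and the `$8` length are assumptions, not something from the original question.

```sas
data want;
  set have;
  array hlth {20};          /* hlth1-hlth20, assumed character */
  array _s {20} $ 8;        /* scratch copy; $8 is an assumed length */
  do _i = 1 to 20;
    _s{_i} = hlth{_i};
  end;
  call sortc (of _s{*});    /* sorts the copy only */
  flag = 0;
  do _i = 1 to 19 until (flag = 1);
    if not missing (_s{_i}) and _s{_i} = _s{_i+1} then flag = 1;
  end;
  drop _s1-_s20 _i;         /* keep id, the original hlth values, and flag */
run;
```

With this layout the flagged rows can be inspected directly, for example with `where flag=1;` in a later step.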


mkeintz · Posted 07-28-2020 08:10 PM

What do you want the output to look like? Will you test for all duplicates within an ID, or just a dummy indicating duplicates have been found? How do you want the test results presented?
Do you want tested code? If so, please provide a sample dataset in the form of a DATA step. Help us help you.
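
For illustration, a sample dataset posted as a DATA step might look something like the sketch below. The IDs and codes are invented, and the names `id` and `hlth1-hlth20` are assumed from the other replies in this thread.

```sas
/* Hypothetical sample data, invented for illustration only */
data have;
  infile datalines missover;        /* short lines leave the remaining hlth variables blank */
  input id $ (hlth1-hlth20) (:$8.);
  datalines;
A01 J45 E11 I10 J45
A02 E11 I10 K21
A03 M54 M54
;
run;
```

In this made-up data, A01 and A03 each contain a repeated code and A02 does not.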

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

nirsan · Posted 07-28-2020 08:22 PM

mkeintz · Posted 07-28-2020 08:40 PM

For each observation, you could successively compare the largest HLTH code to the 2nd largest, then the 2nd largest to the 3rd largest until you either exhaust all the non-missing values (so flag would be 0), or you find a duplicate (flag=1).

To do that you can use the LARGEST function and the LAG function, as in:

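data want (drop=_:);
  set have;
  flag=0;
  /* N and LARGEST work on numeric variables only */
  do _L=1 to n(of hlth:) while (flag=0);
    _x=largest(_L,of hlth:);   /* the _L-th largest non-missing value */
    _prev=lag(_x);             /* call LAG on every iteration so its queue stays in step */
    if _L>1 and _x=_prev then flag=1;
  end;
run;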

I'm not going to test this on your data -


Astounding · Posted 07-28-2020 09:05 PM

As long as you copy the data to another data set (because we're going to change data values), it should be easy enough.

Put the variables into an array, then use call sortc to change the order of the variables.

Then move through the array and compare whether two consecutive values are identical (being careful not to take two consecutive missing values).

Let me know if you need help with this.
