I have data of something like this:
ID EVENT SCORE
1 1 3
1 2 3
2 1 7
2 2 7
3 1 8
3 2 6
4 1 .
4 2 5
I care about scores not matching for an ID. That is IDs 3 and 4 would be of interest. Is there a way to view only those IDs keeping in mind that my actual dataset has many more IDs than this?
Thanks in advance.
The important part may be more in the realm of how many events?
Is your Score variable numeric or character? I have to ask because there are different approaches available for different types of data.
Do you just want the ID's of interest or are there other variables that need to be in the output?
It is always a good idea to show the desired output should look like.
I have around 8 events in total, but only 2 of the events contain the scores I am interested in. These are stored as numeric.
Ideally I want my dataset to contain the single score available and the matching scores. To decide what to do with the scores that are different, I want to be able to view those ID observations and decide from there if they are something to keep.
@amarikow57 wrote:
I have around 8 events in total, but only 2 of the events contain the scores I am interested in. These are stored as numeric.
Ideally I want my dataset to contain the single score available and the matching scores. To decide what to do with the scores that are different, I want to be able to view those ID observations and decide from there if they are something to keep.
Doesn't quite tell me how you expect the output, and brings up the need for an example with up to 8 events, such as how do we know which scores for which events are of interest.
I would be tempted to start with something like this to identify the mismatches:
proc summary data=have nway; class id; var score; output out=summary(drop=_: where=(score_range ne 0 or score_nmiss>0) ) min= max= range= nmiss= /autoname; ; run;
Range is the largest value - lowest, so if not equal to 0 then there is a difference for a numeric variable.
However when you have missing the Range only uses non-missing values so wouldn't catch the ones with missing. So this also counts the number of Nmiss, missing , values.
data have;
input ID EVENT SCORE;
cards;
1 1 3
1 2 3
2 1 7
2 2 7
3 1 8
3 2 6
4 1 .
4 2 5
;
proc sql;
create table want as
select *
from have
group by id
having count(distinct score) > 1 or ( n(score) and nmiss(score))
;
quit;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
What’s the difference between SAS Enterprise Guide and SAS Studio? How are they similar? Just ask SAS’ Danny Modlin.
Find more tutorials on the SAS Users YouTube channel.