Help using Base SAS procedures

Finding invalid ID numbers

Reply
N/A
Posts: 0

Finding invalid ID numbers

I hope this is the correct place for this posting. My problem may be better solved within a DATA step.

I have a table with two different ID fields: ParentID and SubID. In this table, ParentID is not unique. SubID is not unique either, and any one ParentID may be associated with multiple SubIDs. However, no SubID should appear for more than one different ParentID. But there are some records scattered in my table where this occurs, and I need to find them.

So this table would be fine if it looked like this:
ParentID, SubID
123, 990
123, 991
124, 992
124, 992

But this table would NOT be ok, because SubID 990 occurs for two different ParentIDs:
ParentID, SubID
123, 990
123, 991
124, 992
124, 990

My table has thousands of records, so eyeballing the data is not an option. How can I find the records with a SubID that appears for more than one ParentID, with SAS code?

Thank you,
Brett
Super User
Posts: 9,682

Re: Finding invalid ID numbers

En.Yes .SAS can do it.
[pre]
data temp;
infile datalines delimiter=' ,';
input parentid $ subid $;
datalines;
123, 990
123, 991
124, 992
124, 990
run;
proc sql feedback;
select *
from temp
group by subid
having count(distinct parentid ) ge 2;
quit;

[/pre]


Ksharp
Valued Guide
Posts: 632

Re: Finding invalid ID numbers

Since you are looking for unique or distinct parents within subjects, there will be multiple solutions. Here is a DATA step solution, but it requires a SORT and therefore more passes of the data than does KSharp's solution.
[pre]data id;
infile datalines dlm=',';
input parentid subjid;
datalines;
123, 990
123, 991
124, 992
124, 990
run;

proc sort data=id nodupkey;
by subjid parentid;
run;

* parents should be unique within subjects;
data goodid
badid;
set id;
by subjid;
if not (first.subjid and last.subjid) then output badid;
else output goodid;
run;

title good id;
proc print data=goodid;
run;

title bad id;
proc print data=badid;
run;
[/pre]
Ask a Question
Discussion stats
  • 2 replies
  • 117 views
  • 0 likes
  • 3 in conversation