SAS Data Integration Studio, DataFlux Data Management Studio, SAS/ACCESS, SAS Data Loader for Hadoop and others

Make sure every observations has been assigned to a subset

Reply
Occasional Contributor
Posts: 7

Make sure every observations has been assigned to a subset

Hi,

I have a questions here:

 

You have just written code that assigns each patient in your data set to one of four mutually exclusive subsetsHow would you check to make sure that every patient has been assigned to a subset and none are in more than one?  Knowing that every patient should fall into one of these categories, how would you handle those that don't?

 

Anyone has know how to solve this? Even some idea are okay. 

Many thanks,

Chen

Regular Contributor
Posts: 161

Re: Make sure every observations has been assigned to a subset

One simple way is to look at the log to find out if the counts from the parent dataset match with the sum of the subset datasets, that is if the subsets themselves are datasets.

 

Another easy was is that if you are using a variable that identifies where the subset records are written to, you may do a frequency of that variable and tally them up with the subsets.

 

Third way I can think of is that if there is a key, you can write a merge to see where each record has ended up by merging with the key.

 

You could possibly write a proc compare but I would guess one of the above options would have already displayed the information you are looking for...

 

Good Luck...!!!

 

Kannan Deivasigamani
Respected Advisor
Posts: 4,649

Re: Make sure every observations has been assigned to a subset

To get a list of the potentially problematic cases, try

 

proc sql;
select patientID, Subset
from myData
group by patientID
having count(distinct Subset) ne 1;
quit;

 How to correct those cases will depend on the meaning of those subsets. You could do it by hand if the number is small or have to refine your assignment procedure if there are too many.

PG
Occasional Contributor
Posts: 7

Re: Make sure every observations has been assigned to a subset

Hi PG,
I can't understand your writing totally, what does select subset mean?
If I have a dataset named myData, and all patients id in myData has set in to one of four mutually exclusive subsets (A, B, C, D), how could I make sure every id has in one of four subsets, no duplicate and no obsence?
Could you help me more, please?

Many thanks,
Chen
Respected Advisor
Posts: 4,649

Re: Make sure every observations has been assigned to a subset

My code supposes that your dataset myData contains two variables. The variable patientID identifies every patient uniquely. The variable Subset would take values A, B, C, or D. The SQL query will print the list of records of patients and their subsets only for patients with more than one subset, or no subset at all (missing subset value).

Instead of printing the records, you may create a new dataset with the troublesome records with:

 

proc sql;
create table myTroubleData as
select *
from myData
group by patientID
having count(distinct subset) ne 1;
quit;
PG
Super User
Posts: 6,938

Re: Make sure every observations has been assigned to a subset

if (condition one) then output dataset1;
else if (condition two) then output dataset2;
else if (condition three) then output dataset3;
else if (condition four) then output dataset4;
else abort;
---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
Super User
Posts: 10,500

Re: Make sure every observations has been assigned to a subset

Not effecient by any means but relatively easy to understand:

 

proc freq data= have;

tables id * CodedVariable / norow nocol nopercent;

run;

 

Table with any ID and 2 or more values for the recode are problematic.

Then

proc print data=have;

where id in (<list the id values from above here>);

var id <and list of variables used in the recode> CodedVariable;

run;

That should give an idea of which rule(s) may not have been applied correctly.

Ask a Question
Discussion stats
  • 6 replies
  • 488 views
  • 0 likes
  • 5 in conversation