05-08-2017 01:13 PM
I have a dataset where I need to select a record if a record has data within all three metrics. For example,
I only want to pull data that has a numerator for all three metrics (Insulin, Anti-Coagulants, and Opioids). Although Record_ID 2 has a numerator in all three metrics, I do not want to pull it because it has a '0' for the Insulin metric. Does anyone have a recommendation to selecting data this way? The end table that I want would be:
Any assistance would be greatly appreciated.
05-08-2017 01:20 PM
Will any of your Record_Id ever have more than 3 records? Fewer than 3 records? If so what is the selection rule(s)?
If they have exactly 3 records will they always be Insulin, Anti-coagulants and Opiods?
05-08-2017 01:22 PM
Yes, they'll have multiple records by month from 2014 to present. The criteria for selection is that they need to have records for these three metrics to be counted.
05-08-2017 01:30 PM - edited 05-08-2017 01:37 PM
If you never have more than one record for a given id/metric then sort by id and metric, then count metrics (with numberator>0) within each id:
Editted change: (Looking at Art's solution revealed a flaw in my suggest - here's the revised response);
proc sort data=have (where=(numerator>0)) out=need;
by id metric;
do n_metric=1 by 1 until (last.id);
do until (last.id);
if n_metric=3 then output;
05-08-2017 01:33 PM
In that case I'd use a DOW loop to test whether each metric met the condition, e.g.:
proc sort data=have out=want; by Record_ID Metric; run; data want (drop=m1-m3); do until (last.Record_ID); set want; by Record_ID; if Metric eq 'Insulin' and Numerator gt 0 then m1=1; else if Metric eq 'Anti-Coagulants' and Numerator gt 0 then m2=1; else if Metric eq 'Opioids' and Numerator gt 0 then m3=1; end; do until (last.Record_ID); set want; by Record_ID; if sum(of m1-m3) eq 3 then output; end; run;
Art, CEO, AnalystFinder.com
05-09-2017 09:29 AM
data have; infile cards expandtabs truncover; input Record_ID Metric : $40. Numerator Denominator; cards; 1 Insulin 1 1000 1 Anti-Coagulants 3 1000 2 Insulin 0 3000 3 Insulin 1 2000 2 Anti-Coagulants 2 1000 4 Anti-Coagulants 1 4000 1 Opioids 1 1000 2 Opioids 1 2000 3 Insulin 2 1000 ; run; proc sql; select * from have group by Record_ID having count(distinct Metric)=3 and sum(Numerator=0)=0; quit;