BookmarkSubscribeRSS Feed
James_an
Calcite | Level 5

Hi, i am new to sas and handling the patients data right now, but having trouble excluding patients

 

if i can use some of your help, i would be so grateful!

 

here is the dataset i have

 

ID/converted_date(yymmdd10. format)/drug_name

1/2011-01-01/A

1/2011-01-01/B

1/2011-05-01/A

2/2012-03-07/A

2/2012-06-25/A

3/2013-01-12/A

3/2013-01-12/B

4/2014-01-12/B

4/2015-06-02/B

 

In this example i need to extract or get ID of patients #1, #3 and 

#2,#4 will be included in the table. 

 

the dataset i have is about 50GB, so it is really heavy!!, it would be great if the code is simple and easy.

 

thank you!

 

6 REPLIES 6
japelin
Rhodochrosite | Level 12

If it's the same drug on the same day, would you leave it?
I don't know if this two ways are correct since the results you want are not shown, but you can try.

 

 

/* one step */
proc sql;
  create table want1 as
    select a.* from have as a 
    left join have as b 
    on a.id=b.id and 
       a.converted_date=b.converted_date and
       a.drug_name^=b.drug_name
    where b.drug_name=''
;
quit;

/* two steps: sort and sql */
proc sort data=have  out=dup UNIQUEOUT=uniq NOUNIQUEKEY;
  by id converted_date;
run;
proc sql;
  create table want2 as
    select * from uniq 
  union all 
    select a.* from dup as a 
    left join dup as b 
    on a.id=b.id and 
       a.converted_date=b.converted_date and
       a.drug_name^=b.drug_name
    where b.drug_name=''
;
quit;

 

 

 

 

Ksharp
Super User
data have;
infile cards dlm='/' truncover;
input id date :yymmdd10. drug $;
format date yymmdd10.;
cards;
1/2011-01-01/A
1/2011-01-01/B
1/2011-05-01/A
2/2012-03-07/A
2/2012-06-25/A
3/2013-01-12/A
3/2013-01-12/B
4/2014-01-12/B
4/2015-06-02/B
;

proc sql;
create table want as
select * from have 
 group by id
  having count(distinct drug)=2;
quit;
tarheel13
Rhodochrosite | Level 12

Why are you including #4 if it is not on the same day? 

James_an
Calcite | Level 5

because In case of #4, other drug was not prescribed on the same day! it will be remained!

andreas_lds
Jade | Level 19

So, what exactly to you want? Is the data sorted by patient and date?

 

proc freq data=have noprint;
   table id*date / out=work.count(drop= percent where=(count > 1));
run;

proc sort data=work.count out=work.sorted nodupkey;
   by id;
run;

proc print;run;

data work.want;
   set work.have;
   
   if _n_ = 1 then do;
      declare hash h(dataset: 'work.sorted');
      h.defineKey('id');
      h.defineDone();
   end;
   
   if h.check() = 0 then delete; /* patient who used two drugs on same day are removed */   
run;
mkeintz
PROC Star

If

  1. The data are sorted by ID/DATE
  2. No drug is issued twice in the same day for a given ID.

then this program works;

 

data want;
  set have;
  by id date;
  if not (first.date=1 and last.date=1);
run;

The excludes all instances in which there is only one obs per date.  All the others must satisfy your condition.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

Catch up on SAS Innovate 2026

Nearly 200 sessions are now available on demand with the SAS Innovate Digital Pass.

Explore Now →
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 6 replies
  • 3481 views
  • 0 likes
  • 6 in conversation