BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
nickspencer
Obsidian | Level 7
Hi all,
I am trying to create an indicator on table1 based on information on table2.
Table1
Patient_id comp_id date
123 A111 1/2/2021
234 A124 1/3/2021
567 A234 1/4/2021

Table2
Patient_id Comp_id Date State
123 A111 1/5/2020 AZ
123 A111 1/4/2020 AZ
123 A111 2/5/2021 PA
234 A124 1/3/2021 NY
So, I want to create a column called indicator in table1 and gave indicator value of 1 if the same patient and company is present in table2 and date in table1 is less than any date for that condition in table2 and state is AZ. That means only first row in table1 will have indicator of 1. I tried few ways but got duplicate in some cases. What may be the best way to achieve this with proc sql?
1 ACCEPTED SOLUTION

Accepted Solutions
Ksharp
Super User
data one;
  input patient_id $ comp_id $ date :mmddyy.;
  format date yymmdd10.;
cards;
123 A111 1/2/2021
234 A124 1/3/2021
567 A234 1/4/2021

data two;
  input patient_id $ comp_id $ date :mmddyy. state $ ;
  format date yymmdd10.;
cards;
123 A111 1/5/2020 AZ
123 A111 1/4/2020 AZ
123 A111 2/5/2021 PA
234 A124 1/3/2021 NY
;

proc sql ;
create table want as
  select a.*,
   exists(select * from two where patient_id=a.patient_id and comp_id=a.comp_id and state='AZ' and date<a.date) as flag
  from one a 
;
quit;

View solution in original post

8 REPLIES 8
mkeintz
PROC Star

I think this is pretty simple in a data step. Make sure table1 and table2 are sorted by PATIENT_ID/COMP_ID.  Then:

data want;
  do until (last.comp_id);
    merge table1 table2 (rename=(date=date2));
    by patient_id comp_id;
    if state='AZ' and date2<date then indicator=1;
  end;
  drop date2 state;
run;

This assumes there is only one obs per PATIENT_ID/COMP_ID in table 1.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
nickspencer
Obsidian | Level 7
@mkeintz this is great and there is only one patiend_id/comp_id. But is there a way to achieve the same in proc sql?
ChrisNZ
Tourmaline | Level 20
Why use SQL when a data step implements the required logic so cleanly?
Tom
Super User Tom
Super User

What do you mean by duplicate?  The way in SQL to avoid duplicate is either DISTINCT keyword or GROUP BY.

data one;
  input patient_id $ comp_id $ date :mmddyy.;
  format date yymmdd10.;
cards;
123 A111 1/2/2021
234 A124 1/3/2021
567 A234 1/4/2021

data two;
  input patient_id $ comp_id $ date :mmddyy. state $ ;
  format date yymmdd10.;
cards;
123 A111 1/5/2020 AZ
123 A111 1/4/2020 AZ
123 A111 2/5/2021 PA
234 A124 1/3/2021 NY
;

proc sql ;
create table want as
  select a.patient_id, a.comp_id
       , max(b.state='AZ' ) as any_AZ
  from one a 
  left join two b
    on a.patient_id = b.patient_id
    and a.comp_id = b.comp_id 
    and a.date > b.date
  group by 1,2
;
quit;

proc print data=want;
run;
       patient_
Obs       id       comp_id    any_AZ

 1       123        A111         1
 2       234        A124         0
 3       567        A234         0

nickspencer
Obsidian | Level 7
@Tom this works great except that I have number of other columns in table1 to group by along with state column(may have different value). Is there a way to handle that?
Tom
Super User Tom
Super User

I have no idea what you mean.  Please show a clearer example with a least a couple of these other "columns" (did you mean input variables? output variables? levels of one or more of the input variables? something else?)

 

 

Ksharp
Super User
data one;
  input patient_id $ comp_id $ date :mmddyy.;
  format date yymmdd10.;
cards;
123 A111 1/2/2021
234 A124 1/3/2021
567 A234 1/4/2021

data two;
  input patient_id $ comp_id $ date :mmddyy. state $ ;
  format date yymmdd10.;
cards;
123 A111 1/5/2020 AZ
123 A111 1/4/2020 AZ
123 A111 2/5/2021 PA
234 A124 1/3/2021 NY
;

proc sql ;
create table want as
  select a.*,
   exists(select * from two where patient_id=a.patient_id and comp_id=a.comp_id and state='AZ' and date<a.date) as flag
  from one a 
;
quit;
ballardw
Super User

@nickspencer wrote:
Hi all,
I am trying to create an indicator on table1 based on information on table2.
Table1
Patient_id comp_id date
123 A111 1/2/2021
234 A124 1/3/2021
567 A234 1/4/2021

Table2
Patient_id Comp_id Date State
123 A111 1/5/2020 AZ
123 A111 1/4/2020 AZ
123 A111 2/5/2021 PA
234 A124 1/3/2021 NY
So, I want to create a column called indicator in table1 and gave indicator value of 1 if the same patient and company is present in table2 and date in table1 is less than any date for that condition in table2 and state is AZ. That means only first row in table1 will have indicator of 1. I tried few ways but got duplicate in some cases. What may be the best way to achieve this with proc sql?

 

LARGE ECONOMY SIZED HINT: If you want someone to improve on your code provide the code.

sas-innovate-wordmark-2025-midnight.png

Register Today!

Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.


Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 8 replies
  • 5138 views
  • 4 likes
  • 6 in conversation