BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
PharmlyDoc
Quartz | Level 8

I have two tables, table 1 has one record per patient, table 2 has multiple records per patient with dummy variables. I want to join the tables so that the dummy variables are merged with table 1.  

Table 1.

PAT_ID AGE SEX
93704 20 F
93705 26 F
93706 28 M
93707 27 F
93708 47 M
93709 38 F
93710 44 F
93711 38 M
93712 24 F
93713 40 F
93714 40 F
93715 25 M
93716 23 M
93717 48 F
93718 41 F

 

Table 2.

PAT_ID DRUG STATIN BETA_BLOCKER ACEI DIURETIC STEROID
93705 atorvastatin 1 0 0 0 0
93705 metoprolol 0 1 0 0 0
93705 lisinopril 0 0 1 0 0
93709 hydrochlorothiazide 0 0 0 1 0
93709 rosuvastatin 1 0 0 0 0
93709 prednisone 0 0 0 0 1
93713 pravastatin 1 0 0 0 0
93715 carvedilol 0 1 0 0 0
93715 prednisone 0 0 0 0 1
93715 atorvastatin 1 0 0 0 0
93716 spironolactone 0 0 0 1 0
93717 lisinopril 0 0 1 0 0
93717 propranolol 0 1 0 0 0
93717 hydrochlorothiazide 0 0 0 1 0
93718 lisinopril 0 0 1 0 0

 

 

Want:

PAT_ID AGE SEX STATIN BETA_BLOCKER ACEI DIURETIC STEROID
93704 20 F 0 0 0 0 0
93705 26 F 1 1 1 0 0
93706 28 M 0 0 0 0 0
93707 27 F 0 0 0 0 0
93708 47 M 0 0 0 0 0
93709 38 F 1 0 0 1 1
93710 44 F 0 0 0 0 0
93711 38 M 0 0 0 0 0
93712 24 F 0 0 0 0 0
93713 40 F 1 0 0 0 0
93714 40 F 0 0 0 0 0
93715 25 M 1 1 0 0 1
93716 23 M 0 0 0 1 0
93717 48 F 0 1 1 1 0
93718 41 F 0 0 1 0 0

 

 

data table_1;
input PAT_ID AGE SEX $;
datalines;
93704 20 F
93705 26 F
93706 28 M
93707 27 F
93708 47 M
93709 38 F
93710 44 F
93711 38 M
93712 24 F
93713 40 F
93714 40 F
93715 25 M
93716 23 M
93717 48 F
93718 41 F
;

data table_2;
infile datalines dlm=',';
input PAT_ID DRUG : $19. STATIN BETA_BLOCKER ACEI DIURETIC STEROID;
datalines;
93705,atorvastatin,1,0,0,0,0
93705,metoprolol,0,1,0,0,0
93705,lisinopril,0,0,1,0,0
93709,hydrochlorothiazide,0,0,0,1,0
93709,rosuvastatin,1,0,0,0,0
93709,prednisone,0,0,0,0,1
93713,pravastatin,1,0,0,0,0
93715,carvedilol,0,1,0,0,0
93715,prednisone,0,0,0,0,1
93715,atorvastatin,1,0,0,0,0
93716,spironolactone,0,0,0,1,0
93717,lisinopril,0,0,1,0,0
93717,propranolol,0,1,0,0,0
93717,hydrochlorothiazide,0,0,0,1,0
93718,lisinopril,0,0,1,0,0
;
1 ACCEPTED SOLUTION

Accepted Solutions
Patrick
Opal | Level 21

Using the SQL below should do the job.

data table_1;
input PAT_ID AGE SEX $;
datalines;
93704 20 F
93705 26 F
93706 28 M
93707 27 F
93708 47 M
93709 38 F
93710 44 F
93711 38 M
93712 24 F
93713 40 F
93714 40 F
93715 25 M
93716 23 M
93717 48 F
93718 41 F
;

data table_2;
infile datalines dlm=',';
input PAT_ID DRUG : $19. STATIN BETA_BLOCKER ACEI DIURETIC STEROID;
datalines;
93705,atorvastatin,1,0,0,0,0
93705,metoprolol,0,1,0,0,0
93705,lisinopril,0,0,1,0,0
93709,hydrochlorothiazide,0,0,0,1,0
93709,rosuvastatin,1,0,0,0,0
93709,prednisone,0,0,0,0,1
93713,pravastatin,1,0,0,0,0
93715,carvedilol,0,1,0,0,0
93715,prednisone,0,0,0,0,1
93715,atorvastatin,1,0,0,0,0
93716,spironolactone,0,0,0,1,0
93717,lisinopril,0,0,1,0,0
93717,propranolol,0,1,0,0,0
93717,hydrochlorothiazide,0,0,0,1,0
93718,lisinopril,0,0,1,0,0
;

proc sql;
  create table want as
  select 
    l.PAT_ID,
    l.AGE, 
    l.SEX,
    max(coalesce(r.STATIN,0)) as STATIN,
    max(coalesce(r.BETA_BLOCKER,0)) as BETA_BLOCKER,
    max(coalesce(r.ACEI,0)) as ACEI,
    max(coalesce(r.DIURETIC,0)) as DIURETIC,
    max(coalesce(r.STEROID,0)) as STEROID 
  from 
    table_1 l
    left join
    table_2 r
    on
      l.PAT_ID=r.PAT_ID
  group by l.PAT_ID, l.AGE, l.SEX
  ;
quit;

Patrick_0-1631329691851.png

 

 

View solution in original post

4 REPLIES 4
Shmuel
Garnet | Level 18

Having one to many relationship between the two tables. you can use sas base, merge statement:

 

data want;
merge table1
      table2(drop=drug);
  by pat_id;
run;
tarheel13
Rhodochrosite | Level 12

That gets more than 1 row per patient though.

Patrick
Opal | Level 21

Using the SQL below should do the job.

data table_1;
input PAT_ID AGE SEX $;
datalines;
93704 20 F
93705 26 F
93706 28 M
93707 27 F
93708 47 M
93709 38 F
93710 44 F
93711 38 M
93712 24 F
93713 40 F
93714 40 F
93715 25 M
93716 23 M
93717 48 F
93718 41 F
;

data table_2;
infile datalines dlm=',';
input PAT_ID DRUG : $19. STATIN BETA_BLOCKER ACEI DIURETIC STEROID;
datalines;
93705,atorvastatin,1,0,0,0,0
93705,metoprolol,0,1,0,0,0
93705,lisinopril,0,0,1,0,0
93709,hydrochlorothiazide,0,0,0,1,0
93709,rosuvastatin,1,0,0,0,0
93709,prednisone,0,0,0,0,1
93713,pravastatin,1,0,0,0,0
93715,carvedilol,0,1,0,0,0
93715,prednisone,0,0,0,0,1
93715,atorvastatin,1,0,0,0,0
93716,spironolactone,0,0,0,1,0
93717,lisinopril,0,0,1,0,0
93717,propranolol,0,1,0,0,0
93717,hydrochlorothiazide,0,0,0,1,0
93718,lisinopril,0,0,1,0,0
;

proc sql;
  create table want as
  select 
    l.PAT_ID,
    l.AGE, 
    l.SEX,
    max(coalesce(r.STATIN,0)) as STATIN,
    max(coalesce(r.BETA_BLOCKER,0)) as BETA_BLOCKER,
    max(coalesce(r.ACEI,0)) as ACEI,
    max(coalesce(r.DIURETIC,0)) as DIURETIC,
    max(coalesce(r.STEROID,0)) as STEROID 
  from 
    table_1 l
    left join
    table_2 r
    on
      l.PAT_ID=r.PAT_ID
  group by l.PAT_ID, l.AGE, l.SEX
  ;
quit;

Patrick_0-1631329691851.png

 

 

PharmlyDoc
Quartz | Level 8

@tarheel13 Thanks for answering,  I noticed that your method does not provide the correct output if a patient is taking more than one medication in the same drug class, for example 2 diuretics, and then distinct values for DIURETIC would be .,0,1,2 instead of just .,0,1. 

 

 

 

 

@Patrick 

Your method works.  Though lets say table_1  has 30 columns instead of 3, in which case I modified your code grouping only by PAT_ID and then used proc sort with the nodupkey option.

proc sql;
  create table want as
  select 
    l.*,
    max(coalesce(r.STATIN,0)) as STATIN,
    max(coalesce(r.BETA_BLOCKER,0)) as BETA_BLOCKER,
    max(coalesce(r.ACEI,0)) as ACEI,
    max(coalesce(r.DIURETIC,0)) as DIURETIC,
    max(coalesce(r.STEROID,0)) as STEROID 
  from 
    table_1 l
    left join
    table_2 r
    on
      l.PAT_ID=r.PAT_ID
  group by l.PAT_ID
  ;
quit;

PROC SORT DATA=want OUT=want_NODUP NODUPKEY;
BY PAT_ID;
RUN;

 

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 4 replies
  • 1623 views
  • 1 like
  • 4 in conversation