BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
PharmlyDoc
Quartz | Level 8

I have two tables, table 1 has one record per patient, table 2 has multiple records per patient with dummy variables. I want to join the tables so that the dummy variables are merged with table 1.  

Table 1.

PAT_ID AGE SEX
93704 20 F
93705 26 F
93706 28 M
93707 27 F
93708 47 M
93709 38 F
93710 44 F
93711 38 M
93712 24 F
93713 40 F
93714 40 F
93715 25 M
93716 23 M
93717 48 F
93718 41 F

 

Table 2.

PAT_ID DRUG STATIN BETA_BLOCKER ACEI DIURETIC STEROID
93705 atorvastatin 1 0 0 0 0
93705 metoprolol 0 1 0 0 0
93705 lisinopril 0 0 1 0 0
93709 hydrochlorothiazide 0 0 0 1 0
93709 rosuvastatin 1 0 0 0 0
93709 prednisone 0 0 0 0 1
93713 pravastatin 1 0 0 0 0
93715 carvedilol 0 1 0 0 0
93715 prednisone 0 0 0 0 1
93715 atorvastatin 1 0 0 0 0
93716 spironolactone 0 0 0 1 0
93717 lisinopril 0 0 1 0 0
93717 propranolol 0 1 0 0 0
93717 hydrochlorothiazide 0 0 0 1 0
93718 lisinopril 0 0 1 0 0

 

 

Want:

PAT_ID AGE SEX STATIN BETA_BLOCKER ACEI DIURETIC STEROID
93704 20 F 0 0 0 0 0
93705 26 F 1 1 1 0 0
93706 28 M 0 0 0 0 0
93707 27 F 0 0 0 0 0
93708 47 M 0 0 0 0 0
93709 38 F 1 0 0 1 1
93710 44 F 0 0 0 0 0
93711 38 M 0 0 0 0 0
93712 24 F 0 0 0 0 0
93713 40 F 1 0 0 0 0
93714 40 F 0 0 0 0 0
93715 25 M 1 1 0 0 1
93716 23 M 0 0 0 1 0
93717 48 F 0 1 1 1 0
93718 41 F 0 0 1 0 0

 

 

data table_1;
input PAT_ID AGE SEX $;
datalines;
93704 20 F
93705 26 F
93706 28 M
93707 27 F
93708 47 M
93709 38 F
93710 44 F
93711 38 M
93712 24 F
93713 40 F
93714 40 F
93715 25 M
93716 23 M
93717 48 F
93718 41 F
;

data table_2;
infile datalines dlm=',';
input PAT_ID DRUG : $19. STATIN BETA_BLOCKER ACEI DIURETIC STEROID;
datalines;
93705,atorvastatin,1,0,0,0,0
93705,metoprolol,0,1,0,0,0
93705,lisinopril,0,0,1,0,0
93709,hydrochlorothiazide,0,0,0,1,0
93709,rosuvastatin,1,0,0,0,0
93709,prednisone,0,0,0,0,1
93713,pravastatin,1,0,0,0,0
93715,carvedilol,0,1,0,0,0
93715,prednisone,0,0,0,0,1
93715,atorvastatin,1,0,0,0,0
93716,spironolactone,0,0,0,1,0
93717,lisinopril,0,0,1,0,0
93717,propranolol,0,1,0,0,0
93717,hydrochlorothiazide,0,0,0,1,0
93718,lisinopril,0,0,1,0,0
;
1 ACCEPTED SOLUTION

Accepted Solutions
Patrick
Opal | Level 21

Using the SQL below should do the job.

data table_1;
input PAT_ID AGE SEX $;
datalines;
93704 20 F
93705 26 F
93706 28 M
93707 27 F
93708 47 M
93709 38 F
93710 44 F
93711 38 M
93712 24 F
93713 40 F
93714 40 F
93715 25 M
93716 23 M
93717 48 F
93718 41 F
;

data table_2;
infile datalines dlm=',';
input PAT_ID DRUG : $19. STATIN BETA_BLOCKER ACEI DIURETIC STEROID;
datalines;
93705,atorvastatin,1,0,0,0,0
93705,metoprolol,0,1,0,0,0
93705,lisinopril,0,0,1,0,0
93709,hydrochlorothiazide,0,0,0,1,0
93709,rosuvastatin,1,0,0,0,0
93709,prednisone,0,0,0,0,1
93713,pravastatin,1,0,0,0,0
93715,carvedilol,0,1,0,0,0
93715,prednisone,0,0,0,0,1
93715,atorvastatin,1,0,0,0,0
93716,spironolactone,0,0,0,1,0
93717,lisinopril,0,0,1,0,0
93717,propranolol,0,1,0,0,0
93717,hydrochlorothiazide,0,0,0,1,0
93718,lisinopril,0,0,1,0,0
;

proc sql;
  create table want as
  select 
    l.PAT_ID,
    l.AGE, 
    l.SEX,
    max(coalesce(r.STATIN,0)) as STATIN,
    max(coalesce(r.BETA_BLOCKER,0)) as BETA_BLOCKER,
    max(coalesce(r.ACEI,0)) as ACEI,
    max(coalesce(r.DIURETIC,0)) as DIURETIC,
    max(coalesce(r.STEROID,0)) as STEROID 
  from 
    table_1 l
    left join
    table_2 r
    on
      l.PAT_ID=r.PAT_ID
  group by l.PAT_ID, l.AGE, l.SEX
  ;
quit;

Patrick_0-1631329691851.png

 

 

View solution in original post

4 REPLIES 4
Shmuel
Garnet | Level 18

Having one to many relationship between the two tables. you can use sas base, merge statement:

 

data want;
merge table1
      table2(drop=drug);
  by pat_id;
run;
tarheel13
Rhodochrosite | Level 12

That gets more than 1 row per patient though.

Patrick
Opal | Level 21

Using the SQL below should do the job.

data table_1;
input PAT_ID AGE SEX $;
datalines;
93704 20 F
93705 26 F
93706 28 M
93707 27 F
93708 47 M
93709 38 F
93710 44 F
93711 38 M
93712 24 F
93713 40 F
93714 40 F
93715 25 M
93716 23 M
93717 48 F
93718 41 F
;

data table_2;
infile datalines dlm=',';
input PAT_ID DRUG : $19. STATIN BETA_BLOCKER ACEI DIURETIC STEROID;
datalines;
93705,atorvastatin,1,0,0,0,0
93705,metoprolol,0,1,0,0,0
93705,lisinopril,0,0,1,0,0
93709,hydrochlorothiazide,0,0,0,1,0
93709,rosuvastatin,1,0,0,0,0
93709,prednisone,0,0,0,0,1
93713,pravastatin,1,0,0,0,0
93715,carvedilol,0,1,0,0,0
93715,prednisone,0,0,0,0,1
93715,atorvastatin,1,0,0,0,0
93716,spironolactone,0,0,0,1,0
93717,lisinopril,0,0,1,0,0
93717,propranolol,0,1,0,0,0
93717,hydrochlorothiazide,0,0,0,1,0
93718,lisinopril,0,0,1,0,0
;

proc sql;
  create table want as
  select 
    l.PAT_ID,
    l.AGE, 
    l.SEX,
    max(coalesce(r.STATIN,0)) as STATIN,
    max(coalesce(r.BETA_BLOCKER,0)) as BETA_BLOCKER,
    max(coalesce(r.ACEI,0)) as ACEI,
    max(coalesce(r.DIURETIC,0)) as DIURETIC,
    max(coalesce(r.STEROID,0)) as STEROID 
  from 
    table_1 l
    left join
    table_2 r
    on
      l.PAT_ID=r.PAT_ID
  group by l.PAT_ID, l.AGE, l.SEX
  ;
quit;

Patrick_0-1631329691851.png

 

 

PharmlyDoc
Quartz | Level 8

@tarheel13 Thanks for answering,  I noticed that your method does not provide the correct output if a patient is taking more than one medication in the same drug class, for example 2 diuretics, and then distinct values for DIURETIC would be .,0,1,2 instead of just .,0,1. 

 

 

 

 

@Patrick 

Your method works.  Though lets say table_1  has 30 columns instead of 3, in which case I modified your code grouping only by PAT_ID and then used proc sort with the nodupkey option.

proc sql;
  create table want as
  select 
    l.*,
    max(coalesce(r.STATIN,0)) as STATIN,
    max(coalesce(r.BETA_BLOCKER,0)) as BETA_BLOCKER,
    max(coalesce(r.ACEI,0)) as ACEI,
    max(coalesce(r.DIURETIC,0)) as DIURETIC,
    max(coalesce(r.STEROID,0)) as STEROID 
  from 
    table_1 l
    left join
    table_2 r
    on
      l.PAT_ID=r.PAT_ID
  group by l.PAT_ID
  ;
quit;

PROC SORT DATA=want OUT=want_NODUP NODUPKEY;
BY PAT_ID;
RUN;

 

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 4 replies
  • 2508 views
  • 1 like
  • 4 in conversation