BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
lmyers2
Obsidian | Level 7

Hello,

 

I'm trying to get a count of the distinct patients existing in two datasets. If I use proc sql and the patient numbers are the same in both datasets, they overwrite each other. Is there a proc sql query that scans 2 datasets and produces a number (in this case 3)?  Below are sample data including an example of output I'm looking for.

 

Current data 1

MRN

1

2

3

10

 

Current data 2

MRN

1

2

3

4

5

 

MRN_both

1

2

3

 

Best

Laura

1 ACCEPTED SOLUTION

Accepted Solutions
Reeza
Super User
You could look at PROC COMPARE but it's a bit of an overkill.

You could also do a SQL query.

proc sql;
create table want_sql as
select distinct t2.ID from current_data1 as t1 where t1.id in (select distinct t2.ID from current_data2 t2);
quit;

You could also do a data step merge.

data want_data;
merge current_data1 (in=t1) and current_data2 (in=t2);
by ID;
if t1 and t2;
run;

The Data step approach will not work if you have duplicate IDs in either tables but the SQL one will. You could de-dup either solution after though.

View solution in original post

3 REPLIES 3
mkeintz
PROC Star

@lmyers2 wrote:(emphasis mine)


Hello,

 

I'm trying to get a count of the distinct patients existing in two datasets. If I use proc sql and the patient numbers are the same in both datasets, they overwrite each other. Is there a proc sql query that scans 2 datasets and produces a number (in this case 3)?  Below are sample data including an example of output I'm looking for.

 

Current data 1

MRN

1

2

3

10

 

Current data 2

MRN

1

2

3

4

5

 

MRN_both

1

2

3

 

Best

Laura


Why is the fact that they "overwrite each other" a problem.  This means that you will get 1 observation per common MRN, which in turns means you will have only 3 observations - the number you want

 

proc sql noprint;
  create table _null_ as  select a.mrn from 
    data1 as a 
    join 
    data2 as b 
    on a.mrn=b.mrn;
quit;
%put &=sqlobs;

This creates table _NULL_ which is not an actual physical data set, but it does cause SQL to behave as if it were.  So there will be a count of qualifying matches in macrovar SQLOBS.  If you can an actual table, change _NULL_ to a dataset or table name.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
Reeza
Super User
You could look at PROC COMPARE but it's a bit of an overkill.

You could also do a SQL query.

proc sql;
create table want_sql as
select distinct t2.ID from current_data1 as t1 where t1.id in (select distinct t2.ID from current_data2 t2);
quit;

You could also do a data step merge.

data want_data;
merge current_data1 (in=t1) and current_data2 (in=t2);
by ID;
if t1 and t2;
run;

The Data step approach will not work if you have duplicate IDs in either tables but the SQL one will. You could de-dup either solution after though.
Ksharp
Super User
data data1;
input MRN;
cards;
1
2
3
10
;
 

data data2;
input MRN;
cards;
1
2
3
4
5
;

proc sql;
create table want as
select mrn from data1
intersect
select mrn from data2;
quit;

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

5 Steps to Your First Analytics Project Using SAS

For SAS newbies, this video is a great way to get started. James Harroun walks through the process using SAS Studio for SAS OnDemand for Academics, but the same steps apply to any analytics project.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 3 replies
  • 1429 views
  • 0 likes
  • 4 in conversation