BookmarkSubscribeRSS Feed
SannaSanna
Quartz | Level 8

Hi-

 

Asking this super SAS support group for some help.  I am trying to compare the contents (data) from two different datasets.  Both have the same variable names but the number of observations differ.  I tried proc compare to compare to obtain differences but it was difficult to interpret the proc compare results.  Can anyone help me on how I can go about producing an outpuf file upon comparing patient information from both datasets.  It would be helpful if the output would contain only patient data from from table A not matching patient data in table B.   Any help would be appreciated.  Thank you all in advance!

11 REPLIES 11
Reeza
Super User

Proc compare has it's limitations. A simple SQL query will generate what you want, the code below will select observations from TableA where the ID is not in TableB. You can replace the ID variable by your ID variable and table name with your table names. 

 

proc sql;
create table missing as
select * from tableA
where ID not in (select distinct ID from tableB);
quit;

Can you have multiple records per patient? How should that be handled?

SannaSanna
Quartz | Level 8

Hi-  Thank you so much for responding in my quest for help!! 

 

Yes.  There could be multiple records per patient and the patient record data could be reported from different location sites.  I want to compare the patient data at location ABC from Table A against the same patient data at location XYZ from Table B.   Thank you!

DeaT
Obsidian | Level 7

Maybe not very elegant, but this should do the trick.

 


/* Get some dummy data. */
%let varnames=age height weight;

data site1;
set sashelp.class;
age = age +1;
run;

data site2;
set sashelp.class;
run;


/* Rename your variables. */
%macro xxx (site);

proc datasets library=work memtype=data nolist;
 modify &site.;
 rename
 %do i=1 %to %sysfunc(countw(&varnames));
  %scan(&varnames,&i)= &site._&i
 %end;
 ;
quit;

%mend xxx;

%xxx (site1);
%xxx (site2);


/* Comparison. */
%macro yyy;

data comp;
    merge site1 (in = a) site2 (in = b);
    by name ; /* plus any other unique keys you may have */

    if a and b then flag =          "Site 1 & 2 ";
    else if a and not b then flag = "Site 1 only";
    else if not a and b then flag = "Site 2 only";

 %do i=1 %to %sysfunc(countw(&varnames));
   diff&i. = sum (site1_&i, -site2_&i);
 %end;

run;
%mend yyy;

%yyy;

SannaSanna
Quartz | Level 8

Thank you for your response.  I'm not too versed with macros but I'll pull my reference books out to review again.   Do you know if the SQL stmts below would work?   To get the differences and then another pass to get the intersects/commons?

 

Proc SQL;

  create table missing as

  select * from Tab1

     except

 select * from Tab2;

Quit;

 

Proc SQL;

  create table missing as

  select * from Tab1

     intersect

 select * from Tab2;

Quit;

 

SannaSanna
Quartz | Level 8

Hi-

 

Thank you for your help.  I am trying to compare 15 variables from Tab1 and Tab2.   Can you help explain what this part of your code does. 

 

/* Rename your variables. */

%macro xxx (site);

proc datasets library=work memtype=data nolist;

modify &site.;

rename

%do i=1 %to %sysfunc(countw(&varnames));

%scan(&varnames,&i)= &site._&i

%end;

;

quit;

%mend xxx;

%xxx (site1);

%xxx (site2);

 

Reeza
Super User

@SannaSanna Did you try the initial code I suggested?

SannaSanna
Quartz | Level 8

Hi Reeza,

 

Yes I did.  Unfortunately it didnt work.  I think its because I have duplicate IDs (which is valid for my dataset) But I dont think SAS liked it.  

Reeza
Super User

@SannaSanna wrote:

Hi Reeza,

 

Yes I did.  Unfortunately it didnt work.  I think its because I have duplicate IDs (which is valid for my dataset) But I dont think SAS liked it.  


 

 

@SannaSanna Duplicate ID's shouldn't have mattered, did you get an error or results that were unsatisfactory?

SannaSanna
Quartz | Level 8

Hi Reeza, 

 

Thank you so much for your help. 

 

** trying to compare Table A with Table B to determine contents in Table A changed/missing in Table B and produce output for additional review**

 

This is the code I applied which you suggested.   

proc sql;

create table missing as

select * from TABLEA

where PATIENTID LOCATION DATE PROC OUTCOME

not in (select distinct PATIENTID LOCATION DATE PROC OUTCOME from TABLEB);

quit;

 

Syntax error after PatientID

ERROR 22-322: Syntax error, expecting one of the following: ;, !, !!, &, (, *, **, +, -, '.', /, <, <=, <>, =, >, >=, ?, AND, BETWEEN, CONTAINS, EQ, EQT, EXCEPT, GE, GET, GROUP, GT, GTT, HAVING, IN,

INTERSECT, IS, LE, LET, LIKE, LT, LTT, NE, NET, NOT, NOTIN, OR, ORDER, OUTER, UNION, ^, ^=, |, ||, ~, ~=.

ERROR 76-322: Syntax error, statement will be ignored.

 

DeaT
Obsidian | Level 7

Hi Suzanne,

 

Tha code simply renames your variables into site1_1 (the first variable  from your first dataset, in the order you listed in varnames: %let varnames=age height weight;), site1_2 ( the second variable  from your first dataset),... site1_n, site2_1, etc. 

 

I assumed that you want to do a difference between the two tables (diff&i. = sum (site1_&i, -site2_&i);) which was the reason I renamed the variables in the first place.

 

If that's the case, to use that code do the following:

1) in %let varnames=age height weight; replace age, height, weight with your variables from the two datasets.

2) name your datasets site1 and site2 (or change their name when you call the macro xxx).

3) replace by name with by (your list of unique identifiers).

 

I hope this helps.

SannaSanna
Quartz | Level 8

Hi-

 

I think I almost got it to run correctly.  The code failed untimately.   The comparison code-  why is my data being converted from Character to Numeric?    The data contents within TableA and TableB are all string/characters/YN and no numeric.  And does the data need to be sorted first in the merge step?  << got error message too

 

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 11 replies
  • 6506 views
  • 1 like
  • 3 in conversation