BookmarkSubscribeRSS Feed
PharmDoc
Fluorite | Level 6

I have a dataset called impact with three columns that each contain a visit number: 

visit_number_a

visit_number_b

visit_number_c 

 

How do I see if any values (the count) of visit_number_a match visit_number_b? 

And if any values of visit_number_a match to visit_number_b and to visit_number_c? 

And if so, how do I generate a list of the values that match across the columns? 

 

 

Would I use proc compare? But only include values where the difference equals zero? 

The visit numbers are 11 digits long. 

 

proc compare base =impact;
var visit_number_a;
with visit_number_b;
where visit_number_a - visit_number_b = 0;
run;

 

??????

 

Thank you. 

 

 

3 REPLIES 3
ballardw
Super User

Proc compare does a record by record comparison.

 

When you say : values (the count) of visit_number_a match visit_number_b?

it is hard without any example to tell if you need to summarize data to get counts, if the value of the variable is a count or something else. And with "visit numbers" 11 digits it sounds like your values are identifications. So not really clear here what is wanted.

 

I suggest providing some example data in the form of a data step and then show what the expected result for that would be.

If you are not sure how to write a data step it might look something like:

data have;
   input visit_number_a :$11. visit_number_b :$11. visit_number_c :$11;
datalines;
12345123451 23456234562 34567345673
12345123451 33333333333 44444444444
;
 

Provide enough values to demonstrate what want done and show the expected result for your given example data.

 

I used character values because I don't think you are doing arithmetic with the values.

 

PharmDoc
Fluorite | Level 6

@ballardw 

 

 

 

data impact;
input patient_id :$5. visit_number_a :$11. visit_number_b :$11. visit_number_c :$11.;
datalines;
Z6049 24942400656 85960815714 85960815714
Z6052 49415147535 81413953798 81413953798
Z6053 66886694586 66886694586 66886694586
Z6054 76875635808 76875635808 78586120861
Z6056 86538343903 20060379083 20060379083
Z6095 18466992548 12088652966 12088652966
Z6100 32224422523 32224422523 32224422523
Z6105 94980188348 63002863712 63002863712
Z6107 14446393827 14446393827 72562481096
Z6117 70314446348 52278174988 52278174988
Z6123 59330419958 65729476921 65729476921
;
proc print data=impact;
run;

 

 

 

 

 

 

So if I were in microsoft Excel I'd have a table similar to this: 

PATIENT_IDvisit_number_a visit_number_bvisit_number_cIF_A_equals_BIF_A_equals_B_equals_C
Z6049249424006568596081571485960815714FALSEFALSE
Z6052494151475358141395379881413953798FALSEFALSE
Z6053668866945866688669458666886694586TRUETRUE
Z6054768756358087687563580878586120861TRUEFALSE
Z6056865383439032006037908320060379083FALSEFALSE
Z6095184669925481208865296612088652966FALSEFALSE
Z6100322244225233222442252332224422523TRUETRUE
Z6105949801883486300286371263002863712FALSEFALSE
Z6107144463938271444639382772562481096TRUEFALSE
Z6117703144463485227817498852278174988FALSEFALSE
Z6123593304199586572947692165729476921FALSEFALSE

 

In SAS I want to find out the number of observations where a = b and where a = b = c 

Number of observations where a = b : 4

Number of observations where a = b = c : 2

 

And generate a table of the observations where a = b:

PATIENT_IDvisit_number_a visit_number_bvisit_number_c
Z6053668866945866688669458666886694586
Z6054768756358087687563580878586120861
Z6100322244225233222442252332224422523
Z6107144463938271444639382772562481096

 

 

And generate a table of the observations where a = b = c:

PATIENT_IDvisit_number_a visit_number_bvisit_number_c
Z6053668866945866688669458666886694586
Z6100322244225233222442252332224422523

 

ballardw
Super User

Thank you for a nice clean example.

 

This creates two data sets as your show:

data A_equal_B (drop=A_B A_B_C )
     A_equal_B_C  (drop=A_B A_B_C )
;
   set impact;
   /* add the true/false*/
   A_B   = (visit_number_a =visit_number_b);
   A_B_C = (visit_number_a =visit_number_b = visit_number_c);
   if A_B then output A_equal_B;
   if A_B_C then output A_equal_B_C;

run;

Key bits: SAS returns numeric 1/0 for True/False for comparisons. So placing a comparison such as (visit_number_a =visit_number_b) in the () returns a value of 1 when true or 0 when false. SAS will allow multiple comparisons (if you are careful) at one time, equal is easy.

The explicit Output statement writes to the data set when the value is true.

 

Not sure exactly which you meant by table, so created data sets.

Alternate you could use the following to create one data set with the additional variables and then select which ones for printing (or other uses, the Where statement can be used will most procedures)

data want;
   set impact;
   /* add the true/false*/
   A_B   = (visit_number_a =visit_number_b);
   A_B_C = (visit_number_a =visit_number_b = visit_number_c);
run;

proc print data=want noobs;
   where A_B;
   Title "A equals B";
   var patient_id  visit_number: ;
run; title;

proc print data=want noobs;
   where A_B_C;
   Title "A equals B equals C";
   var patient_id  visit_number: ;
run; title;

The Where statement can use any expression that returns a "true/false" result. SAS will actually use most numbers except 0 as "true" so be careful that you know what you mean.

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 3 replies
  • 500 views
  • 0 likes
  • 2 in conversation