Help using Base SAS procedures

SAS Program to De-Duplicate with a Condition

Reply
Contributor
Posts: 32

SAS Program to De-Duplicate with a Condition

I need to create a dataset on which there are two systolic blood pressure measurements taken from the same individual. I need to take only the lowest of the two measurements only if the measurements are taken within 4 weeks from each other. If the measurements are not within four weeks then the measurement is not taken. The names and date of birth are not real.

ID                            Name                                 Date of Birth                          Systolic Blood Pressure                   Date of measurement

0001                      abc                                       12/12/1965                               145                                                     8/10/2010

0002                      abc                                       12/12/1965                               139                                                     8/20/2010

0003                      def                                        8/30/1962                                 132                                                    8/04/2010

0004                      def                                       8/30/1962                                  137                                                    8/21/2010

0005                      ghi                                       3/12/1964                                  140                                                   02/07/2010

0006                      ghi                                       3/12/1964                                  129                                                   03/17/2010

007                        jkl                                        5/10/1963                                  127                                                  01/07/2010

008                        jkl                                       5/10/1963                                   121                                                  02/13/2010     

As can be seen, abc, def, ghi and jkl are four different persons. Each of them have two readings of systolic blood pressure. I need to write SAS program to take the lowest of the two measurements, only if the date of measurements are within four weeks from each other. If not, then they are not included. In this case the first two are seen to meet that criteria of four weeks while the last two does not.

Any help will be appreciated. Thanks in advance.

Super User
Posts: 10,538

Re: SAS Program to De-Duplicate with a Condition

This may get you started:

proc summary data=have nway;

class  name date_of_birth;

var Systolic_Blood_Pressure;

output out=temp (drop=_type_ _freq_) min=

    idgroup (min(Systolic_Blood_Pressure) out[2]

             (Systolic_Blood_Pressure id Date_of_measurement)=)

             /autolabel autoname;

run;

This gets the two lowest values.

Since you don't guarantee there are two or more readings the deduplication may be a bit more.

You will have two variable in the output set above for each date of measurement.

It could be:

data want;

     set temp;

     where Date_of_measurement_1 - Date_of_measurement2 le 28;

run;

Contributor
Posts: 32

Re: SAS Program to De-Duplicate with a Condition

Thank you. Your code did help in getting started but I still need to get the desired outputs which are 1) first get the two measurements per person that are 12 weeks apart and then 2) get rid of the one that is the highest (and keep the lowest). I am playing around with this code though to get to the outputs.

Super User
Super User
Posts: 7,417

Re: SAS Program to De-Duplicate with a Condition

Well, depends on your definitions, does it mean that the two observations are exactly 12 weeks apart?  Will there only be 2 results, or many results, what happens if more than 2 results overlapping etc.  Clearly define the logic needed first.  If its just two and they are exact:

proc sql;

     create table WANT as

     select     COALESCE(A.ID,B.ID) as ID,

                    ...

                    case     when A.SYSBP is not null and A.SYSBP < B.SYSBP then A.SYSBP

                                   else B.SYSBP end as SYSBP

     from        HAVE A

     full join    HAVE B

     on           A.ID=B.ID

     and          A.DOB=B.DOB

     and           A.NAME=B.NAME

     and          ...

     and          B.MEAS=intnx('week',A.MEAS,12,'same');

quit;

Contributor
Posts: 32

Re: SAS Program to De-Duplicate with a Condition

Thank you. It is 4 weeks and not 12 weeks. But they need not be exactly 4 weeks apart. The dates could be within 4 weeks. There will be only two results, and not multiple. Your code is helpful if it needed an exact time difference (4 weeks in this case). thanks again.

Super User
Posts: 10,538

Re: SAS Program to De-Duplicate with a Condition

12 week? earlier it was 4 weeks...

The data set I show has 2 minimum values and the associated dates. I thought you might be able to determine which ones were of interest by comparing the dates.

I think you are possibly hung up on "deduplicate first" then find min as the order to accomplish the task. Sometimes the order may work better in a different sequence.

Contributor
Posts: 32

Re: SAS Program to De-Duplicate with a Condition

Thank you. It is 4 weeks. Sorry for the typo. I will try the different sequence as you suggested.

Super User
Posts: 9,687

Re: SAS Program to De-Duplicate with a Condition

You need to post the output you need according to your sample data. Otherwise We just are guessing what you want.


data have;
input ID                   $         Name    $      DateofBirth : mmddyy12.                       SystolicBloodPressure                   Dateofmeasurement : mmddyy12.;
format DateofBirth        Dateofmeasurement     mmddyy10.;
cards;
0001                      abc                                       12/12/1965                               145                                                     8/10/2010
0002                      abc                                       12/12/1965                               139                                                     8/20/2010
0003                      def                                        8/30/1962                                 132                                                    8/04/2010
0004                      def                                       8/30/1962                                  137                                                    8/21/2010
0005                      ghi                                       3/12/1964                                  140                                                   02/07/2010
0006                      ghi                                       3/12/1964                                  129                                                   03/17/2010
007                        jkl                                        5/10/1963                                  127                                                  01/07/2010
008                        jkl                                       5/10/1963                                   121                                                  02/13/2010     
;
run;
proc sql;
 create table want as
  select * 
   from have
    group by Name
      having range(Dateofmeasurement) le 28 and SystolicBloodPressure=min(SystolicBloodPressure);
quit;

Xia Keshan

Contributor
Posts: 32

Re: SAS Program to De-Duplicate with a Condition

Thank you much. Here is the desired final output:

id                 name             DateofBirth                             SystolicBloodPressire            DateofMeasurement

0001            abc                  12/12/1965                                       139                                  8/20/2010

0003            def                   8/30/1962                                         132                                  8/04/2010

To get this output there are two steps I guess. The first step will remove all the duplicates that have blood pressure measurements that are beyond 4 weeks apart from each other and keep those duplicates' measurements that are

within 4 weeks.

The second step would then give the above output where only the lowest of the measurement is kept.

Your SQL code will get me the second step i.e. the lowest of the two duplicate's measurement, but may not get me the first desired output which is list of duplicates that have measurement within 4 weeks and that are beyond four weeks. 

Super User
Posts: 9,687

Re: SAS Program to De-Duplicate with a Condition

No. I can get you the first desired output  . The question is what output you are looking for ?

proc sql;

create table want as

  select *

   from have

    group by Name

      having range(Dateofmeasurement) le 28 ;

quit;

Contributor
Posts: 32

Re: SAS Program to De-Duplicate with a Condition

Thank you very much. This was helpful.

Super User
Posts: 10,538

Re: SAS Program to De-Duplicate with a Condition

Suppose you have a patient with 16 readings all exactly 3 weeks apart. Which are kept by your deduplication rule?

Also you did not specify that your output requirement was a list of duplicates. Plus you need to provide a much more comprehensive example to show what you mean by

duplicates that have measurement within 4 weeks and that are beyond four weeks.

because that at face value is a null set.

Ask a Question
Discussion stats
  • 11 replies
  • 356 views
  • 11 likes
  • 4 in conversation