papayan_ah
Calcite | Level 5

Hi, 

I am only one step beyond novice level in SAS, and I am stuck coding a specific variable. There are around 300 study participant IDs.

I am looking at rejection biopsy grades at 52, 104, 156, 208, and 260 weeks after surgery, each with a window of +/- 13.035 weeks. Each ID has a maximum of 10 biopsies (bx_1, bx_2, bx_3, bx_4, etc.), and I have calculated the time from surgery to each biopsy (tt_bx#).

The issue is that some IDs have multiple biopsies within the 52 +/- 13.035 week window, and likewise within the later windows. How do I code SAS to choose the bx_# that is closest to the central value of each window?

Here is the code I have so far, which works only if there is a single biopsy within the window. I am trying to code the new variable acr_1yr in this example.

data work.cleaning;
set work.cleaning;
* Biopsy 1 Year: ACR;
if tt_bx1>=39.108 and tt_bx1<=65.178 then acr_1yr=bx1_acr;
if tt_bx2>=39.108 and tt_bx2<=65.178 then acr_1yr=bx2_acr;
if tt_bx3>=39.108 and tt_bx3<=65.178 then acr_1yr=bx3_acr;
if tt_bx4>=39.108 and tt_bx4<=65.178 then acr_1yr=bx4_acr;
if tt_bx5>=39.108 and tt_bx5<=65.178 then acr_1yr=bx5_acr;
if tt_bx6>=39.108 and tt_bx6<=65.178 then acr_1yr=bx6_acr;
if tt_bx7>=39.108 and tt_bx7<=65.178 then acr_1yr=bx7_acr;
if tt_bx8>=39.108 and tt_bx8<=65.178 then acr_1yr=bx8_acr;
if tt_bx9>=39.108 and tt_bx9<=65.178 then acr_1yr=bx9_acr;
if tt_bx10>=39.108 and tt_bx10<=65.178 then acr_1yr=bx10_acr;
run;

Any help would be appreciated.

2 REPLIES
mkeintz
PROC Star

I've renamed your variables to make it easier to list them.  In the absence of sample data, this program is untested:

 

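data want (drop=i _:);
  set have;
  array  tt {10} tt_bx1  - tt_bx10;    /* weeks from surgery to each biopsy */
  array  bx {10} bx_acr1 - bx_acr10;   /* ACR grade of each biopsy */

  array  acr_yr {5}  acr_yr1-acr_yr5;  /* grade chosen for years 1-5 */

  do _year=1 to 5;
    _target = 52 * _year;               /* window center in weeks */
    _mindist = .;
    do i=1 to n(of tt{*});
      if -13.035<=(tt{i}-_target)<=13.035 then do;
        _distance=abs(tt{i}-_target);
        /* MIN ignores missing values, so the first in-window biopsy always qualifies */
        if _distance=min(_distance,_mindist) then do;
          _mindist=_distance;
          acr_yr{_year}=bx{i};
        end;
      end;
    end;
  end;
run;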

 

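This program assumes that the non-missing tt_bx values are left-aligned: a tt_bx value is non-missing only if the tt_bx value to its left is also non-missing (e.g. tt_bx4 is non-missing only if tt_bx3 is non-missing). In other words, the tt_bx variables are filled in order, starting with tt_bx1, then tt_bx2, and so on. The inner loop relies on this because it stops at n(of tt{*}), the count of non-missing times.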
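If the left-alignment assumption might not hold, one possible variation is to loop over every array element and skip missing times explicitly. The following is an untested sketch, not a replacement for the program above: the `have` row is invented purely for illustration, the grades are assumed to be numeric, and a strict `<` is used so that ties keep the earlier biopsy (the program above keeps the later one). Note also that the bounds in the original question (39.108 and 65.178) have a midpoint of 52.143 weeks, roughly 365/7, so changing the multiplier from 52 to 52.143 would reproduce those bounds; the sketch keeps 52 to match the program above.

```sas
/* Hypothetical test row; all values are invented for illustration. */
/* tt_bx2 is missing while tt_bx3 and tt_bx4 are not (a "gap").     */
data have;
  infile datalines truncover;
  input id tt_bx1-tt_bx10 bx_acr1-bx_acr10;
  datalines;
1 45 . 60 110 . . . . . . 1 . 2 0 . . . . . .
;

data want (drop=_:);
  set have;
  array tt     {10} tt_bx1  - tt_bx10;   /* weeks from surgery to each biopsy */
  array bx     {10} bx_acr1 - bx_acr10;  /* ACR grade (assumed numeric)       */
  array acr_yr {5}  acr_yr1 - acr_yr5;   /* chosen grade for years 1-5        */

  do _year = 1 to dim(acr_yr);
    _target  = 52 * _year;               /* window center in weeks            */
    _mindist = .;
    do _i = 1 to dim(tt);                /* visit all 10 slots, not just n()  */
      if missing(tt{_i}) then continue;  /* skip gaps instead of stopping     */
      _distance = abs(tt{_i} - _target);
      /* keep this biopsy if it is in the window and strictly closer */
      if _distance <= 13.035 and
         (missing(_mindist) or _distance < _mindist) then do;
        _mindist      = _distance;
        acr_yr{_year} = bx{_i};
      end;
    end;
  end;
run;
```

For the invented row, a loop bounded by n(of tt{*}) would stop at 3 and never examine tt_bx4 (110 weeks), whereas this version considers it for the 104-week window.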

PaigeMiller
Diamond | Level 26

I have to admit that, given your explanation, some steps still seem to be missing for me to make sense of your code.

Nevertheless, it seems to me that you can use PROC STDIZE to subtract the mean from each variable. In the resulting data set, the reading closest to zero in absolute value is the one you want (I think).
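The "closest to zero in absolute value" idea can also be sketched in a plain DATA step. To be clear about the substitution: this sketch subtracts the fixed window center (52 weeks) rather than using PROC STDIZE to subtract a mean, and it is untested against real data. The `have` row, the `_dev` variables, and the numeric grades are all assumptions made for illustration.

```sas
/* Hypothetical test row; all values are invented for illustration. */
data have;
  infile datalines truncover;
  input id tt_bx1-tt_bx10 bx_acr1-bx_acr10;
  datalines;
1 45 . 60 110 . . . . . . 1 . 2 0 . . . . . .
;

data want (drop=_:);
  set have;
  array tt  {10} tt_bx1  - tt_bx10;   /* weeks from surgery to each biopsy  */
  array bx  {10} bx_acr1 - bx_acr10;  /* ACR grade (assumed numeric)        */
  array dev {10} _dev1   - _dev10;    /* absolute deviation from the center */

  do _i = 1 to dim(tt);
    if not missing(tt{_i}) then dev{_i} = abs(tt{_i} - 52);
  end;

  /* MIN ignores missing deviations; it is missing only if all are missing */
  _best = min(of dev{*});

  /* A missing value compares as less than any number in SAS, so test for  */
  /* it explicitly before applying the +/- 13.035 week window.             */
  if not missing(_best) and _best <= 13.035 then
    acr_1yr = bx{whichn(_best, of dev{*})};  /* first slot with that deviation */
run;
```

WHICHN returns the position of the first matching value, so a tie between two biopsies resolves to the earlier one. The same pattern could be repeated for the other centers (104, 156, 208, and 260 weeks).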

--
Paige Miller
