BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
avalancheshane
Calcite | Level 5

I'm pretty much a beginner at SAS so I'm hoping this is the right place for this question and I'll do my best to explain what I'm trying to do.

I have a large dataset (5,582 rows) of patient data from a study involving depression treatment. The data is artificially generated for training purposes. A score from an assessment was collected once per week for a total of 9 weeks. The way the data is currently formatted looks like this:

Trt                  ID            Sex                 Race         Week               Score

RxA          1003               F               Caucasian       1                           .

RxB          1004               M               Caucasian      1                          .

RxA          1007               F               Caucasian       1                         30

RxA          1007               F               Caucasian       2                         31

RxA          1007               F               Caucasian       3                         36

RxA          1007               F               Caucasian       4                         36

RxA          1007               F               Caucasian       5                         35

RxA          1007               F               Caucasian       6                         15

RxA          1007               F               Caucasian       7                         12

RxA          1007               F               Caucasian       8                         10

RxA          1007               F               Caucasian       9                          7

RxB          1008               M              Caucasia         1                          .

I'm trying to create a variable defined as the time to a response of interest. In this case I want to create a variable that will tell me when each patient has reached a score of 1/2 or less than the score they had at week 1. However I am a bit confused as to how to do this since each patient has a row of data for each week in the study. I created another variable called Responder which is assigned a value of 1 when a subject's score in a given week is 1/2 or less of their week 1 score.

Using that I tried writing the following:

select;

when (Responder=1) responsetime=week;

otherwise;

end;

This variable was assigned a value for EACH week that a subject's score was 1/2 or less of the week 1 score. However, what I am attempting to do is to create a variable that will only have ONE value for each patient and that will be equal to the first week that the subject meets the criteria of <= 1/2 of the week 1 score. In other words, I don't care if a patient was at 1/2 or below the week 1 score on the 5th, 7th, and 8th weeks; I'm only interested in the fact that the subject met the criteria on week 5 and I would like the responsetime variable to be equal to 5 for that patient.

I'm not sure how best to proceed and whether this issue would be best solved by trying to manipulate my data to wind up with one row for each patient or if there is a way to do that while producing this variable.

Ideally I'd like the end result to look like

Trt          ID          Response Time

RxA     1003               N/A

RxB     1004               N/A

RxA     1007                6

RxB     1008               N/A

etc.

If it wouldn't be too difficult I'd also be interesting in knowing how to find the same response time but using subjects who didn't have a score at Week 1 but did starting on Week 2!

Any help would be greatly appreciated!    

1 ACCEPTED SOLUTION

Accepted Solutions
art297
Opal | Level 21

Since you've already created the variable, in a file called "have", and if there is a maximum number of weeks one could have (in my example less than 99), then you COULD use something like:

data have;

  informat trt $3.;

  informat sex $1.;

  informat race $9.;

  input Trt ID Sex Race Week Score ResponseTime;

  cards;

RxA 1003   F   Caucasian       1    . .

RxB 1004   M   Caucasian       1    . .

RxA 1007   F   Caucasian       1   30 .

RxA 1007   F   Caucasian       2   31 .

RxA 1007   F   Caucasian       3   36 .

RxA 1007   F   Caucasian       4   36 .

RxA 1007   F   Caucasian       5   35 .

RxA 1007   F   Caucasian       6   15 6

RxA 1007   F   Caucasian       7   12 7

RxA 1007   F   Caucasian       8   10 8

RxA 1007   F   Caucasian       9    7 9

RxB 1008   M   Caucasian       1    . .

;

proc format;

  value rt

  99="N/A"

;

run;

data want;

  set have;

  format ResponseTime rt.;

  if missing(ResponseTime) then ResponseTime=99;

run;

proc sort data=want;

  by id trt ResponseTime;

run;

proc sort data=want nodupkey;

  by id trt;

run;

I should point out that using values like 99 to represent missing can be dangerous, as others who also have to access your code might not know that the value only represents missing .. not 99 weeks, but it can definitely simplify selecting the records you want in this case.

Art

View solution in original post

2 REPLIES 2
art297
Opal | Level 21

Since you've already created the variable, in a file called "have", and if there is a maximum number of weeks one could have (in my example less than 99), then you COULD use something like:

data have;

  informat trt $3.;

  informat sex $1.;

  informat race $9.;

  input Trt ID Sex Race Week Score ResponseTime;

  cards;

RxA 1003   F   Caucasian       1    . .

RxB 1004   M   Caucasian       1    . .

RxA 1007   F   Caucasian       1   30 .

RxA 1007   F   Caucasian       2   31 .

RxA 1007   F   Caucasian       3   36 .

RxA 1007   F   Caucasian       4   36 .

RxA 1007   F   Caucasian       5   35 .

RxA 1007   F   Caucasian       6   15 6

RxA 1007   F   Caucasian       7   12 7

RxA 1007   F   Caucasian       8   10 8

RxA 1007   F   Caucasian       9    7 9

RxB 1008   M   Caucasian       1    . .

;

proc format;

  value rt

  99="N/A"

;

run;

data want;

  set have;

  format ResponseTime rt.;

  if missing(ResponseTime) then ResponseTime=99;

run;

proc sort data=want;

  by id trt ResponseTime;

run;

proc sort data=want nodupkey;

  by id trt;

run;

I should point out that using values like 99 to represent missing can be dangerous, as others who also have to access your code might not know that the value only represents missing .. not 99 weeks, but it can definitely simplify selecting the records you want in this case.

Art

avalancheshane
Calcite | Level 5

Thank you very much! This was exactly what I was looking for! I was able to adapt this code to apply to my entire data set and now have the entries merged in the fashion I was looking for! Thank you again!

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 2 replies
  • 620 views
  • 0 likes
  • 2 in conversation