cj3
Fluorite | Level 6

Hello Community,

 

I am working on a data analysis problem for which I would greatly appreciate some programming help. Basically, I need to create a variable for each subject in my dataset that indicates whether (yes/no) they reported drug use prior to the study randomization date (i.e., baseline drug use).  Below is what my dataset looks like as well as the desired output. Any feedback would be much appreciated! Please let me know if any additional information or clarification would be helpful.

 

Have:

ID | Randomization date | Week start date | D1 date  | D2 date  | D3 date  | D4 date  | D5 date  | D6 date  | D7 date  | D1 drug use | D2 drug use | D3 drug use | D4 drug use | D5 drug use | D6 drug use | D7 drug use
01 | 04/02/18           | 02/25/18        | .        | .        | .        | 02/28/18 | 03/01/18 | 03/02/18 | 03/03/18 | .           | .           | .           | 0           | 0           | 1           | 0
01 | 04/02/18           | 03/04/18        | 03/04/18 | 03/05/18 | 03/06/18 | 03/07/18 | 03/08/18 | 03/09/18 | 03/10/18 | 0           | 0           | 0           | 0           | 0           | 1           | 0
01 | 04/02/18           | 03/11/18        | 03/11/18 | 03/12/18 | 03/13/18 | 03/14/18 | 03/15/18 | 03/16/18 | 03/17/18 | 0           | 1           | 0           | 0           | 0           | 0           | 0
01 | 04/02/18           | 03/18/18        | 03/18/18 | 03/19/18 | 03/20/18 | 03/21/18 | 03/22/18 | 03/23/18 | 03/24/18 | 0           | 0           | 0           | 0           | 0           | 0           | 0
01 | 04/02/18           | 03/25/18        | 03/25/18 | 03/26/18 | 03/27/18 | 03/28/18 | 03/29/18 | 03/30/18 | 03/31/18 | 0           | 0           | 1           | 0           | 0           | 1           | 0
01 | 04/02/18           | 04/01/18        | 04/01/18 | 04/02/18 | 04/03/18 | 04/04/18 | 04/05/18 | 04/06/18 | 04/07/18 | 0           | 0           | 0           | 0           | 0           | 0           | 0
01 | 04/02/18           | 04/08/18        | 04/08/18 | 04/09/18 | 04/10/18 | 04/11/18 | 04/12/18 | 04/13/18 | 04/14/18 | 0           | 0           | 0           | 0           | 0           | 0           | 0

Note: The data includes 30 days prior to each subject’s randomization date; hence, the missing data for some cells. My dataset also includes 6 months of data after the randomization date; however, these data are irrelevant to my analysis question RE: baseline drug use.

 

Want:

ID | Baseline drug use
01 | 1
02 | 0
03 | 1

1 ACCEPTED SOLUTION

PaigeMiller
Diamond | Level 26

This is untested code. If you want code that I have tested, you need to provide sample data as a SAS data step.

 

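data have2;
    set have;
    /* You type the full list of variable names; keep both arrays in the same day order */
    array drug_use d1_drug_use d2_drug_use ... ;
    array dates d1_date d2_date ... ;
    /* Flag is 1 if any day in this week's row shows drug use before randomization */
    baseline_use_this_week=0;
    do i=1 to dim(drug_use);
         if drug_use(i)=1 and dates(i)<randomization_date then 
             baseline_use_this_week=1;
    end;
    drop i; /* loop counter is not needed in the output */
run;
/* One row per ID: the maximum of the weekly flags is 1 if any week had baseline use */
proc summary data=have2 nway;
    class id; 
    var baseline_use_this_week;
    output out=want(drop=_type_ _freq_) max=baseline_drug_use;
run;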
--
Paige Miller
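
For anyone who wants to try the approach above end to end, here is a minimal sketch that enters the ID 01 rows from the question as a SAS data step and then runs the same array logic with the variable lists written out. The variable names (id, randomization_date, week_start_date, d1_date to d7_date, d1_drug_use to d7_drug_use) are assumptions based on the column headings, not the actual names in the original dataset, and only ID 01 is included because the question shows no rows for the other subjects.

```sas
/* Sample rows for ID 01, typed in from the table in the question.
   All variable names are assumed; rename them to match your data.
   Each record spans three data lines: keys, the 7 dates, the 7 flags. */
data have;
    length id $ 2;
    informat randomization_date week_start_date
             d1_date d2_date d3_date d4_date d5_date d6_date d7_date mmddyy8.;
    format   randomization_date week_start_date
             d1_date d2_date d3_date d4_date d5_date d6_date d7_date mmddyy8.;
    input id $ randomization_date week_start_date
        / d1_date d2_date d3_date d4_date d5_date d6_date d7_date
        / d1_drug_use d2_drug_use d3_drug_use d4_drug_use
          d5_drug_use d6_drug_use d7_drug_use;
    datalines;
01 04/02/18 02/25/18
. . . 02/28/18 03/01/18 03/02/18 03/03/18
. . . 0 0 1 0
01 04/02/18 03/04/18
03/04/18 03/05/18 03/06/18 03/07/18 03/08/18 03/09/18 03/10/18
0 0 0 0 0 1 0
01 04/02/18 03/11/18
03/11/18 03/12/18 03/13/18 03/14/18 03/15/18 03/16/18 03/17/18
0 1 0 0 0 0 0
01 04/02/18 03/18/18
03/18/18 03/19/18 03/20/18 03/21/18 03/22/18 03/23/18 03/24/18
0 0 0 0 0 0 0
01 04/02/18 03/25/18
03/25/18 03/26/18 03/27/18 03/28/18 03/29/18 03/30/18 03/31/18
0 0 1 0 0 1 0
01 04/02/18 04/01/18
04/01/18 04/02/18 04/03/18 04/04/18 04/05/18 04/06/18 04/07/18
0 0 0 0 0 0 0
01 04/02/18 04/08/18
04/08/18 04/09/18 04/10/18 04/11/18 04/12/18 04/13/18 04/14/18
0 0 0 0 0 0 0
;
run;

/* Weekly flag: 1 if any day in the row has drug use before randomization.
   Days with a missing date also have missing drug use here, so they never
   satisfy drug_use{i} = 1 and are skipped. */
data have2;
    set have;
    array drug_use {7} d1_drug_use d2_drug_use d3_drug_use d4_drug_use
                       d5_drug_use d6_drug_use d7_drug_use;
    array dates {7} d1_date d2_date d3_date d4_date d5_date d6_date d7_date;
    baseline_use_this_week = 0;
    do i = 1 to dim(drug_use);
        if drug_use{i} = 1 and dates{i} < randomization_date then
            baseline_use_this_week = 1;
    end;
    drop i;
run;

/* Collapse to one row per ID by taking the maximum of the weekly flags */
proc summary data=have2 nway;
    class id;
    var baseline_use_this_week;
    output out=want(drop=_type_ _freq_) max=baseline_drug_use;
run;

proc print data=want noobs;
run;
```

With these sample rows, ID 01 should come out with baseline_drug_use = 1, which matches the Want table. Weeks after the randomization date need no special handling because the date comparison excludes them.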


3 REPLIES 3
PaigeMiller
Diamond | Level 26

Basically, I need to create a variable for each subject in my dataset that indicates whether (yes/no) they reported drug use prior to the study randomization date (i.e., baseline drug use).

 

Is subject the variable named ID? How do we know from the table you show if (yes/no) they reported drug use? How do we know it was prior to the study randomization date? What is "baseline drug use"?

--
Paige Miller
cj3
Fluorite | Level 6

Hi @PaigeMiller Yes, the variable name "ID" indicates the subject.

 

Baseline drug use is defined as any drug use (yes vs. no) before the randomization date.

 

Drug use is indicated by a "1" in the "D1....D7 drug use" columns. Each row of data for each participant represents an entire week. The "D1 drug use" column indicates drug use for the "D1 date" column, the "D2 drug use" column indicates drug use for the "D2 date" column, and so on.

 

To know whether drug use occurred before the randomization date, you would check each date prior to the date listed for the "Randomization date" variable. In this case, the randomization date is 04/02/18. Therefore, I need to determine drug use on 04/01/18 and before. In this example, drug use occurred on 03/02/18, 03/09/18, 03/12/18, 03/27/18, and 03/30/18. So, the baseline drug use variable would be "1".

 

Please let me know if you have any other questions. Thank you for your help! 

