BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
alpine_nights
Fluorite | Level 6

Hi all,

 

I need help formatting my data for Proc Traj. I have steroid data during a 1 year period, and I have dose information as well as the amount of steroid supplied (days_supp). I am trying to get my dataset clean so that every month is recorded properly, however my data is fairly messy. This is a small example of what I am working with and what I would like it to be:

data have;
input id month dose days_supp;
datalines;
1 1 1 90
1 2 5 90
1 3 . .
1 4 . .
1 5 4 30
1 6 6 90
1 7 . .
1 8 . .
1 9 . .
1 10 2 30
1 11 3 30
1 12 . .
2 1 . .
2 2 5 60
2 3 2 30
2 4 . .
2 5 . .
2 6 . .
2 7 5 30
2 8 . .
2 9 5 30
2 10 5 30
2 11 5 30
2 12 5 30
;
run;

data want;
input id month dose days_supp;
datalines;
1 1 1 30
1 2 1 30
1 3 1 30
1 4 5 30
1 5 5 30
1 6 5 30
1 7 4 30
1 8 6 30
1 9 6 30
1 10 6 30
1 11 2 30
1 12 3 30
2 1 . .
2 2 5 30
2 3 5 30
2 4 2 30
2 5 . .
2 6 . .
2 7 5 30
2 8 . .
2 9 5 30
2 10 5 30
2 11 5 30
2 12 5 30
;
run;

 

Basically, if someone got a 90 day supply on month 1, that dose should be present for month 1, 2, and 3. 

 

 

This is what I have tried so far:

 

data month_lag;
	set un_t_month;
	dose_lag=lag(avg_dose);
	supp_lag=lag(days_supp);
	dose_lag2=lag(dose_lag);
	supp_lag2=lag(supp_lag);
	dose_lag3=lag(dose_lag2);
	supp_lag3=lag(supp_lag2);
	if 35<supp_lag<61 and _name_ ^= "dose12" and avg_dose=. then do;
		avg_dose=dose_lag;
		days_supp=supp_lag-30;
		avg_dose=(avg_dose*days_supp)/30;
	end;
	else if 61<supp_lag<91 and _name_ not in ("dose12","dose11") and avg_dose=. then do;
		avg_dose=dose_lag;
		days_supp=supp_lag-60;
		avg_dose=(avg_dose*days_supp)/30;
	end;
	else if 61<supp_lag2<91 and _name_ not in ("dose12", "dose11") and avg_dose=. then do;
		avg_dose=dose_lag2;
		days_supp=supp_lag2-60;
		avg_dose=(avg_dose*days_supp)/30;
	end;
run;

But this doesn't work when I have multiple large days supplies next to each other (i.e. 90 on month 1 and 90 on month 2). Also it's very inefficient code I'm sure.

 

Any help would be appreciated. I also have the data transposed in case it's better to approach it that way.

 

 

1 ACCEPTED SOLUTION

Accepted Solutions
PhilC
Rhodochrosite | Level 12

SAS programmers call this a 'Do "Whitlock" loop'.

 

data want (Keep=id month dose days_supp);
  if 0 then set have;
  month=1;
  do until (last.id);
    set have (rename=(days_supp=old_days_supp month=old_month));
      by id;
    if old_days_supp =. 
      then old_days_supp =0;
    do i=old_days_supp to 30 by -30;
      days_supp=30;
      output;
      month=month+1;
    end; 
    if old_month>=month then do;
      month=old_month;
      days_supp=.;
      output;
      month+1;
    end;
  end;
run;

 

View solution in original post

7 REPLIES 7
ballardw
Super User

From what I see with the "pattern" of missing data it looks to me as if something read a file incorrectly and moved lines up for your "dose" and "days" information.

Can you share some lines of the source data with any personal identification  information removed and share how your read that source?

alpine_nights
Fluorite | Level 6

This is very messy medicaid data. The individuals have several steroid use observations at different dates, and I used this code to create months based on a specific one year time interval:

data month_data;
	set steroid_cohort;
	if steroid_dt^=.;
	if 0<=steroid_dt-index_dt<=30 then month=1;
	else if 30<steroid_dt-index_dt<=60 then month=2;
	else if 60<steroid_dt-index_dt<=90 then month=3;
	else if 90<steroid_dt-index_dt<=120 then month=4;
	else if 120<steroid_dt-index_dt<=150 then month=5;
	else if 150<steroid_dt-index_dt<=180 then month=6;
	else if 180<steroid_dt-index_dt<=210 then month=7;
	else if 210<steroid_dt-index_dt<=240 then month=8;
	else if 240<steroid_dt-index_dt<=270 then month=9;
	else if 270<steroid_dt-index_dt<=300 then month=10;
	else if 300<steroid_dt-index_dt<=330 then month=11;
	else if 330<steroid_dt-index_dt<=365 then month=12;
	datediff=steroid_dt-index_dt;
run; 

With steroid_dt being the date of the visit where steroids were prescribed, and index date being the diagnosis date of the disease I am looking at.

 

Here is an example of the data I started with for the first individual. Month is the variable I created

alpine_nights_0-1608238662043.png

 

ballardw
Super User

Glad you get a working solution.

 

Warning: Leap days can potentially create a value with no "month" at all.

 

Since you are calculating Datediff, why not do it before all that "if then else" and use it instead of repeatedly typing the same subtraction?

alpine_nights
Fluorite | Level 6

Ah yes, datediff was something I added after the fact for checking purposes, that's a great idea thank you!

PhilC
Rhodochrosite | Level 12

SAS programmers call this a 'Do "Whitlock" loop'.

 

data want (Keep=id month dose days_supp);
  if 0 then set have;
  month=1;
  do until (last.id);
    set have (rename=(days_supp=old_days_supp month=old_month));
      by id;
    if old_days_supp =. 
      then old_days_supp =0;
    do i=old_days_supp to 30 by -30;
      days_supp=30;
      output;
      month=month+1;
    end; 
    if old_month>=month then do;
      month=old_month;
      days_supp=.;
      output;
      month+1;
    end;
  end;
run;

 

PhilC
Rhodochrosite | Level 12

found a bug, and edited my code above.

alpine_nights
Fluorite | Level 6

This is perfect, thank you very much!

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 7 replies
  • 970 views
  • 2 likes
  • 3 in conversation