BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
mariko5797
Pyrite | Level 9

I am relatively new to using DO loops, so I appreciate any feedback you can provide.

 

I essentially want to store the first day within a period and the last day within a period in their own separate variables. So each period has their own variable for the start and end date/time. From a different post, it seemed the most reliable way to do this was to subset for each period and then merge back together.

 

Data:

mariko5797_0-1633726862740.png

mariko5797_1-1633726875240.png

My attempt:

/*convert characters to dates*/
data expose_;
 set expose_;
 stdtc = input(substr(agstdtc,1,16), e8601dt19.);
 endtc = input(substr(agendtc,1,16), e8601dt19.);
 format stdtc endtc datetime.;
 drop agstdtc agendtc;
run;

proc sort data = expose_; by usubjid stdtc endtc; run;

/*create new variables*/ 
data expose;
 set expose_;
 label 	mk01sdt = 'Date of First Exposure to Mock' mk01stm = 'Time of First Exposure to Mock' mk01sdtm = 'Datetime of First Exposure to Mock' 
 		mk01edt = 'Date of Last Exposure to Mock' mk01etm = 'Time of Last Exposure to Mock' mk01edtm = 'Datetime of Last Exposure to Mock'
		trtsdt = 'Date of First Exsposure to Treatment' trtstm = 'Time of First Exposure to Treatment' trtsdtm = 'Datetime of First Exposure to Treatment'
		trtedt = 'Date of Last Exposure to Treatment' trtetm = 'Time of Last Exposure to Treatment' trtedtm = 'Datetime of Last Exposure to Treatment'
		tr01sdt = 'Date of First Exposure in Period 1' tr01stm = 'Time of First Exposure in Period 1' tr01sdtm = 'Datetime of First Exposure in Period 1'
		tr01edt = 'Date of Last Exposure in Period 1' tr01etm = 'Time of Last Exposure in Period 1' tr01edtm = 'Datetime of Last Exposure in Period 1'
		tr02sdt = 'Date of First Exposure in Period 2' tr02stm = 'Time of First Exposure in Period 1' tr02sdtm = 'Datetime of First Exposure in Period 1'
		tr02edt = 'Date of Last Exposure in Period 2' tr02etm = 'Time of Last Exposure in Period 1' tr02edtm = 'Datetime of Last Exposure in Period 1'
		tr03sdt = 'Date of First Exposure in Period 3' tr03stm = 'Time of First Exposure in Period 1' tr03sdtm = 'Datetime of First Exposure in Period 1'
		tr03edt = 'Date of Last Exposure in Period 3' tr03etm = 'Time of Last Exposure in Period 1' tr03edtm = 'Datetime of Last Exposure in Period 1'
		tr04sdt = 'Date of First Exposure in Period 4' tr04stm = 'Time of First Exposure in Period 1' tr04sdtm = 'Datetime of First Exposure in Period 1'
		tr04edt = 'Date of Last Exposure in Period 4' tr04etm = 'Time of Last Exposure in Period 1' tr04edtm = 'Datetime of Last Exposure in Period 1';

 /*overall first and last treatment*/
 by usubjid;
 if first.usubjid then do; 
	trtsdt = datepart(stdtc); trtstm = timepart(stdtc); trtsdtm = stdtc;
 end;
 else if last.usubjid then do;
	trtedt = datepart(endtc); trtetm = timepart(endtc); trtedtm = endtc;
 end;
run; 

/*subset treatment periods*/
%let periods=MOCK*1*2*3*4;
%macro loopdatetime_();
	%do i = 1 %to 5;
		%let No = %scan(&periods, &i);
		data period&No;
		 set expose_;
		 by usubjid;
		 %if &No = MOCK %then %do;
			where findw(visit, 'MOCK') > 0;
			if first.usubjid then do; mk01sdt = datepart(stdtc); mk01stm = timepart(stdtc); mk01sdtm = stdtc; end;
				else if last.usubjid then do; mk01edt = datepart(endtc); mk01etm = timepart(endtc); mk01edtm = endtc; end;
		 %end;
		 %else %do;
			where findw(visit, "PER&No") > 0;  
		 	if first.usubjid then do; tr0&No.sdt = datepart(stdtc); tr0&No.stm = timepart(stdtc); tr0&No.sdtm = stdtc; end;
				else if last.usubjid then do; tr0&No.edt = datepart(endtc); tr0&No.etm = timepart(endtc); tr0&No.edtm = endtc; end;
		 %end;
		run;

		data expose;
		 merge expose period&No;
		run;
	%end;
%mend loopdatetime_;
%loopdatetime_;

proc sort data = expose; by usubjid visit; run;

It doesn't seem to be filtering properly. For example, even though I specify

		 %if &No = MOCK %then %do;
			where findw(visit, 'MOCK') > 0;
			if first.usubjid then do; mk01sdt = datepart(stdtc); mk01stm = timepart(stdtc); mk01sdtm = stdtc; end;
				else if last.usubjid then do; mk01edt = datepart(endtc); mk01etm = timepart(endtc); mk01edtm = endtc; end;
		 %end;

It seems to keep VISIT that contains PER4 DAY #. This occurs for all start dates defined in the macro. Furthermore, none of the end dates are recorded. Is this due to a misuse of the MERGE function? 

 

1 ACCEPTED SOLUTION

Accepted Solutions
ballardw
Super User

Here's my approach to this. "First" of date, time or datetime is a minimum, "Last" is a maximum within any sort of grouping.

So :

data expose_recode;
    set expose_;
    stdtc = input(substr(agstdtc,1,16), e8601dt19.);
    endtc = input(substr(agendtc,1,16), e8601dt19.);
    format stdtc endtc datetime.;
    drop agstdtc agendtc;
    if visit='MOCK' then period=0;
    else Period=input(substr(visit,4,1),f1.);
run;

proc summary data=expose_recode nway;
   class usubjid period;
   var stdtc endtc;
   output out=expos_summary min(stdtc)=trsdt max(endtc)=tredt;
run;

/* if your really need different variables, questionable for date and time*/
data intermediate;
   set expos_summary;
   trsd  = datepart(trsdt);
   trstm = timepart(trsdt);
   tred  = datepart(tredt);
   tretm = timepart(tredt);
run;

For about 90+ percent of uses making that "wide" format with all the different variables is harder to work with and placing a sequence in the middle such as TR01DT makes things more verbose. If you have sequences then place the number at the end. Then at least you can reference the variables with a sequence list like TRDT01 - TRDT05 (or what have you).

 

If you want to see "Mock" instead of 0 then create custom format for the Periods.

 

Warning: Habitual use of code like this will cause problems.

data expose_;
 set expose_;

That completely replaces the data set Expose_. So you can't run the code a second time. Why, the DROP will have errors about variable not found. If you make a minor logic error you can accidentally change values you don't intend to and you have to go back to your "read the data" step to recover the values. Even worse(more subtle) is when you recode a variable back into the same variable. If you have to rerun the data step you can keep recoding the same variable and may not notice that it is a problem until very much later. Like when the boss looks at a result and says something like "This is very unlikely (or just plain impossible) because these values are supposed to be in range x to y and yours are in p to q."

 

Hint: Most "subset and loop over" data set questions can be resolved by suitable use of BY groups somewhere.

 

View solution in original post

2 REPLIES 2
data_null__
Jade | Level 19

Please supply example data as TEXT.

 

Did you consider creating a new variable PERIOD from VISIT?  I think with PERIOD it would simple to find min(start) max(end) by USUBJID PERIOD;  Then transpose, BY USUBJID; ID PERIOD,for the TRppDTC variables.

 

ballardw
Super User

Here's my approach to this. "First" of date, time or datetime is a minimum, "Last" is a maximum within any sort of grouping.

So :

data expose_recode;
    set expose_;
    stdtc = input(substr(agstdtc,1,16), e8601dt19.);
    endtc = input(substr(agendtc,1,16), e8601dt19.);
    format stdtc endtc datetime.;
    drop agstdtc agendtc;
    if visit='MOCK' then period=0;
    else Period=input(substr(visit,4,1),f1.);
run;

proc summary data=expose_recode nway;
   class usubjid period;
   var stdtc endtc;
   output out=expos_summary min(stdtc)=trsdt max(endtc)=tredt;
run;

/* if your really need different variables, questionable for date and time*/
data intermediate;
   set expos_summary;
   trsd  = datepart(trsdt);
   trstm = timepart(trsdt);
   tred  = datepart(tredt);
   tretm = timepart(tredt);
run;

For about 90+ percent of uses making that "wide" format with all the different variables is harder to work with and placing a sequence in the middle such as TR01DT makes things more verbose. If you have sequences then place the number at the end. Then at least you can reference the variables with a sequence list like TRDT01 - TRDT05 (or what have you).

 

If you want to see "Mock" instead of 0 then create custom format for the Periods.

 

Warning: Habitual use of code like this will cause problems.

data expose_;
 set expose_;

That completely replaces the data set Expose_. So you can't run the code a second time. Why, the DROP will have errors about variable not found. If you make a minor logic error you can accidentally change values you don't intend to and you have to go back to your "read the data" step to recover the values. Even worse(more subtle) is when you recode a variable back into the same variable. If you have to rerun the data step you can keep recoding the same variable and may not notice that it is a problem until very much later. Like when the boss looks at a result and says something like "This is very unlikely (or just plain impossible) because these values are supposed to be in range x to y and yours are in p to q."

 

Hint: Most "subset and loop over" data set questions can be resolved by suitable use of BY groups somewhere.

 

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 2 replies
  • 1117 views
  • 0 likes
  • 3 in conversation