Re: Alternative code with easy understanding

raja777pharma · Posted 11-30-2021 01:26 PM

Hello,

I am unable to understand how the code below works.

Can anyone provide alternative code that is easier to understand but uses the same logic?

data ss3_epoch_6;
   set epoch2;
   by seq_id usubjid strvard sestd seend;
   retain outfl;
   if first.seq_id then outfl = 0;

   if strvard >= sestd and seend > strvard and sestd ne . and strvard ne . and outfl = 0 then do;
      output;
      outfl = 1;
   end;
   * If strtvar does not fit any epoch, set it to missing and output;
   if last.seq_id and outfl = 0 then do;
      * Fix for ongoing study where the end date for the last epoch is still missing.;
      if seend = . and strvard ne . then output;
      else do;
         EPOCH = "";
         if strvard ne . then PUTLOG "(epoch) STAR_CHECK: Provided date outside all epochs in sdtm.SE for subject: "
            USUBJID= N=;
         output;
      end;
   end;
run;

Thank you,

Rajasekhar.

Kurt_Bremser · Posted 11-30-2021 01:47 PM

For every seq_id, the first observation that meets this condition

strvard >= sestd and seend > strvard and sestd ne . and strvard ne .

is output; if no such observation is found, the last observation is output, with EPOCH set to a missing value if another condition is met.
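
To see the three cases described above side by side, here is a minimal sketch on made-up data. The data set contents, the subject ids S1 to S3 and the shortened PUTLOG text are assumptions for illustration only; the step itself follows the structure of the posted code, with the range test written as a chained comparison.

```sas
/* Made-up rows: each date (strvard) is paired with every epoch window */
/* of its subject; an empty last field means the epoch is still open.  */
data epoch2;
   infile datalines truncover;
   input seq_id usubjid $ epoch :$20. strvard :yymmdd10.
         sestd :yymmdd10. seend :yymmdd10.;
   format strvard sestd seend yymmdd10.;
datalines;
1 S1 SCREENING 2020-02-04 2019-02-07 2019-02-12
1 S1 TREATMENT 2020-02-04 2019-02-12 2021-09-15
2 S2 SCREENING 2021-05-20 2019-04-01 2019-04-08
2 S2 TREATMENT 2021-05-20 2019-04-08
3 S3 SCREENING 2018-01-01 2019-09-12 2019-09-19
3 S3 TREATMENT 2018-01-01 2019-09-19 2020-02-20
;

proc sort data=epoch2;
   by seq_id usubjid strvard sestd seend;
run;

data ss3_epoch_6;
   set epoch2;
   by seq_id usubjid strvard sestd seend;
   retain outfl;
   if first.seq_id then outfl = 0;      /* nothing written yet for this seq_id */

   /* Case 1: first epoch whose window contains the date */
   if outfl = 0 and sestd ne . and strvard ne .
      and sestd <= strvard < seend then do;
      output;
      outfl = 1;
   end;

   /* Last row of the group reached without a match */
   if last.seq_id and outfl = 0 then do;
      /* Case 2: last epoch has no end date yet, keep its EPOCH */
      if seend = . and strvard ne . then output;
      /* Case 3: date fits no epoch, blank out EPOCH and report it */
      else do;
         epoch = "";
         if strvard ne . then putlog "Date outside all epochs: " usubjid= seq_id=;
         output;
      end;
   end;
run;
```

With these rows, seq_id 1 should take the first branch, seq_id 2 the open-ended branch, and seq_id 3 the branch that blanks EPOCH and writes to the log, so each seq_id contributes exactly one output row.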

Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
The macro for direct download as ZIP
How to post code
Please vote for Provide Sequential Search Capability for Hash Objects
How to deal with locked files on UNIX

ChrisNZ · Posted 11-30-2021 04:23 PM

I don't see how the code could be simplified.

Better formatting (indentation) might help you to understand the logic.

I don't know the data, so can't comment further, but the coder did a good job.

It's your opportunity for growth it seems 🙂

High-Performance SAS Coding - Third Edition

raja777pharma · Posted 12-01-2021 06:07 AM

Hi, below is example data. Sorry, I am not able to write the data out here.

The code below outputs two observations.

data out;
   set input;
   by seq_id usubjid strvard sestd seend;
   retain outfl;
   if first.seq_id then outfl = 0;

   if strvard >= sestd and seend > strvard and sestd ne . and strvard ne . and outfl = 0 then do;
      output;
      outfl = 1;
   end;
run;

I wrote the same logic, but the code below outputs no observations. Please help me understand what I am missing, and also help me write it differently from the code above.

data output;
   set input;
   by seq_id usubjid strvard sestd seend;
   if first.seq_id then do;
      if sestd ^= . and strvard ^= . then do;
         if sestd <= strvard and strvard < seend then do;
            flag = 1;
            output;
         end;
      end;
   end;
run;

Thank you,

Rajasekhar.

Kurt_Bremser · Posted 12-01-2021 06:51 AM

Post usable data, like this:

data have;
input seq_id epoch :$20. sestdtc :yymmdd10.;
format sestdtc yymmdd10.;
datalines;
199 SCREENING 2019-09-24
;

Add additional variables and observations as needed to illustrate your issue.

raja777pharma · Posted 12-02-2021 02:05 AM

Hi,

Thank you very much for the help.

I have tried to put together some sample data here and explained the rules and the expected output.

Please help me with code that is written differently from the previous code.

data ce;
input usubjid :$40 ceterm :$200 cestdtc :yymmdd10.;
format cestdtc yymmdd10.;
datalines;
D169CC00001/E0201004    CV DEATH    14-09-21
D169CC00001/E0201004    HEART FAILURE HOSPITALIZATION/URGENT HEART FAILURE VISIT    04-02-20
D169CC00001/E0201004    HEART FAILURE HOSPITALIZATION/URGENT HEART FAILURE VISIT    03-09-21
D169CC00001/E0201004    HEART FAILURE HOSPITALIZATION/URGENT HEART FAILURE VISIT    04-02-20
D169CC00001/E0201007    HEART FAILURE HOSPITALIZATION/URGENT HEART FAILURE VISIT    20-05-21
D169CC00001/E0201025    HEART FAILURE HOSPITALIZATION/URGENT HEART FAILURE VISIT    04-06-20
D169CC00001/E0201025    HEART FAILURE HOSPITALIZATION/URGENT HEART FAILURE VISIT    04-06-20
;
run;

data se;
input usubjid :$40 EPOCH :$200 SESTDTC :yymmdd10. SEENDTC :yymmdd10.;
format SESTDTC SEENDTC yymmdd10.;
datalines;
USUBJID	EPOCH	SESTDTC	SEENDTC
D169CC00001/E0201004    SCREENING	        07-02-19	12-02-19
D169CC00001/E0201004	BLINDED TREATMENT	12-02-19	15-09-21
D169CC00001/E0201007	SCREENING	        01-04-19	08-04-19
D169CC00001/E0201007	BLINDED TREATMENT	08-04-19	 
D169CC00001/E0205038	SCREENING	        12-09-19	19-09-19
D169CC00001/E0205038	BLINDED TREATMENT	19-09-19	20-02-20
D169CC00001/E0205038	FOLLOW-UP	        20-02-20	 
D169CC00001/E0201025	SCREENING	        26-09-19	03-10-19
D169CC00001/E0201025	BLINDED TREATMENT	03-10-19	08-01-20
D169CC00001/E0201025	FOLLOW-UP	        08-01-20
;
run;

/* The two data sets above are left joined by the code below. */
/* Before the left join, one unique id is created in the CE data */
/* so that the final output has the same records as the original CE data. */

data ce;
  set ce;
  seq_id=_n_;
run;

proc sql;
 create table ce_and_se as
 select a.*,b.epoch,b.sestdtc,b.seendtc
 from ce as a left join se as before on a.usubjid=b.usubjid;
quit;

The output data set should be as follows:
1. The output data set has the same observations as the original CE data.
2. EPOCH values are assigned based on the date, when SESTDTC <= cestdtc < SEENDTC.
3. If cestdtc is missing, then EPOCH is missing.
4. If no epoch satisfies SESTDTC <= cestdtc < SEENDTC, print those values to the log (only when cestdtc, SESTDTC and SEENDTC are not missing).

/*Final output like below:*/
/*data have */
USUBJID	ceterm	cestdtc	EPOCH
D169CC00001/E0201004	CV DEATH	14-09-21	
D169CC00001/E0201004	HEART FAILURE HOSPITALIZATION/URGENT HEART FAILURE VISIT	04-02-20	BLINDED TREATMENT
D169CC00001/E0201004	HEART FAILURE HOSPITALIZATION/URGENT HEART FAILURE VISIT	03-09-21	BLINDED TREATMENT
D169CC00001/E0201004	HEART FAILURE HOSPITALIZATION/URGENT HEART FAILURE VISIT	04-02-20	BLINDED TREATMENT
D169CC00001/E0201007	HEART FAILURE HOSPITALIZATION/URGENT HEART FAILURE VISIT	20-05-21	BLINDED TREATMENT
D169CC00001/E0201025	HEART FAILURE HOSPITALIZATION/URGENT HEART FAILURE VISIT	04-06-20	FOLLOW-UP
D169CC00001/E0201025	HEART FAILURE HOSPITALIZATION/URGENT HEART FAILURE VISIT	04-06-20	FOLLOW-UP
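
One possible way to express the four rules above differently is to put the date range into the join condition itself, so that each CE record picks up at most one epoch. The sketch below is only an illustration: the tiny data sets, the subject ids S1 to S4, the table name ce_epoch and the log text are made up, and treating a missing SEENDTC as an open-ended epoch is an assumption taken from the FOLLOW-UP rows in the expected output.

```sas
/* Made-up events; seq_id keeps the original CE order */
data ce;
   infile datalines truncover;
   input seq_id usubjid $ cestdtc :yymmdd10.;
   format cestdtc yymmdd10.;
datalines;
1 S1 2020-02-04
2 S2 2021-05-20
3 S3 2018-01-01
4 S4
;

/* Made-up epochs; an empty last field is an epoch without an end date */
data se;
   infile datalines truncover;
   input usubjid $ epoch :$20. sestdtc :yymmdd10. seendtc :yymmdd10.;
   format sestdtc seendtc yymmdd10.;
datalines;
S1 SCREENING 2019-02-07 2019-02-12
S1 TREATMENT 2019-02-12 2021-09-15
S2 SCREENING 2019-04-01 2019-04-08
S2 TREATMENT 2019-04-08
S3 SCREENING 2019-09-12 2019-09-19
;

/* Left join keeps every CE record; EPOCH is filled only when the */
/* event date lies inside the epoch window.                       */
proc sql;
   create table ce_epoch as
   select a.*, b.epoch
   from ce as a
   left join se as b
     on  a.usubjid = b.usubjid
     and a.cestdtc is not missing
     and b.sestdtc is not missing
     and b.sestdtc <= a.cestdtc
     and (a.cestdtc < b.seendtc or b.seendtc is missing)
   order by a.seq_id;
quit;

/* Rule 4: report dates that did not fall into any epoch */
data _null_;
   set ce_epoch;
   if epoch = "" and cestdtc ne . then
      putlog "Date outside all epochs: " usubjid= cestdtc=;
run;
```

This relies on the epoch windows of a subject not overlapping; if they do, a CE record can match more than one epoch and the output will have more rows than CE.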

Thank you.

Rajasekhar.

Kurt_Bremser · Posted 12-02-2021 04:04 AM

Please review your codes.

Make sure that your DATA steps work without ERRORs or invalid data messages in the log.

Are you sure you want to use a YYMMDD informat and not DDMMYY?

Why do you expect those FOLLOW-UP observations to match when their SEENDTC is missing?

Also, ALWAYS (as in ALWAYS) use 4-digit years. After the Y2K scare this should be an unquestioned given in computing.

raja777pharma · Posted 12-02-2021 05:20 AM

Hi,

The YYMMDD10. format is fine for the date variables.

For the follow-up record, the condition below is met, so the value is assigned. If SEENDTC has a value, then cestdtc < SEENDTC is also checked.

SESTDTC <= cestdtc

Thank you,

Raja

Kurt_Bremser · Posted 12-02-2021 05:29 AM

This clarifies exactly NOTHING.

Please re-post your codes, tested, with 4-digit years. Make sure to use proper delimiters or the & informat modifier so that the strings with blanks are read correctly. I will not answer further until you do that.

raja777pharma · Posted 12-02-2021 07:18 AM

Hello,

Thank you for the help.

I have executed the code below, and the data sets were generated without errors.

Also, the year is displayed with 4 digits in the data set.

data ce;
input usubjid $1-21 ceterm $22-78 cestdtc yymmdd10.;
format cestdtc yymmdd10.;
datalines;
D169CC00001/E0201004 CV DEATH                                                 14-09-21
D169CC00001/E0201004 HEART FAILURE HOSPITALIZATION/URGENT HEART FAILURE VISIT 04-02-20
D169CC00001/E0201004 HEART FAILURE HOSPITALIZATION/URGENT HEART FAILURE VISIT 03-09-21
D169CC00001/E0201004 HEART FAILURE HOSPITALIZATION/URGENT HEART FAILURE VISIT 04-02-20
D169CC00001/E0201007 HEART FAILURE HOSPITALIZATION/URGENT HEART FAILURE VISIT 20-05-21
D169CC00001/E0201025 HEART FAILURE HOSPITALIZATION/URGENT HEART FAILURE VISIT 04-06-20
D169CC00001/E0201025 HEART FAILURE HOSPITALIZATION/URGENT HEART FAILURE VISIT 04-06-20
;
run;

data se;
input usubjid $1-21 EPOCH $22-39 SESTDTC yyyymmdd10. SEENDTC yyyymmdd10.;
format SESTDTC SEENDTC yyyymmdd10.;
datalines;
D169CC00001/E0201004 SCREENING         07-02-19 12-02-19
D169CC00001/E0201004 BLINDED TREATMENT 12-02-19 15-09-21
D169CC00001/E0201007 SCREENING         01-04-19 08-04-19
D169CC00001/E0201007 BLINDED TREATMENT 08-04-19  
D169CC00001/E0205038 SCREENING         12-09-19 19-09-19
D169CC00001/E0205038 BLINDED TREATMENT 19-09-19 20-02-20
D169CC00001/E0205038 FOLLOW-UP         20-02-20  
D169CC00001/E0201025 SCREENING         26-09-19 03-10-19
D169CC00001/E0201025 BLINDED TREATMENT 03-10-19 08-01-20
D169CC00001/E0201025 FOLLOW-UP         08-01-20
;
run;

data ce;
  set ce;
  seq_id=_n_;
run;

proc sql;
 create table ce_and_se as
 select a.*,b.epoch,b.sestdtc,b.seendtc
 from ce as a left join se as b on a.usubjid=b.usubjid;
quit;

Thank you.

Raja

Kurt_Bremser · Posted 12-02-2021 09:36 AM

Where do you see a 4 digit year here:

14-09-21

Is this 2014-09-21, or 2021-09-14? Since you use the YYMMDD informat, it will be read as the former.
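
A quick way to check this is to read the same string with both informats and print the results with a 4-digit year. This is a minimal sketch; the variable names are made up, and the century assigned to a 2-digit year depends on the YEARCUTOFF system option of the session.

```sas
data _null_;
   raw = '14-09-21';
   as_ymd = input(raw, yymmdd10.);   /* first field is taken as the year */
   as_dmy = input(raw, ddmmyy10.);   /* first field is taken as the day  */
   /* Print both with 4-digit years so the difference is visible */
   put as_ymd= yymmdd10. / as_dmy= yymmdd10.;
run;
```

With a YEARCUTOFF that maps 14 and 21 to the 2000s, I would expect the log to show 2014-09-21 for the first variable and 2021-09-14 for the second.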

raja777pharma · Posted 12-02-2021 12:00 PM

Hi,

Thank you very much for helping me learn more about the SAS data step.

Yes, we need to use ddmmyy10.

But when I used ddmmyy10. for SEENDTC, I got missing values; I am not sure what the issue is.

Below is the updated code.

data ce;
input usubjid $1-21 ceterm $22-78 cestdtc ddmmyy10.;
format cestdtc ddmmyy10.;
datalines;
D169CC00001/E0201004 CV DEATH                                                 14-09-2021
D169CC00001/E0201004 HEART FAILURE HOSPITALIZATION/URGENT HEART FAILURE VISIT 04-02-2020
D169CC00001/E0201004 HEART FAILURE HOSPITALIZATION/URGENT HEART FAILURE VISIT 03-09-2021
D169CC00001/E0201004 HEART FAILURE HOSPITALIZATION/URGENT HEART FAILURE VISIT 04-02-2020
D169CC00001/E0201007 HEART FAILURE HOSPITALIZATION/URGENT HEART FAILURE VISIT 20-05-2021
D169CC00001/E0201025 HEART FAILURE HOSPITALIZATION/URGENT HEART FAILURE VISIT 04-06-2020
D169CC00001/E0201025 HEART FAILURE HOSPITALIZATION/URGENT HEART FAILURE VISIT 04-06-2020
;
run;

data se;
input usubjid $1-21 EPOCH $22-39 SESTDTC ddmmyy10. SEENDTC ddmmyy10.;
format SESTDTC SEENDTC ddmmyy10.;
datalines;
D169CC00001/E0201004 SCREENING         07-02-2019 12-02-2019
D169CC00001/E0201004 BLINDED TREATMENT 12-02-2019 15-09-2019
D169CC00001/E0201007 SCREENING         01-04-2019 08-04-2019
D169CC00001/E0201007 BLINDED TREATMENT 08-04-2019
D169CC00001/E0205038 SCREENING         12-09-2019 19-09-2019
D169CC00001/E0205038 BLINDED TREATMENT 19-09-2019 20-02-2019
D169CC00001/E0205038 FOLLOW-UP         20-02-2019
D169CC00001/E0201025 SCREENING         26-09-2019 03-10-2019
D169CC00001/E0201025 BLINDED TREATMENT 03-10-2019 08-01-2019
D169CC00001/E0201025 FOLLOW-UP         08-01-2019
;
run;

Thank you,

Raja

Kurt_Bremser · Posted 12-02-2021 01:48 PM

Use the colon (:) modifier with the date informats, or INPUT will not honor the delimiter.
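
As far as I can tell, the reason is that formatted input without the colon reads exactly 10 columns from wherever the pointer stands. After SESTDTC is read from columns 40-49, the pointer sits on the blank in column 50, so SEENDTC gets that blank plus only the first nine characters of the date. A minimal sketch with the colon modifier on both dates follows; the TRUNCOVER option is my addition (an assumption, not part of the advice above) so that rows without an end date give a missing value instead of making INPUT continue on the next line.

```sas
data se;
   infile datalines truncover;   /* short rows: remaining variables stay missing */
   input usubjid $1-21 epoch $22-39
         sestdtc :ddmmyy10.      /* colon: skip blanks, read up to the next blank */
         seendtc :ddmmyy10.;
   format sestdtc seendtc yymmdd10.;
datalines;
D169CC00001/E0201007 SCREENING         01-04-2019 08-04-2019
D169CC00001/E0201007 BLINDED TREATMENT 08-04-2019
;

proc print data=se;
run;
```

The second row should come out with SESTDTC filled and SEENDTC missing.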

raja777pharma · Posted 12-02-2021 10:18 PM

Hi,

Thank you, now SEENDTC has values.

Please find the code below.

data ce;
input usubjid $1-21 ceterm $22-78 cestdtc ddmmyy10.;
format cestdtc ddmmyy10.;
datalines;
D169CC00001/E0201004 CV DEATH                                                 14-09-2021
D169CC00001/E0201004 HEART FAILURE HOSPITALIZATION/URGENT HEART FAILURE VISIT 04-02-2020
D169CC00001/E0201004 HEART FAILURE HOSPITALIZATION/URGENT HEART FAILURE VISIT 03-09-2021
D169CC00001/E0201004 HEART FAILURE HOSPITALIZATION/URGENT HEART FAILURE VISIT 04-02-2020
D169CC00001/E0201007 HEART FAILURE HOSPITALIZATION/URGENT HEART FAILURE VISIT 20-05-2021
D169CC00001/E0201025 HEART FAILURE HOSPITALIZATION/URGENT HEART FAILURE VISIT 04-06-2020
D169CC00001/E0201025 HEART FAILURE HOSPITALIZATION/URGENT HEART FAILURE VISIT 04-06-2020
;
run;

data se;
input usubjid $1-21 EPOCH $22-39 SESTDTC ddmmyy10. SEENDTC :ddmmyy10.;
format SESTDTC SEENDTC ddmmyy10.;
datalines;
D169CC00001/E0201004 SCREENING         07-02-2019 12-02-2019
D169CC00001/E0201004 BLINDED TREATMENT 12-02-2019 15-09-2019
D169CC00001/E0201007 SCREENING         01-04-2019 08-04-2019
D169CC00001/E0201007 BLINDED TREATMENT 08-04-2019
D169CC00001/E0205038 SCREENING         12-09-2019 19-09-2019
D169CC00001/E0205038 BLINDED TREATMENT 19-09-2019 20-02-2019
D169CC00001/E0205038 FOLLOW-UP         20-02-2019
D169CC00001/E0201025 SCREENING         26-09-2019 03-10-2019
D169CC00001/E0201025 BLINDED TREATMENT 03-10-2019 08-01-2019
D169CC00001/E0201025 FOLLOW-UP         08-01-2019
;
run;

data ce;
  set ce;
  seq_id=_n_;
run;

proc sql;
 create table ce_and_se as
 select a.*,b.epoch,b.sestdtc,b.seendtc
 from ce as a left join se as b on a.usubjid=b.usubjid;
quit;

Thank you,

Raja.

ChrisNZ · Posted 12-01-2021 11:11 PM

data output;
   set input;
   by seq_id usubjid strvard sestd seend;
   if first.seq_id then do;
      if sestd ^= . and strvard ^= . then do;
         if sestd <= strvard and strvard < seend then do;
            flag = 1;
            output;
         end;
      end;
   end;
run;

Since you force the output statement, your code will only output records that match:

first.seq_id and sestd ^= . and strvard ^= . and sestd <= strvard and strvard < seend

Note that

sestd <= strvard and strvard < seend

can be written

sestd <= strvard < seend
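
A small check of the two spellings, using made-up dates, also shows what happens when the end date is missing: a missing numeric value compares lower than any date, so the range test is false for an open-ended epoch.

```sas
data _null_;
   sestd   = '12FEB2019'd;
   strvard = '04FEB2020'd;
   seend   = '15SEP2021'd;

   long_form  = (sestd <= strvard and strvard < seend);
   short_form = (sestd <= strvard < seend);
   put long_form= short_form=;       /* both are 1 for these dates */

   seend = .;                        /* epoch without an end date */
   short_form = (sestd <= strvard < seend);
   put short_form=;                  /* 0, because strvard < . is false */
run;
```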
