Hello,
I am unable to understand how the code below works.
Can anyone provide alternative code with the same logic that is easier to understand?
data ss3_epoch_6;
  set epoch2;
  by seq_id usubjid strvard sestd seend;
  retain outfl;
  if first.seq_id then outfl = 0;
  if strvard >= sestd and seend > strvard and sestd ne . and strvard ne . and outfl = 0 then do;
    output;
    outfl = 1;
  end;
  * If strvard does not fit any epoch, set EPOCH to missing and output;
  if last.seq_id and outfl = 0 then do;
    * Fix for ongoing study where the end date for the last epoch is still missing.;
    if seend = . and strvard ne . then output;
    else do;
      EPOCH = "";
      if strvard ne . then PUTLOG "(epoch) STAR_CHECK: Provided date outside all epochs in sdtm.SE for subject: "
        USUBJID= N=;
      output;
    end;
  end;
run;
Thank you,
Rajasekhar.
For every seq_id, the first observation that meets this condition
strvard >= sestd and seend > strvard and sestd ne . and strvard ne .
is output; if no such observation is found, the last observation is output instead: unchanged if seend is missing and strvard is not (an ongoing last epoch), otherwise with EPOCH set to a missing value.
I don't see how the code could be simplified.
Better formatting (indentation) might help you to understand the logic.
I don't know the data, so can't comment further, but the coder did a good job.
It's your opportunity for growth it seems 🙂
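To make the pattern described above easier to see, here is a minimal sketch on made-up data. The data set `toy`, the variables `grp`, `value` and `found`, and the condition `value > 10` are all hypothetical stand-ins, not the poster's data or rule. It shows only the general technique: a retained flag that lets the first matching observation per BY group through, plus a fallback on the last observation when nothing matched.

```sas
/* Hypothetical toy data: two groups, already sorted by grp */
data toy;
  input grp value;
  datalines;
1 5
1 12
1 15
2 3
2 4
;
run;

data first_match;
  set toy;
  by grp;
  retain found;                      /* keep the flag across observations */
  if first.grp then found = 0;       /* reset at the start of each group  */
  if value > 10 and not found then do;
    output;                          /* first matching observation only   */
    found = 1;
  end;
  if last.grp and not found then do; /* nothing matched in this group     */
    value = .;                       /* mark the fallback record          */
    output;
  end;
run;
```

Each group contributes exactly one observation: the first match if there is one, otherwise its last observation with the value blanked out.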
Hi, below is an example; sorry, I am not able to post the data here.
The code below outputs two observations.
data out;
  set input;
  by seq_id usubjid strvard sestd seend;
  retain outfl;
  if first.seq_id then outfl = 0;
  if strvard >= sestd and seend > strvard and sestd ne . and strvard ne . and outfl = 0 then do;
    output;
    outfl = 1;
  end;
run;
output;
I wrote the same logic, but no observations are output with the code below. Please help me understand what I am missing, and also help me write it differently from the code above.
data output;
  set input;
  by seq_id usubjid strvard sestd seend;
  if first.seq_id then do;
    if sestd ^= . and strvard ^= . then do;
      if sestd <= strvard and strvard < seend then do;
        flag=1;
        output;
      end;
    end;
  end;
run;
Thank you,
Rajasekhar.
Post usable data, like this:
data have;
input seq_id epoch :$20. sestdtc :yymmdd10.;
format sestdtc yymmdd10.;
datalines;
199 SCREENING 2019-09-24
;
Add additional variables and observations as needed to illustrate your issue.
Hi,
Thank you very much for the help.
I have tried to put some sample data here and explained the rules and the expected output.
Please help me code this differently from the previous code.
data ce;
input usubjid :$40 ceterm :$200 cestdtc :yymmdd10.;
format cestdtc yymmdd10.;
datalines;
D169CC00001/E0201004 CV DEATH 14-09-21
D169CC00001/E0201004 HEART FAILURE HOSPITALIZATION/URGENT HEART FAILURE VISIT 04-02-20
D169CC00001/E0201004 HEART FAILURE HOSPITALIZATION/URGENT HEART FAILURE VISIT 03-09-21
D169CC00001/E0201004 HEART FAILURE HOSPITALIZATION/URGENT HEART FAILURE VISIT 04-02-20
D169CC00001/E0201007 HEART FAILURE HOSPITALIZATION/URGENT HEART FAILURE VISIT 20-05-21
D169CC00001/E0201025 HEART FAILURE HOSPITALIZATION/URGENT HEART FAILURE VISIT 04-06-20
D169CC00001/E0201025 HEART FAILURE HOSPITALIZATION/URGENT HEART FAILURE VISIT 04-06-20
;
run;
data se;
input usubjid :$40 EPOCH :$200 SESTDTC :yymmdd10. SEENDTC :yymmdd10.;
format SESTDTC SEENDTC yymmdd10.;
datalines;
USUBJID EPOCH SESTDTC SEENDTC
D169CC00001/E0201004 SCREENING 07-02-19 12-02-19
D169CC00001/E0201004 BLINDED TREATMENT 12-02-19 15-09-21
D169CC00001/E0201007 SCREENING 01-04-19 08-04-19
D169CC00001/E0201007 BLINDED TREATMENT 08-04-19
D169CC00001/E0205038 SCREENING 12-09-19 19-09-19
D169CC00001/E0205038 BLINDED TREATMENT 19-09-19 20-02-20
D169CC00001/E0205038 FOLLOW-UP 20-02-20
D169CC00001/E0201025 SCREENING 26-09-19 03-10-19
D169CC00001/E0201025 BLINDED TREATMENT 03-10-19 08-01-20
D169CC00001/E0201025 FOLLOW-UP 08-01-20
;
run;
/* The two data sets above are left joined by the code below. */
/* Before the left join, a unique ID is created in the CE data */
/* so that the final output has the same records as the original CE data. */
data ce;
set ce;
seq_id=_n_;
run;
proc sql;
create table ce_and_se as
select a.*,b.epoch,b.sestdtc,b.seendtc
from ce as a left join se as before on a.usubjid=b.usubjid;
quit;
The output data set should be as follows:
1. The output data set has the same observations as the original CE data.
2. EPOCH is assigned based on the date, when SESTDTC <= cestdtc < SEENDTC.
3. If cestdtc is missing, EPOCH is missing.
4. If cestdtc does not fall into any SESTDTC <= cestdtc < SEENDTC interval, print those values to the log (only when cestdtc, SESTDTC and SEENDTC are not missing).
/*Final output like below:*/
/*data have */
USUBJID ceterm cestdtc EPOCH
D169CC00001/E0201004 CV DEATH 14-09-21
D169CC00001/E0201004 HEART FAILURE HOSPITALIZATION/URGENT HEART FAILURE VISIT 04-02-20 BLINDED TREATMENT
D169CC00001/E0201004 HEART FAILURE HOSPITALIZATION/URGENT HEART FAILURE VISIT 03-09-21 BLINDED TREATMENT
D169CC00001/E0201004 HEART FAILURE HOSPITALIZATION/URGENT HEART FAILURE VISIT 04-02-20 BLINDED TREATMENT
D169CC00001/E0201007 HEART FAILURE HOSPITALIZATION/URGENT HEART FAILURE VISIT 20-05-21 BLINDED TREATMENT
D169CC00001/E0201025 HEART FAILURE HOSPITALIZATION/URGENT HEART FAILURE VISIT 04-06-20 FOLLOW-UP
D169CC00001/E0201025 HEART FAILURE HOSPITALIZATION/URGENT HEART FAILURE VISIT 04-06-20 FOLLOW-UP
Thank you.
Rajasekhar.
Please review your codes.
Make sure that your DATA steps work without ERRORs or invalid data messages in the log.
Are you sure you want to use a YYMMDD informat and not DDMMYY?
Why do you expect those FOLLOW-UP observations to match when their SEENDTC is missing?
Also, ALWAYS (as in ALWAYS) use 4-digit years. After the Y2K scare this should be an unquestioned given in computing.
Hi,
The format of the date variables would be fine as YYMMDD10.
For the follow-up record the condition below is met, so the value is assigned; cestdtc < SEENDTC is checked only if SEENDTC has a value.
SESTDTC <= cestdtc
Thank you,
Raja
This clarifies exactly NOTHING.
Please re-post your codes, tested, with 4-digit years. Make sure to use proper delimiters or the & informat modifier so that the strings with blanks are read correctly. I will not answer further until you do that.
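As an illustration of the & modifier mentioned above, here is a minimal sketch that reads a term containing single blanks. The two data lines are illustrative (retyped with 4-digit years and two blanks between fields), the variable lengths are assumptions, and the TRUNCOVER option is my addition to keep short lines from flowing over.

```sas
data ce_demo;
  infile datalines truncover;
  /* & : read until two consecutive blanks, so single blanks stay in the value */
  /* : : read up to the next delimiter, using the informat for conversion      */
  input usubjid :$40. ceterm & $200. cestdtc :yymmdd10.;
  format cestdtc yymmdd10.;
  datalines;
D169CC00001/E0201004  CV DEATH  2021-09-14
D169CC00001/E0201007  HEART FAILURE HOSPITALIZATION/URGENT HEART FAILURE VISIT  2021-05-20
;
run;
```

The key requirement is that at least two blanks follow the value read with &; otherwise the date would be swallowed into ceterm.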
Hello,
Thank you for the help.
I have executed the code below, and the data sets were generated without errors.
The year is also displayed with 4 digits in the data set.
data ce;
input usubjid $1-21 ceterm $22-78 cestdtc yymmdd10.;
format cestdtc yymmdd10.;
datalines;
D169CC00001/E0201004 CV DEATH 14-09-21
D169CC00001/E0201004 HEART FAILURE HOSPITALIZATION/URGENT HEART FAILURE VISIT 04-02-20
D169CC00001/E0201004 HEART FAILURE HOSPITALIZATION/URGENT HEART FAILURE VISIT 03-09-21
D169CC00001/E0201004 HEART FAILURE HOSPITALIZATION/URGENT HEART FAILURE VISIT 04-02-20
D169CC00001/E0201007 HEART FAILURE HOSPITALIZATION/URGENT HEART FAILURE VISIT 20-05-21
D169CC00001/E0201025 HEART FAILURE HOSPITALIZATION/URGENT HEART FAILURE VISIT 04-06-20
D169CC00001/E0201025 HEART FAILURE HOSPITALIZATION/URGENT HEART FAILURE VISIT 04-06-20
;
run;
data se;
input usubjid $1-21 EPOCH $22-39 SESTDTC yyyymmdd10. SEENDTC yyyymmdd10.;
format SESTDTC SEENDTC yyyymmdd10.;
datalines;
D169CC00001/E0201004 SCREENING 07-02-19 12-02-19
D169CC00001/E0201004 BLINDED TREATMENT 12-02-19 15-09-21
D169CC00001/E0201007 SCREENING 01-04-19 08-04-19
D169CC00001/E0201007 BLINDED TREATMENT 08-04-19
D169CC00001/E0205038 SCREENING 12-09-19 19-09-19
D169CC00001/E0205038 BLINDED TREATMENT 19-09-19 20-02-20
D169CC00001/E0205038 FOLLOW-UP 20-02-20
D169CC00001/E0201025 SCREENING 26-09-19 03-10-19
D169CC00001/E0201025 BLINDED TREATMENT 03-10-19 08-01-20
D169CC00001/E0201025 FOLLOW-UP 08-01-20
;
run;
data ce;
set ce;
seq_id=_n_;
run;
proc sql;
create table ce_and_se as
select a.*,b.epoch,b.sestdtc,b.seendtc
from ce as a left join se as b on a.usubjid=b.usubjid;
quit;
Thank you.
Raja
Where do you see a 4 digit year here:
14-09-21
Is this 2014-09-21, or 2021-09-14? Since you use the YYMMDD informat, it will be read as the former.
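The ambiguity can be checked directly in the log. This is a small sketch that reads the same text with both informats; the variable names are made up, and which century a 2-digit year lands in depends on the YEARCUTOFF system option.

```sas
data _null_;
  text = '14-09-21';
  as_ymd = input(text, yymmdd10.);  /* year 14, month 09, day 21 */
  as_dmy = input(text, ddmmyy10.);  /* day 14, month 09, year 21 */
  /* print both results in the same unambiguous layout */
  put as_ymd= yymmdd10. as_dmy= yymmdd10.;
run;
```

The two values differ by about seven years, which is why 4-digit years and a matching informat are needed before any date comparison can be trusted.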
Hi,
Thank you very much for helping me learn more about the SAS data step.
Yes, we need to use ddmmyy10.
But when I used ddmmyy10. for SEENDTC, I got missing values; I am not sure what the issue is.
Below is the updated code.
data ce;
input usubjid $1-21 ceterm $22-78 cestdtc ddmmyy10.;
format cestdtc ddmmyy10.;
datalines;
D169CC00001/E0201004 CV DEATH 14-09-2021
D169CC00001/E0201004 HEART FAILURE HOSPITALIZATION/URGENT HEART FAILURE VISIT 04-02-2020
D169CC00001/E0201004 HEART FAILURE HOSPITALIZATION/URGENT HEART FAILURE VISIT 03-09-2021
D169CC00001/E0201004 HEART FAILURE HOSPITALIZATION/URGENT HEART FAILURE VISIT 04-02-2020
D169CC00001/E0201007 HEART FAILURE HOSPITALIZATION/URGENT HEART FAILURE VISIT 20-05-2021
D169CC00001/E0201025 HEART FAILURE HOSPITALIZATION/URGENT HEART FAILURE VISIT 04-06-2020
D169CC00001/E0201025 HEART FAILURE HOSPITALIZATION/URGENT HEART FAILURE VISIT 04-06-2020
;
run;
data se;
input usubjid $1-21 EPOCH $22-39 SESTDTC ddmmyy10. SEENDTC ddmmyy10.;
format SESTDTC SEENDTC ddmmyy10.;
datalines;
D169CC00001/E0201004 SCREENING 07-02-2019 12-02-2019
D169CC00001/E0201004 BLINDED TREATMENT 12-02-2019 15-09-2019
D169CC00001/E0201007 SCREENING 01-04-2019 08-04-2019
D169CC00001/E0201007 BLINDED TREATMENT 08-04-2019
D169CC00001/E0205038 SCREENING 12-09-2019 19-09-2019
D169CC00001/E0205038 BLINDED TREATMENT 19-09-2019 20-02-2019
D169CC00001/E0205038 FOLLOW-UP 20-02-2019
D169CC00001/E0201025 SCREENING 26-09-2019 03-10-2019
D169CC00001/E0201025 BLINDED TREATMENT 03-10-2019 08-01-2019
D169CC00001/E0201025 FOLLOW-UP 08-01-2019
;
run;
Thank you,
Raja
Use the colon (:) modifier for the date informats, or INPUT will not honor the delimiter.
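A minimal sketch of what the colon changes, on made-up lines (the data set name, the `id` variable and the TRUNCOVER option are my assumptions). Without the colon, an informat such as ddmmyy10. reads a fixed 10 columns from wherever the pointer currently is, so the blank between two dates ends up inside the second field. With the colon, INPUT skips the delimiter and reads each value up to the next blank.

```sas
data se_demo;
  /* TRUNCOVER: a missing trailing value stays missing instead of */
  /* pulling the next data line into the current observation      */
  infile datalines truncover;
  input id sestdtc :ddmmyy10. seendtc :ddmmyy10.;
  format sestdtc seendtc yymmdd10.;
  datalines;
1 07-02-2019 12-02-2019
2 12-02-2019 15-09-2019
3 08-04-2019
;
run;
```

The third line has no end date; with TRUNCOVER it yields a missing seendtc rather than an error or a lost observation.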
Hi,
Thank you, now seendtc has values.
Please find the code below.
data ce;
input usubjid $1-21 ceterm $22-78 cestdtc ddmmyy10.;
format cestdtc ddmmyy10.;
datalines;
D169CC00001/E0201004 CV DEATH 14-09-2021
D169CC00001/E0201004 HEART FAILURE HOSPITALIZATION/URGENT HEART FAILURE VISIT 04-02-2020
D169CC00001/E0201004 HEART FAILURE HOSPITALIZATION/URGENT HEART FAILURE VISIT 03-09-2021
D169CC00001/E0201004 HEART FAILURE HOSPITALIZATION/URGENT HEART FAILURE VISIT 04-02-2020
D169CC00001/E0201007 HEART FAILURE HOSPITALIZATION/URGENT HEART FAILURE VISIT 20-05-2021
D169CC00001/E0201025 HEART FAILURE HOSPITALIZATION/URGENT HEART FAILURE VISIT 04-06-2020
D169CC00001/E0201025 HEART FAILURE HOSPITALIZATION/URGENT HEART FAILURE VISIT 04-06-2020
;
run;
data se;
input usubjid $1-21 EPOCH $22-39 SESTDTC ddmmyy10. SEENDTC :ddmmyy10.;
format SESTDTC SEENDTC ddmmyy10.;
datalines;
D169CC00001/E0201004 SCREENING 07-02-2019 12-02-2019
D169CC00001/E0201004 BLINDED TREATMENT 12-02-2019 15-09-2019
D169CC00001/E0201007 SCREENING 01-04-2019 08-04-2019
D169CC00001/E0201007 BLINDED TREATMENT 08-04-2019
D169CC00001/E0205038 SCREENING 12-09-2019 19-09-2019
D169CC00001/E0205038 BLINDED TREATMENT 19-09-2019 20-02-2019
D169CC00001/E0205038 FOLLOW-UP 20-02-2019
D169CC00001/E0201025 SCREENING 26-09-2019 03-10-2019
D169CC00001/E0201025 BLINDED TREATMENT 03-10-2019 08-01-2019
D169CC00001/E0201025 FOLLOW-UP 08-01-2019
;
run;
data ce;
set ce;
seq_id=_n_;
run;
proc sql;
create table ce_and_se as
select a.*,b.epoch,b.sestdtc,b.seendtc
from ce as a left join se as b on a.usubjid=b.usubjid;
quit;
Thank you,
Raja.
data output;
  set input;
  by seq_id usubjid strvard sestd seend;
  if first.seq_id then do;
    if sestd ^= . and strvard ^= . then do;
      if sestd <= strvard and strvard < seend then do;
        flag=1;
        output;
      end;
    end;
  end;
run;
Since you force the output statement, your code will only output records that match:
first.seq_id and sestd ^= . and strvard ^= . and sestd <= strvard and strvard < seend
Note that
sestd <= strvard and strvard < seend
can be written
sestd <= strvard < seend
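A small sketch to confirm the equivalence, and to show why the missing-value checks are kept alongside it. The values are made up and the N function is my choice for the guard, not taken from the thread. In SAS a missing numeric value sorts lower than any number, so a missing sestd passes the range test on its own.

```sas
data _null_;
  sestd = .; strvard = 15; seend = 20;
  chained = (sestd <= strvard < seend);                    /* short form */
  spelled = (sestd <= strvard and strvard < seend);        /* long form  */
  /* n() counts non-missing arguments, so this requires both to be present */
  guarded = (n(sestd, strvard) = 2 and sestd <= strvard < seend);
  put chained= spelled= guarded=;
run;
```

Both forms of the range test give the same result here, and only the guarded version rejects the record with the missing start date.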