Hello,
I am unable to understand how the code below works.
Can anyone provide alternative code that uses the same logic but is easier to understand?
data ss3_epoch_6;
set epoch2;
by seq_id usubjid strvard sestd seend;
retain outfl;
if first.seq_id then outfl = 0;
if strvard >= sestd and seend > strvard and sestd ne . and strvard ne . and outfl = 0 then do;
output;
outfl = 1;
end;
* If strtvar does not fit any epoch, set it to missing and output;
if last.seq_id and outfl = 0 then do;
* Fix for ongoing study where the end date for the last epoch is still missing.;
if seend = . and strvard ne . then output;
else do;
EPOCH = "";
if strvard ne . then PUTLOG "(epoch) STAR_CHECK: Provided date outside all epochs in sdtm.SE for subject: "
USUBJID= N=;
output;
end;
end;
run;
Thank you,
Rajasekhar.
For every seq_id, the first observation that meets this condition
strvard >= sestd and seend > strvard and sestd ne . and strvard ne .
is output; if no such observation is found, the last observation is output, with EPOCH set to a missing value unless seend is missing and strvard is not (the ongoing-epoch case).
I don't see how the code could be simplified.
Better formatting (indentation) might help you to understand the logic.
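As a rough illustration of what such formatting could look like, here is the same step indented and commented, with every DO block closed and a RUN added. The small epoch2 data set is hypothetical (made-up subjects and dates) and is only there so the step can be run on its own; `_n_=` in the PUTLOG is an assumption about what the posted `N=` was meant to be.

```sas
/* Hypothetical input, already sorted by the BY variables */
data epoch2;
  input seq_id usubjid :$8. strvard :date9. sestd :date9. seend :date9. epoch :$20.;
  format strvard sestd seend date9.;
  datalines;
1 S1 04FEB2020 07FEB2019 12FEB2019 SCREENING
1 S1 04FEB2020 12FEB2019 15SEP2021 TREATMENT
2 S2 20MAY2021 01APR2019 08APR2019 SCREENING
2 S2 20MAY2021 08APR2019 . TREATMENT
;
run;

data ss3_epoch_6;
  set epoch2;
  by seq_id usubjid strvard sestd seend;
  retain outfl;

  /* reset the "already output" flag at the start of each seq_id group */
  if first.seq_id then outfl = 0;

  /* output the first row whose epoch window contains the date */
  if strvard >= sestd and seend > strvard
     and sestd ne . and strvard ne . and outfl = 0 then do;
    output;
    outfl = 1;
  end;

  /* no row of the group matched: fall back to the last row */
  if last.seq_id and outfl = 0 then do;
    /* ongoing study: the last epoch has no end date yet, keep its EPOCH */
    if seend = . and strvard ne . then output;
    else do;
      EPOCH = "";
      /* _n_= is assumed here; the posted code shows N= */
      if strvard ne . then putlog "(epoch) STAR_CHECK: Provided date outside all epochs in sdtm.SE for subject: "
        usubjid= _n_=;
      output;
    end;
  end;
run;
```

With this made-up input, each seq_id contributes exactly one row: seq_id 1 through the first IF block, seq_id 2 through the fallback for an open-ended last epoch.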
I don't know the data, so can't comment further, but the coder did a good job.
It's your opportunity for growth it seems 🙂
Hi, sorry, I am not able to post the data here.
The code below outputs two observations.
data out;
set input;
by seq_id usubjid strvard sestd seend;
retain outfl;
if first.seq_id then outfl = 0;
if strvard >= sestd and seend > strvard and sestd ne . and strvard ne . and outfl = 0 then do;
output;
outfl = 1;
end;
run;
output;
I wrote the same logic, but no observations are output with the code below. Please help me understand what I am missing, and also help me write it differently from the code above.
data output;
set input;
by seq_id usubjid strvard sestd seend;
if first.seq_id then do;
if sestd ^= . and strvard ^= . then do;
if sestd <= strvard and strvard < seend then do;
flag=1;
output;
end;
end;
end;
run;
Thank you,
Rajasekhar.
Post usable data, like this:
data have;
input seq_id epoch :$20. sestdtc :yymmdd10.;
format sestdtc yymmdd10.;
datalines;
199 SCREENING 2019-09-24
;
Add additional variables and observations as needed to illustrate your issue.
Hi,
Thank you very much for the help.
I have tried to provide some sample data here and explained the rules and the expected output.
Please help me write code that differs from the previous code.
data ce;
input usubjid :$40 ceterm :$200 cestdtc :yymmdd10.;
format cestdtc yymmdd10.;
datalines;
D169CC00001/E0201004 CV DEATH 14-09-21
D169CC00001/E0201004 HEART FAILURE HOSPITALIZATION/URGENT HEART FAILURE VISIT 04-02-20
D169CC00001/E0201004 HEART FAILURE HOSPITALIZATION/URGENT HEART FAILURE VISIT 03-09-21
D169CC00001/E0201004 HEART FAILURE HOSPITALIZATION/URGENT HEART FAILURE VISIT 04-02-20
D169CC00001/E0201007 HEART FAILURE HOSPITALIZATION/URGENT HEART FAILURE VISIT 20-05-21
D169CC00001/E0201025 HEART FAILURE HOSPITALIZATION/URGENT HEART FAILURE VISIT 04-06-20
D169CC00001/E0201025 HEART FAILURE HOSPITALIZATION/URGENT HEART FAILURE VISIT 04-06-20
;
run;
data se;
input usubjid :$40 EPOCH :$200 SESTDTC :yymmdd10. SEENDTC :yymmdd10.;
format SESTDTC SEENDTC yymmdd10.;
datalines;
USUBJID EPOCH SESTDTC SEENDTC
D169CC00001/E0201004 SCREENING 07-02-19 12-02-19
D169CC00001/E0201004 BLINDED TREATMENT 12-02-19 15-09-21
D169CC00001/E0201007 SCREENING 01-04-19 08-04-19
D169CC00001/E0201007 BLINDED TREATMENT 08-04-19
D169CC00001/E0205038 SCREENING 12-09-19 19-09-19
D169CC00001/E0205038 BLINDED TREATMENT 19-09-19 20-02-20
D169CC00001/E0205038 FOLLOW-UP 20-02-20
D169CC00001/E0201025 SCREENING 26-09-19 03-10-19
D169CC00001/E0201025 BLINDED TREATMENT 03-10-19 08-01-20
D169CC00001/E0201025 FOLLOW-UP 08-01-20
;
run;
/* The two data sets above are left-joined by the code below. */
/* Before the left join, a unique id is created in the CE data */
/* so that the final output has the same records as the original CE data. */
data ce;
set ce;
seq_id=_n_;
run;
proc sql;
create table ce_and_se as
select a.*,b.epoch,b.sestdtc,b.seendtc
from ce as a left join se as b on a.usubjid=b.usubjid;
quit;
The output data set should follow these rules:
1. The output data set has the same observations as the original CE data.
2. EPOCH values are assigned based on the date, when SESTDTC <= cestdtc < SEENDTC.
3. If cestdtc is missing, then EPOCH is missing.
4. If no epoch satisfies SESTDTC <= cestdtc < SEENDTC, those values are printed to the log (only when cestdtc, SESTDTC and SEENDTC are not missing).
/*Final output like below:*/
/*data have */
USUBJID ceterm cestdtc EPOCH
D169CC00001/E0201004 CV DEATH 14-09-21
D169CC00001/E0201004 HEART FAILURE HOSPITALIZATION/URGENT HEART FAILURE VISIT 04-02-20 BLINDED TREATMENT
D169CC00001/E0201004 HEART FAILURE HOSPITALIZATION/URGENT HEART FAILURE VISIT 03-09-21 BLINDED TREATMENT
D169CC00001/E0201004 HEART FAILURE HOSPITALIZATION/URGENT HEART FAILURE VISIT 04-02-20 BLINDED TREATMENT
D169CC00001/E0201007 HEART FAILURE HOSPITALIZATION/URGENT HEART FAILURE VISIT 20-05-21 BLINDED TREATMENT
D169CC00001/E0201025 HEART FAILURE HOSPITALIZATION/URGENT HEART FAILURE VISIT 04-06-20 FOLLOW-UP
D169CC00001/E0201025 HEART FAILURE HOSPITALIZATION/URGENT HEART FAILURE VISIT 04-06-20 FOLLOW-UP
Thank you.
Rajasekhar.
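One possible way to express rules 1 to 4 is to put the date window directly into the join condition. This is only a sketch under several assumptions: the rows are hypothetical and written with 4-digit years and a | delimiter, the epochs of a subject do not overlap (so each CE record matches at most one SE record), and a missing SEENDTC is treated as an epoch that is still open. The table name ce_epoch is made up.

```sas
/* Hypothetical CE rows; seq_id numbers the records as in the post */
data ce;
  infile datalines dlm='|' dsd truncover;
  input usubjid :$40. ceterm :$200. cestdtc :yymmdd10.;
  format cestdtc yymmdd10.;
  seq_id = _n_;
  datalines;
SUBJ-1|HEART FAILURE VISIT|2020-02-04
SUBJ-1|CV DEATH|
SUBJ-2|HEART FAILURE VISIT|2021-05-20
;
run;

/* Hypothetical SE rows; the last epoch of SUBJ-2 has no end date yet */
data se;
  infile datalines dlm='|' dsd truncover;
  input usubjid :$40. epoch :$200. sestdtc :yymmdd10. seendtc :yymmdd10.;
  format sestdtc seendtc yymmdd10.;
  datalines;
SUBJ-1|SCREENING|2019-02-07|2019-02-12
SUBJ-1|BLINDED TREATMENT|2019-02-12|2021-09-15
SUBJ-2|SCREENING|2019-04-01|2019-04-08
SUBJ-2|BLINDED TREATMENT|2019-04-08|
;
run;

proc sql;
  create table ce_epoch as
  select a.*, b.epoch
  from ce as a
  left join se as b
    on  a.usubjid = b.usubjid
    /* missing values sort below every date, so exclude them explicitly */
    and a.cestdtc is not missing
    and b.sestdtc is not missing
    and b.sestdtc <= a.cestdtc
    /* assumption: a missing end date means the epoch is still open */
    and (b.seendtc is missing or a.cestdtc < b.seendtc)
  order by a.seq_id;
quit;

/* Report records that have a date but no matching epoch */
data _null_;
  set ce_epoch;
  if missing(epoch) and not missing(cestdtc) then
    putlog "Date outside all epochs: " usubjid= cestdtc=;
run;
```

Because the window is part of the ON clause of a left join, every CE record is kept, and records with a missing cestdtc or without a matching window simply get a missing EPOCH.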
Please review your codes.
Make sure that your DATA steps work without ERRORs or invalid data messages in the log.
Are you sure you want to use a YYMMDD informat and not DDMMYY?
Why do you expect those FOLLOW-UP observations to match when their SEENDTC is missing?
Also, ALWAYS (as in ALWAYS) use 4-digit years. After the Y2K scare this should be an unquestioned given in computing.
Hi,
The informat of the date variables, YYMMDD10., should be fine.
For the follow-up records the condition below is met, so the value is assigned; if SEENDTC has a value, then cestdtc < SEENDTC is checked as well.
SESTDTC <= cestdtc
Thank you,
Raja
This clarifies exactly NOTHING.
Please re-post your codes, tested, with 4-digit years. Make sure to use proper delimiters or the & informat modifier so that the strings with blanks are read correctly. I will not answer further until you do that.
Hello,
Thank you for the help.
I have executed the code below, and the data sets were generated without errors.
Also, the year is displayed with 4 digits in the data set.
data ce;
input usubjid $1-21 ceterm $22-78 cestdtc yymmdd10.;
format cestdtc yymmdd10.;
datalines;
D169CC00001/E0201004 CV DEATH 14-09-21
D169CC00001/E0201004 HEART FAILURE HOSPITALIZATION/URGENT HEART FAILURE VISIT 04-02-20
D169CC00001/E0201004 HEART FAILURE HOSPITALIZATION/URGENT HEART FAILURE VISIT 03-09-21
D169CC00001/E0201004 HEART FAILURE HOSPITALIZATION/URGENT HEART FAILURE VISIT 04-02-20
D169CC00001/E0201007 HEART FAILURE HOSPITALIZATION/URGENT HEART FAILURE VISIT 20-05-21
D169CC00001/E0201025 HEART FAILURE HOSPITALIZATION/URGENT HEART FAILURE VISIT 04-06-20
D169CC00001/E0201025 HEART FAILURE HOSPITALIZATION/URGENT HEART FAILURE VISIT 04-06-20
;
run;
data se;
input usubjid $1-21 EPOCH $22-39 SESTDTC yyyymmdd10. SEENDTC yyyymmdd10.;
format SESTDTC SEENDTC yyyymmdd10.;
datalines;
D169CC00001/E0201004 SCREENING 07-02-19 12-02-19
D169CC00001/E0201004 BLINDED TREATMENT 12-02-19 15-09-21
D169CC00001/E0201007 SCREENING 01-04-19 08-04-19
D169CC00001/E0201007 BLINDED TREATMENT 08-04-19
D169CC00001/E0205038 SCREENING 12-09-19 19-09-19
D169CC00001/E0205038 BLINDED TREATMENT 19-09-19 20-02-20
D169CC00001/E0205038 FOLLOW-UP 20-02-20
D169CC00001/E0201025 SCREENING 26-09-19 03-10-19
D169CC00001/E0201025 BLINDED TREATMENT 03-10-19 08-01-20
D169CC00001/E0201025 FOLLOW-UP 08-01-20
;
run;
data ce;
set ce;
seq_id=_n_;
run;
proc sql;
create table ce_and_se as
select a.*,b.epoch,b.sestdtc,b.seendtc
from ce as a left join se as b on a.usubjid=b.usubjid;
quit;
Thank you.
Raja
Where do you see a 4 digit year here:
14-09-21
Is this 2014-09-21, or 2021-09-14? Since you use the YYMMDD informat, it will be read as the former.
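A quick way to check this is to read the same string with both informats and print the results with a 4-digit year. The variable names below are made up, and how the 2-digit years are expanded depends on the YEARCUTOFF system option of the session.

```sas
data _null_;
  /* same text, two different interpretations of the day/year positions */
  as_ymd = input('14-09-21', yymmdd10.);  /* year first */
  as_dmy = input('14-09-21', ddmmyy10.);  /* day first  */
  put as_ymd= yymmdd10. as_dmy= yymmdd10.;
run;
```

Writing the raw data with 4-digit years removes the dependency on YEARCUTOFF altogether.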
Hi,
Thank you very much for helping me learn more about the SAS data step.
Yes, we need to use ddmmyy10.
But when I used ddmmyy10. for SEENDTC, I got missing values; I am not sure what the issue is.
Below is the updated code.
data ce;
input usubjid $1-21 ceterm $22-78 cestdtc ddmmyy10.;
format cestdtc ddmmyy10.;
datalines;
D169CC00001/E0201004 CV DEATH 14-09-2021
D169CC00001/E0201004 HEART FAILURE HOSPITALIZATION/URGENT HEART FAILURE VISIT 04-02-2020
D169CC00001/E0201004 HEART FAILURE HOSPITALIZATION/URGENT HEART FAILURE VISIT 03-09-2021
D169CC00001/E0201004 HEART FAILURE HOSPITALIZATION/URGENT HEART FAILURE VISIT 04-02-2020
D169CC00001/E0201007 HEART FAILURE HOSPITALIZATION/URGENT HEART FAILURE VISIT 20-05-2021
D169CC00001/E0201025 HEART FAILURE HOSPITALIZATION/URGENT HEART FAILURE VISIT 04-06-2020
D169CC00001/E0201025 HEART FAILURE HOSPITALIZATION/URGENT HEART FAILURE VISIT 04-06-2020
;
run;
data se;
input usubjid $1-21 EPOCH $22-39 SESTDTC ddmmyy10. SEENDTC ddmmyy10.;
format SESTDTC SEENDTC ddmmyy10.;
datalines;
D169CC00001/E0201004 SCREENING 07-02-2019 12-02-2019
D169CC00001/E0201004 BLINDED TREATMENT 12-02-2019 15-09-2019
D169CC00001/E0201007 SCREENING 01-04-2019 08-04-2019
D169CC00001/E0201007 BLINDED TREATMENT 08-04-2019
D169CC00001/E0205038 SCREENING 12-09-2019 19-09-2019
D169CC00001/E0205038 BLINDED TREATMENT 19-09-2019 20-02-2019
D169CC00001/E0205038 FOLLOW-UP 20-02-2019
D169CC00001/E0201025 SCREENING 26-09-2019 03-10-2019
D169CC00001/E0201025 BLINDED TREATMENT 03-10-2019 08-01-2019
D169CC00001/E0201025 FOLLOW-UP 08-01-2019
;
run;
Thank you,
Raja
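Use the colon (:) modifier for the date informats, or INPUT will not honor the delimiter.
Without the colon, the informat reads a fixed number of columns no matter where the value actually starts or ends.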
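As a minimal sketch of that advice, the step below reads two of the SE rows with the colon modifier on every field. The | delimiter and the DSD and TRUNCOVER options are assumptions added here so that the epoch text with an embedded blank and the empty end date are both read cleanly; they are not part of the posted code.

```sas
data se;
  /* dlm='|' : fields are separated by |, so blanks inside EPOCH are kept */
  /* dsd and truncover : an empty last field becomes a missing value      */
  infile datalines dlm='|' dsd truncover;
  input usubjid :$40. epoch :$200. sestdtc :ddmmyy10. seendtc :ddmmyy10.;
  format sestdtc seendtc ddmmyy10.;
  datalines;
D169CC00001/E0201007|SCREENING|01-04-2019|08-04-2019
D169CC00001/E0201007|BLINDED TREATMENT|08-04-2019|
;
run;
```

With the colon, each informat is applied to the delimited field rather than to the next ten columns of the line.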
Hi,
Thank you, now seendtc has values.
Please find the code below.
data ce;
input usubjid $1-21 ceterm $22-78 cestdtc ddmmyy10.;
format cestdtc ddmmyy10.;
datalines;
D169CC00001/E0201004 CV DEATH 14-09-2021
D169CC00001/E0201004 HEART FAILURE HOSPITALIZATION/URGENT HEART FAILURE VISIT 04-02-2020
D169CC00001/E0201004 HEART FAILURE HOSPITALIZATION/URGENT HEART FAILURE VISIT 03-09-2021
D169CC00001/E0201004 HEART FAILURE HOSPITALIZATION/URGENT HEART FAILURE VISIT 04-02-2020
D169CC00001/E0201007 HEART FAILURE HOSPITALIZATION/URGENT HEART FAILURE VISIT 20-05-2021
D169CC00001/E0201025 HEART FAILURE HOSPITALIZATION/URGENT HEART FAILURE VISIT 04-06-2020
D169CC00001/E0201025 HEART FAILURE HOSPITALIZATION/URGENT HEART FAILURE VISIT 04-06-2020
;
run;
data se;
input usubjid $1-21 EPOCH $22-39 SESTDTC ddmmyy10. SEENDTC :ddmmyy10.;
format SESTDTC SEENDTC ddmmyy10.;
datalines;
D169CC00001/E0201004 SCREENING 07-02-2019 12-02-2019
D169CC00001/E0201004 BLINDED TREATMENT 12-02-2019 15-09-2019
D169CC00001/E0201007 SCREENING 01-04-2019 08-04-2019
D169CC00001/E0201007 BLINDED TREATMENT 08-04-2019
D169CC00001/E0205038 SCREENING 12-09-2019 19-09-2019
D169CC00001/E0205038 BLINDED TREATMENT 19-09-2019 20-02-2019
D169CC00001/E0205038 FOLLOW-UP 20-02-2019
D169CC00001/E0201025 SCREENING 26-09-2019 03-10-2019
D169CC00001/E0201025 BLINDED TREATMENT 03-10-2019 08-01-2019
D169CC00001/E0201025 FOLLOW-UP 08-01-2019
;
run;
data ce;
set ce;
seq_id=_n_;
run;
proc sql;
create table ce_and_se as
select a.*,b.epoch,b.sestdtc,b.seendtc
from ce as a left join se as b on a.usubjid=b.usubjid;
quit;
Thank you,
Raja.
data output;
set input;
by seq_id usubjid strvard sestd seend;
if first.seq_id then do;
if sestd ^= . and strvard ^= . then do;
if sestd <= strvard and strvard < seend then do;
flag=1;
output;
end;
end;
end;
run;
Since you force the output statement, your code will only output records that match:
first.seq_id and sestd ^= . and strvard ^= . and sestd <= strvard and strvard < seend
Note that
sestd <= strvard and strvard < seend
can be written
sestd <= strvard < seend
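Put differently, the window test has to run on every row of a group, not only on the first one. A possible variant, sketched here with a hypothetical two-row input and made-up names (matched, found), uses a retained flag and the chained comparison; the N function counts the non-missing arguments, which matters because a missing value compares lower than any date.

```sas
/* Hypothetical data: one event (seq_id 1) joined to two epochs */
data input;
  input seq_id usubjid :$8. strvard :date9. sestd :date9. seend :date9.;
  format strvard sestd seend date9.;
  datalines;
1 S1 04FEB2020 07FEB2019 12FEB2019
1 S1 04FEB2020 12FEB2019 15SEP2021
;
run;

data matched;
  set input;
  by seq_id;
  retain found;
  /* start each seq_id group with "nothing found yet" */
  if first.seq_id then found = 0;
  /* n(...) = 3 means none of the three dates is missing */
  if not found and n(sestd, strvard, seend) = 3
     and sestd <= strvard < seend then do;
    found = 1;
    output;
  end;
run;
```

In this made-up input the first row of the group is the epoch that ended before the event, so a test restricted to first.seq_id finds nothing, while the step above keeps the second row.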