data have;
input id test$ visit$ vsdate$ trtdate$ results;
datalines;
101 Weight Trt1 10jan2020 27Jan2020 22
101 Weight Trt2 22jan2020 27Jan2020 .
101 Weight Trt3 23jan2020 27Jan2020 23
101 Weight Trt4 27jan2020 27Jan2020 11
101 Weight Trt5 28jan2020 27Jan2020 98.5
101 Weight Trt6 15feb2020 27Jan2020 131
102 Weight Trt1 10jan2020 11Feb2020 261
102 Weight Trt2 22jan2020 11Feb2020 10
102 Weight Trt3 23jan2020 11Feb2020 45.345
102 Weight Trt4 15feb2020 11Feb2020 33
;
run;
i have this data. Now i need to create a new variable baseline flag with value Y based on these conditions
1. vsdate should be le trtdate
2. results not missing
then
3. for latest visit baseline flag should be Y
output should be liike this:
101 Weight Trt1 10jan2020 27Jan2020 22
101 Weight Trt3 23jan2020 27Jan2020 23
101 Weight Trt4 27jan2020 27Jan2020 11 Y
102 Weight Trt1 10jan2020 11Feb2020 261
102 Weight Trt2 22jan2020 11Feb2020 10
102 Weight Trt3 23jan2020 11Feb2020 45.345 Y
Why are you reading the dates as char type variables? why not as sas numeric dates?
data have;
input id test $ visit $ vsdate date9. trtdate date9. results;
datalines;
101 Weight Trt1 10jan2020 27Jan2020 22
101 Weight Trt2 22jan2020 27Jan2020 .
101 Weight Trt3 23jan2020 27Jan2020 23
101 Weight Trt4 27jan2020 27Jan2020 11
101 Weight Trt5 28jan2020 27Jan2020 98.5
101 Weight Trt6 15feb2020 27Jan2020 131
102 Weight Trt1 10jan2020 11Feb2020 261
102 Weight Trt2 22jan2020 11Feb2020 10
102 Weight Trt3 23jan2020 11Feb2020 45.345
102 Weight Trt4 15feb2020 11Feb2020 33
;
run;
alternatively you need a second step to convert the dates into sas numeric dates:
data new;
set have(rename=(vsdate=dt1 trtdate=dt2));
vsdate = input(dt1, date9.);
trtdate = input(dt2,date9.);
run;
having numeric dates it will be easy to do what you need:
proc sort data=have(where=(vsdate < trtdate and results ne .)) /* or data=new */
out=temp;
by ID visit vsdate;
run;
data want;
set temp;
by ID visit;
if last.visit then flag ="Y";
run;
Remember that a numeric sas date is counting days since 01Jan1960 as 0;
to clarify it run next code
data _null_;
dt = '01JAN1960'd;
td = today();
put dt= td=;
run;
Why are you reading the dates as char type variables? why not as sas numeric dates?
data have;
input id test $ visit $ vsdate date9. trtdate date9. results;
datalines;
101 Weight Trt1 10jan2020 27Jan2020 22
101 Weight Trt2 22jan2020 27Jan2020 .
101 Weight Trt3 23jan2020 27Jan2020 23
101 Weight Trt4 27jan2020 27Jan2020 11
101 Weight Trt5 28jan2020 27Jan2020 98.5
101 Weight Trt6 15feb2020 27Jan2020 131
102 Weight Trt1 10jan2020 11Feb2020 261
102 Weight Trt2 22jan2020 11Feb2020 10
102 Weight Trt3 23jan2020 11Feb2020 45.345
102 Weight Trt4 15feb2020 11Feb2020 33
;
run;
alternatively you need a second step to convert the dates into sas numeric dates:
data new;
set have(rename=(vsdate=dt1 trtdate=dt2));
vsdate = input(dt1, date9.);
trtdate = input(dt2,date9.);
run;
having numeric dates it will be easy to do what you need:
proc sort data=have(where=(vsdate < trtdate and results ne .)) /* or data=new */
out=temp;
by ID visit vsdate;
run;
data want;
set temp;
by ID visit;
if last.visit then flag ="Y";
run;
Remember that a numeric sas date is counting days since 01Jan1960 as 0;
to clarify it run next code
data _null_;
dt = '01JAN1960'd;
td = today();
put dt= td=;
run;
Thank you! its working. but I want to keep all the records in the output , if we use where option only particular set of observations are being selected. can you please help me with that?
And is there any alternative solution for this? Thank you.
You asked to remove those observation with vsdate >trtdate, as in your wanted result:
output should be liike this: 101 Weight Trt1 10jan2020 27Jan2020 22 101 Weight Trt3 23jan2020 27Jan2020 23 101 Weight Trt4 27jan2020 27Jan2020 11 Y 102 Weight Trt1 10jan2020 11Feb2020 261 102 Weight Trt2 22jan2020 11Feb2020 10 102 Weight Trt3 23jan2020 11Feb2020 45.345 Y
thats what the where statement do.
Sorry i have the soultion you gave worked very well, i was trying the same will keeping all the observations and getting the output. Sorry for the confusion.
You can possibly "put back" observations removed, by joining the original table (have) with processed one (want) into want1
proc sql;
create table want1 as
select a.*, b.baseline_flag
from have a
left join want b
on a.Id = b.id and
a.vsdate = b.vsdate and
a.trtdate = b.trtdate;
quit;
sql code which may help
proc sql;
create table want as
select * , case
when vsdate eq max(vsdate)
then 'Y'
else ''
end as baseline_flag
from have
where vsdate le trtdate and
vsdate is not missing and
trtdate is not missing
group by id;
quit;
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.