I have a dataset with multiple timepoints and multiple form sequences for the same subject ID. I need to output the records where the form sequence is the same for a subject but the timepoint is different. Per the requirement, if the form sequence is the same for a subject, the timepoint must be the same as well; any record where it differs must appear in the output. I have tried the LAG function and FIRST. processing, but I did not find them useful and could not get the result. Any help, please?
Example dataset below:
data ndsn;
infile datalines;
input timepoint formseq usubjid $15.;
datalines;
1 1 111A116
1 1 111A116
1 1 111A116
5 5 111A116
4 5 111A116
5 5 111A116
8 8 111A116
8 8 111A116
6 6 111A116
6 6 111A116
3 6 111A116
;
run;
Expected output:
data ndsn;
infile datalines;
input timepoint formseq usubjid $15.;
datalines;
4 5 111A116
3 6 111A116
;
run;
I tried the LAG function and FIRST. processing, but they didn't give the expected result. I thought of using IF-THEN logic, but that takes a long time. Is there a solution for this?
LAG and FIRST. are fine as an idea; what matters is how you implement them:
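data want;
set ndsn;
by usubjid formseq notsorted;
retain flag;
/* call LAG on every row so its queue always holds the previous row's value */
prev_timepoint = lag(timepoint);
/* flag = 1 until the first mismatch in this formseq group has been output */
if first.formseq then flag = 1;
if not first.formseq and timepoint ne prev_timepoint
then do;
if flag then output;
flag = 0;
end;
drop flag prev_timepoint;
run;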
The NOTSORTED option is needed because in your example formseq 6 follows formseq 8, so the data are not in ascending order of formseq. That means I either have to use the NOTSORTED option or run PROC SORT first to get the values in order.
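If you would rather take the PROC SORT route, a minimal sketch could look like the following. The names ndsn_sorted, want_sorted and first_tp are made up for illustration, and the step assumes the first timepoint seen within each usubjid/formseq group is the reference value that the others are compared against.

```sas
/* Sort so each subject's form sequences are grouped and in ascending order.
   EQUALS (the default) keeps the original row order within each BY group. */
proc sort data=ndsn out=ndsn_sorted equals;
  by usubjid formseq;
run;

data want_sorted;
  set ndsn_sorted;
  by usubjid formseq;          /* NOTSORTED is no longer needed */
  retain first_tp;
  /* assumption: the first timepoint of the group is the reference value */
  if first.formseq then first_tp = timepoint;
  else if timepoint ne first_tp then output;
  drop first_tp;
run;
```

With the example data this should keep the same two records as the expected output (timepoint 4 for formseq 5 and timepoint 3 for formseq 6). The trade-off is an extra pass over the data for the sort, in exchange for not depending on the incoming row order.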
data ndsn;
infile datalines;
input timepoint formseq usubjid $15.;
datalines;
1 1 111A116
1 1 111A116
1 1 111A116
5 5 111A116
4 5 111A116
5 5 111A116
8 8 111A116
8 8 111A116
6 6 111A116
6 6 111A116
3 6 111A116
;
run;
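data want;
/* DOW loop: each pass of the DATA step reads one complete formseq group */
do until(last.formseq);
set ndsn;
by usubjid formseq notsorted;
/* remember the first timepoint of the group as the reference value */
if first.formseq then _timepoint=timepoint;
/* output every record whose timepoint differs from that reference */
if _timepoint ne timepoint then output;
end;
drop _timepoint;
run;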