- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Dear Experts,
I have a scenario where i need to check the current values with the next value, I need to identify those subject whose values has 1 followed by 2 , if the subject have value 2 followed by 1 then i should exclude them.
please consider the sample data like below
data have;
input subjid value;
cards;
101 1
101 .
101 1
101 2
101 1
102 1
102 2
102 .
102 1
103 2
103 1
103 .
103 1
;
proc sort data=have;
by subjid ;
run;
data want;
set have;
by subjid ;
retain first ;
if first.subjid then do;first=.;end;
if value ne . then first=value;
run;
data want2;
obs=_n_+1;
set want end=eof;
by subjid;
if nobs>=obs then set want(keep=first rename=(first=nextfirst)) point=obs nobs=nobs;
run;
data want2;
set want2;
by subjid;
if not last.subjid and first=1 and nextfirst=2 then output;
run;
I wrote the above code which gets me the expected output, however i am looking for a better approach.
Appreciate your responses.
Thanks,
Jag
Jag
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Hello Jag,
Try the LAG() function as shown below:
data have;
set have;
if value = 2 and lag1(value) = 1 then output;
run;
This writes 3 records to the output.
101 2
102 2
103 2
Is this what you are looking for ?
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Yes, thank you so much for the thought.
This works but here is the code which gives me the expected output. I need to get 101 and 102 subjects as these are the only subjects with value variable 1 followed by 2.
proc sort data=have;
by subjid ;
run;
data want;
set have;
by subjid ;
retain first ;
if first.subjid then do;first=.;end;
if value ne . then first=value;
if last.subjid then first=.;
if first=2 and lag(first)=1;
run;
Thanks,
Jag
Jag
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
What about:
proc sql; create table WANT as select distinct SUBJID from HAVE A where exists (select SUBJID from HAVE where SUBJID=A.SUBJID and VALUE=1) and exists (select SUBJID from HAVE where SUBJID=A.SUBJID and VALUE=2); quit;
As there is no logical sequence in the data, I can only say there is a 1 and a 2. If there is other data such as datetime, length you can easily pull that up as min(datetime) where > A.datetime, but I am just guessing here as the test data doesn't show it.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
One thing that's not clear about your question ... if you are keeping a SUBJID, which observations do you want to keep? I'll assume that you want to keep all of them. Here's one approach:
data want;
keepme='N';
do until (last.subjid);
set have;
by subjid;
if value=1 then found_1='Y';
if value=2 and found_1='Y' then keepme='Y';
end;
do until (last.subjid);
set have;
by subjid;
if keepme='Y' then output;
end;
drop found_1 keepme;
run;