I have a reporting situation that I cannot figure out. I have a vertical dataset, and there are typically 2 rows per visit (left and right laterality). I am checking the data for discrepancies. If a discrepancy exists for a given visit for either laterality, then I want to do output both rows regardless of whether or not the other laterality has a discrepancy.
For simplicity, here is some sample data:
data test;
input SUBJ $ VISIT LAT $ DISCREP $;
cards;
101-101 1 Left yes
101-101 1 Right yes
101-101 2 Left no
101-101 2 Right no
101-101 3 Left yes
101-101 3 Right no
101-101 4 Left no
101-101 4 Right no
101-102 1 Left no
101-102 1 Right no
101-102 2 Left yes
101-102 2 Right no
101-102 3 Left no
101-102 3 Right no
;
run;
This is what the final table should show:
SUBJ | VISIT | LAT | DISCREP |
101-101 | 1 | Left | yes |
101-101 | 1 | Right | yes |
101-101 | 3 | Left | yes |
101-101 | 3 | Right | no |
101-102 | 2 | Left | yes |
101-102 | 2 | Right | no |
Any ideas on how to do this? I'm guessing something with a RETAIN statement. I'm sure it will be extremely simple, but my brain is done for the day.
A relatively simple end-of-day solution:
data want;
wanted='N';
do until (last.visit);
set have;
by subj visit;
if discrep='yes' then wanted='Y';
end;
do until (last.visit);
set have;
by subj visit;
if wanted='Y' then output;
end;
drop wanted;
run;
The top loop examines both observations, and the bottom loop outputs either zero or both observations.
A relatively simple end-of-day solution:
data want;
wanted='N';
do until (last.visit);
set have;
by subj visit;
if discrep='yes' then wanted='Y';
end;
do until (last.visit);
set have;
by subj visit;
if wanted='Y' then output;
end;
drop wanted;
run;
The top loop examines both observations, and the bottom loop outputs either zero or both observations.
Hi Astounding,
I have a small question about the workings of your code:
In the first part when you decide to make wanted='Y' when discrep = 'yes', does your code give the value 'Y' to all of the entries within the subject-visit block if the value 'yes' appears at least once, and then in the second part output all the observations where wanted = 'Y',
Or , you give the value 'Y' only to the observation that has 'yes', and in the second part output the subject-visit block if 'Y' appears at least once.
Thanks
I'm sorry to say, the answer is that it depends.
The code changes WANTED to "Y" the first time it encounters "yes". From that point forward (to the end of the BY group), WANTED remains "Y". You can observe the behavior by adding this statement inside the first loop:
put wanted=;
More important, the top loop does not output any observations. Its key function is to change WANTED in the PDV, where SAS is storing the current value of all variables. Once WANTED becomes "Y", there is no code to change it to anything else (neither to "N" nor to a blank) until the second loop has completed.
I tried to modify the original data in the following way:
101-101 3 Left no
101-101 3 Right yes
so for the 3rd visit of subject 101-101, Left is "no" and Right is "yes" - so the code will see the "no" and then see the"yes" - but nevertheless in the final want data all the info will be displayed.
So even if SAS encounters the condition for the last observation within a specific subject-visit block, it applies it to the previous observations even if they didn't have the condition?
It doesn't really apply to the previous observations. It just sets WANTED and leaves it that way. Remember, the first DO loop doesn't output anything, it only reads some observations and calculates WANTED.
The reason you get all the observations (within the block) output is that the second DO loop reads exactly the same observations as the first DO loop. The second DO loop looks at the previously-assigned value for WANTED, when determining which observations to output.
data have;
input SUBJ $ VISIT LAT $ DISCREP $;
cards4;
101-101 1 Left yes
101-101 1 Right yes
101-101 2 Left no
101-101 2 Right no
101-101 3 Left yes
101-101 3 Right no
101-101 4 Left no
101-101 4 Right no
101-102 1 Left no
101-102 1 Right no
101-102 2 Left yes
101-102 2 Right no
101-102 3 Left no
101-102 3 Right no
;;;;
run;
data want;
retain flag .; /* not necessary but I like it for clarity */
do until (last.visit);
set have;
by visit notsorted;
if discrep='yes' then flag=1;
end;
do until (last.visit);
set have;
by visit notsorted;
if flag then output;
end;
flag=.;
run;quit;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.