Hi there,
I am trying to assign First and Last to a row that meets a number of conditions. I have sorted the table by ID# and Location and Key Date.
A row qualifies only if all 3 conditions (A, B, C) = 'Yes'; otherwise it should be skipped, and the next row (within the ID# and Location group) considered for First or Last. Is there a way to do this?
ID# | Location | Key Date | Key Time | Condition A | Condition B | Condition C | First | Last |
1 | 1 | 1/1/2021 | 8:00 AM | Yes | Yes | Yes | 1 | 0 |
2 | 1 | 1/1/2021 | 8:05 AM | Yes | Yes | Yes | 0 | 1 |
3 | 2 | 1/1/2021 | 9:00AM | No | Yes | No | 0 | 0 |
4 | 2 | 1/1/2021 | 9:05AM | Yes | Yes | Yes | 1 | 0 |
5 | 2 | 1/1/2021 | 9:08AM | Yes | Yes | Yes | 0 | 1 |
6 | 3 | 1/2/2021 | 9:00AM | Yes | Yes | Yes | 1 | 0 |
7 | 3 | 1/2/2021 | 9:15AM | Yes | Yes | No | 0 | 0 |
8 | 3 | 1/2/2021 | 9:10AM | Yes | Yes | Yes | 0 | 1 |
I'm not clear on what you are attempting to do. For example, the 1st observation has first=1 and last=0; why isn't last=1, since that's the only occurrence where ID=1? In fact, every observation where Conditions A, B and C are all Yes should have both first=1 and last=1, because all the ID values are unique. At least that's how I understand your description.
You are probably going to need to use the FIRST. and LAST. DATA step variables in some capacity.
You can use BY group processing to do most of it.
data want;
set have;
by keydate location conditiona conditionb conditionc NOTSORTED;
first = first.conditionc;
last = last.conditionc;
run;
You seem to want to clear the FIRST and LAST flags when at least one of the conditions is not met.
data want;
set have;
by keydate location conditiona conditionb conditionc NOTSORTED;
first = first.conditionc and 3=count(cats(conditiona,conditionb,conditionc),'Yes');
last = last.conditionc and 3=count(cats(conditiona,conditionb,conditionc),'Yes');
run;
But it might be better to make a separate flag variable to indicate whether or not the condition was met.
data want;
set have;
by keydate location conditiona conditionb conditionc NOTSORTED;
all_yes = 3=count(cats(conditiona,conditionb,conditionc),'Yes');
first = first.conditionc ;
last = last.conditionc ;
run;
PS Do not use spaces in your variable names. It makes the code impossible to read and type.
This example is difficult to follow, and I'm not sure if there are other idiosyncrasies that need to be captured. This seems to output what you want, and I don't get any errors in my log.
data have;
infile datalines delimiter = ',';
input id $ location $ key_date :mmddyy10. key_time :time5. condition_a $ condition_b $ condition_c $;
format key_date mmddyy10. key_time time5.;
datalines;
1,1,1/1/2021,8:00 AM,Yes,Yes,Yes
2,1,1/1/2021,8:05 AM,Yes,Yes,Yes
3,2,1/1/2021,9:00 AM,No,Yes,No
4,2,1/1/2021,9:05 AM,Yes,Yes,Yes
5,2,1/1/2021,9:08 AM,Yes,Yes,Yes
6,3,1/2/2021,9:00 AM,Yes,Yes,Yes
7,3,1/2/2021,9:15 AM,Yes,Yes,No
8,3,1/2/2021,9:10 AM,Yes,Yes,Yes
;
run;
data have_2;
set have;
if condition_a = 'Yes' and condition_b = 'Yes' and condition_c = 'Yes' then yes = 1;
else yes = 0;
run;
proc sort data = have_2;
by location yes id;
run;
data want (drop = yes);
set have_2;
by location yes;
if first.yes and yes = 1 then first = 1;
else first = 0;
if last.yes and yes = 1 then last = 1;
else last = 0;
run;
proc sort data = want;
by id location key_date;
run;
I would generally advise against giving variables names that conflict with SAS functions or automatic variables.
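If you'd rather avoid the extra sorts, the same idea can be sketched with a double DO loop over each group. This is untested and rests on a few assumptions: the variable names are the ones from the HAVE datalines step above, the grouping is by location only (as in the code above), and HAVE is already in the row order you want within each location. The helper variables n, first_n and last_n are names I made up for illustration.

```sas
/* Sketch only: assumes HAVE is grouped by location and already in the desired row order */
data want;
    /* Pass 1: find the positions of the first and last all-Yes rows in this location group */
    first_n = .;
    last_n = .;
    do n = 1 by 1 until (last.location);
        set have;
        by location;
        if condition_a = 'Yes' and condition_b = 'Yes' and condition_c = 'Yes' then do;
            if missing(first_n) then first_n = n;
            last_n = n;
        end;
    end;
    /* Pass 2: re-read the same group and write the flags to every row */
    do n = 1 by 1 until (last.location);
        set have;
        by location;
        first = (n = first_n);
        last = (n = last_n);
        output;
    end;
    drop n first_n last_n;
run;
```

Note that a group with only one qualifying row gets first = 1 and last = 1 on that same row, and a group with no qualifying rows gets 0 everywhere.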
First determine first and last in a filtered dataset, then merge back:
data first_last;
set have (where=(condition_a = "Yes" and condition_b = "Yes" and condition_c = "Yes"));
by id location;
first = first.location;
last = last.location;
drop condition_:;
run;
data want;
merge
have
first_last (in=fl)
;
by id location key_date key_time;
if not fl
then do;
first = 0;
last = 0;
end;
run;
Untested; for tested code, provide usable example data in a data step with datalines.