Hi all, Here is my current data with ten columns:
co1 col2 col3 col4 col5 col6 col7 col8 col9 col10
PM MM JM MM PM PB
PM MM JM MM PM PB JM
PM MM JM MM PM MB PM JM
PM MM JM MM PM PM MM MB
Goal:
I want to flag all records which end with either 'PB' or 'MB' value and there are no values after them across columns and flag if there are values after 'PB' or 'MB'
Any help is appreciated. Thanks!
@SP01 wrote:
I want to flag all records which end with either 'PB' or 'MB' value and there are no values after them across columns and flag if there are values after 'PB' or 'MB'
So if I am understanding this and also if you wrote it properly, this is equivalent to creating a flag any time there is a 'PB' or 'MB' regardless of what is after it. Is that what you mean?
Hi Miller, I am sorry for the confusion. I want to create a flag when there is any value other than 'PB' or 'MB' after 'PB' or 'MB' across columns.
So in your example data, please tell us for each row whether we should set a flag, or not.
Yes, we should set a flag is there are values other than 'PB' or 'MB' after 'PB' or 'MB' as flag = 1, else flag = 0
please tell us for each row in your example data whether we should set a flag, or not.
Yes, we should.
If I understand your question you could do something like the following:
data one;
input col1 $ col2 $ col3 $ col4 $ col5 $ col6 $ col7 $ col8 $ col9 $ col10 $;
cards;
PM MM JM MM PM PB . . JM .
PM MM JM MM PM PB JM . . .
PM MM JM MM PM MB PM JM . .
PM MM JM MM PM PM MM MB . PB
PM MM JM MM PM PM MM DB . AB
;
data new;
set one;
flag=0;
array list(*) col1-col10;
do i=1 to dim(list);
if list(i) IN ('PB','MB') then do;
do j=i to dim(list);
if list(j) ne ' ' then flag=1;
end;
end;
end;
drop i J;
run;
proc print;
run;
Thank you for the solution. This logic is only flagging records that end with 'PM' or 'MB'
@SP01 wrote:
Thank you for the solution. This logic is only flagging records that end with 'PM' or 'MB'
Which is why @PaigeMiller asked for clarification.
Your initial requirement starts "I want to flag all records which end with either 'PB' or 'MB' value and there are no values after them" which matches the solution.
Poorly phrased if the actual requirement "is flag if any value is 'PB' or 'MB' "
I agree with this. Going forward, I will try to be more precise.
So you want to flag the observations where the last-non-missing column is not 'PB' og 'MB'?
I think this does it:
data want;
set have;
flag=scan(catx(' ',of col1-col10),-1) not in('PB','MB');
run;
data one;
input col1 $ col2 $ col3 $ col4 $ col5 $ col6 $ col7 $ col8 $ col9 $ col10 $;
cards;
PM MM JM MM PM PB . . JM .
PM MM JM MM PM PB JM . . .
PM MM JM MM PM MB PM JM . .
PM MM JM MM PM PM MM MB . PB
PM MM JM MM PM PM MM DB . AB
;
data new;
set one;
length temp $ 2000;
temp=catx('|',of col:);
if prxmatch('/(JM|PB)$/i',strip(temp)) then flag=1;
else flag=0;
drop temp;
run;
proc print;
run;
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.