I have a set of accts that I'd like to be marked if a criteria is met by other variables.I tried retain but can't quite get it, I wonder if index needs to beused..
This is what I mean
Have:
Acct ST1 ST2
205 01 PC
205 02 AD
206 02 BC
207 01 AD
208 01 PC
208 03 AD
208 04 AD
209 02 PC
209 03 BC
Want:
I need all the Acct series to be marked that has a 02 in the ST and PC in ST2.
Acct ST1 ST2 USE
205 01 PC N
205 02 AD N
206 02 BC Y
207 01 AD Y
208 01 PC Y
208 03 AD Y
208 04 AD Y
209 02 PC N
209 03 BC N
Thanks.
Then just use something like:
data Have;
input (Acct ST1 ST2) ($);
cards;
205 01 PC
205 02 AD
206 02 BC
207 01 AD
208 01 PC
208 03 AD
208 04 AD
209 02 PC
209 03 BC
;
data want (drop=_:);
do until (last.acct);
set have;
by acct;
if st1 eq "02" and
st2 eq "PC" then _test=1;
end;
do until (last.acct);
set have;
by acct;
if _test then use="N";
else use="Y";
output;
end;
run;
Are you sure your "want" is correct? From it, I don't understand why you assign the values you've assigned.
Yes, I have a large list of accounts (many duplicates) that I'd like to mark as N if they have ST1 = 02 and ST2 = PC.. but mark by their corresponding acct... I want to eventually create a rate from the overall population where USE = Y and the PC's and 02 are not to be used in the calculation.
I find boolean flags easier to create and work with.
PROC SQL ;
create table want as select *,max( st='02' ) as use
from have
group by acct
;
quit;
Or a data step using DOW loops
data want ;
do until (last.acct) ;
set have;
by acct;
use = max(use,st='02');
end;
use = max(0,use);
do until (last.acct);
set have;
by acct ;
end;
run;
Or merge it back with itself.
data want ;
merge have have(rename=(st=st2) where=(st2='02') );
by acct ;
if first.acct then use=(st2='02');
drop st2;
run;
I think that the following does what you want:
data want (drop=_:);
do until (last.acct);
set have;
by acct;
if st1 eq "02" then _02=1;
if st2 eq "PC" then _PC=1;
end;
do until (last.acct);
set have;
by acct;
if _02 and _PC then use="N";
else use="Y";
output;
end;
run;
Thanks this is extremely helpful..
The only thing different I'd add is that these 2 statements should be used collectively and joined wiht an 'AND' .. right now the results I'm gettin gis individually either a 02 or a PC
if st1 eq "02" then _02=1;
if st2 eq "PC" then _PC=1;
The code I suggested provides the result that you showed in your example, i.e., if any acct had both a value of st2="PC" in at least one if its records and a value of st1="02" in at least one of its records it was assigned a value of "N", otherwise it was assigned a value of "Y".
You are correct.. my mistake. I just noticed I wanted: if it has ST1=02 and ST2=PC
Acct ST1 ST2 USE
205 01 PC Y
205 02 AD Y
206 02 BC Y
207 01 AD Y
208 01 PC Y
208 03 AD Y
208 04 AD Y
209 02 PC N
209 03 BC N
Then just use something like:
data Have;
input (Acct ST1 ST2) ($);
cards;
205 01 PC
205 02 AD
206 02 BC
207 01 AD
208 01 PC
208 03 AD
208 04 AD
209 02 PC
209 03 BC
;
data want (drop=_:);
do until (last.acct);
set have;
by acct;
if st1 eq "02" and
st2 eq "PC" then _test=1;
end;
do until (last.acct);
set have;
by acct;
if _test then use="N";
else use="Y";
output;
end;
run;
Thank you. I'm not even gonna try to figure out how you guys come up with these correct responses so quickly.. It's impressive.
Hi ... here's another idea ...
data want;
use = 'Y';
do until(last.acct);
set have (in=one) have;
by acct;
if one and st1 eq '02' and st2 eq 'PC' then use = 'N';
if ^one then output;
end;
run;
Mike, Definitely an interesting approach. You know, of course, that I had to see which method ran quicker.
The approach you suggested took 25% more cpu time to run (and, yes, I would also have posted it if it went the results went the other way). In fact, I was expecting your approach to run more quickly.
Hi ... OK, and I'm sure you tried it on a larger data set ... here's another,
a variation on Tom's MERGE ...
data want;
merge have(in=one where=(st1 eq '02' and st2 eq 'PC')) have;
by acct;
use = ifc(one, 'N', 'Y');
run;
ps the self-interleave would be a bit faster if I had added an ELSE ...
if one and st1 eq '02' and st2 eq 'PC' then use = 'N';
else
if ^one then output;
pps for further reading ... Howard Schreier's "Interleaving a Dataset with Itself: How and Why"
How about;
data Have; input (Acct ST1 ST2) ($); cards; 205 01 PC 205 02 AD 206 02 BC 207 01 AD 208 01 PC 208 03 AD 208 04 AD 209 02 PC 209 03 BC ; run; proc sql; create table want(drop=a b) as select *,sum(st1='02') as a ,sum(st2='PC') as b, case when calculated a ge 1 and calculated b ge 1 then 'N' else 'Y' end as use from have group by acct;quit;
Ksharp
Message was edited by: xia keshan
Very well done, Ksharp
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.