hi all,
I need to find all the records where prod_code='b' and only 'b' which means the record will be single with prod_code='b' and then assign a new var value='found' to that record.
prod code=b is always associated with type=a so i don't need to worry about it
data one;
input id prod_type $ prod_code $ ;
cards;
100 a b
101 a b
101 e s
102 n v
102 k l
103 a b
104 a j
105 a b
106 v p
;
the output table should be something like :
id prod_type prod_code value
100 a b found
101 a b
101 e s
102 n v
102 k l
103 a b found
104 a j
105 a b found
106 v p
I can do this by creating a new dataset and merging it back to the original but i was wondering if this can be solved in one step (using maybe the order by.... having count(*)=1,........)
Any thoughts please?
101 has a code=b but also he has a code=s so he does not qualify
If I correctly understand your requirements, the following might suffice:
proc sql;
create table want as
select *, case
when min(prod_code) eq 'b' and
max(prod_code) eq 'b' and
count(prod_code eq 'b') eq 1
then 'found'
else ''
end as value
from one
group by id
;
quit;
If I correctly understand your requirements, the following might suffice:
proc sql;
create table want as
select *, case
when min(prod_code) eq 'b' and
max(prod_code) eq 'b' and
count(prod_code eq 'b') eq 1
then 'found'
else ''
end as value
from one
group by id
;
quit;
One way to go:
data one;
input id prod_type $ prod_code $ ;
cards;
100 a b
100 a b
101 a b
101 e s
102 n v
102 k l
103 a b
104 a j
105 a b
106 v p
;
run;
proc sql;
create table want as
select *,
case
when count(distinct prod_code)=1 and prod_code='b' then 'found'
else ''
end as value
from one
group by id
;
quit;
they both work
Thank you guys
" which means the record will be single with prod_code='b' "
OP said want only single obs, not sure if your code fit his demand .
data one; input id prod_type $ prod_code $ ; cards; 100 a b 101 a b 101 e s 102 n v 102 k l 103 a b 104 a j 105 a b 106 v p ; run; data want; set one; by id; found=ifc( first.id and last.id and prod_code='b' ,'found',' '); run;
Xia Keshan
hey Xia ,
i forgot to mention that first i'd delete duplicates by id, prod_type, prod_code so Patrick code works too.
Thanks for your code too. Pretty simple
this is the code i initially used :
proc sort data=one;by id prod_type proc_code nodupkey;run;
data want;
set one;
by id;
if first.id=last.id and prod_code='b' then value='found';
and it did not give me the right output but the it looks right,no?
Not the way you have it written. In your proc sort nodupkey is misplaced and you've misspelled prod_code.
Plus, your code (once corrected) will assign 'found' when there are actually multiple instances of prod_code eq 'b'.
Sorry I had it as prod_code and nodupkey was where it was supposed to be in my code but I guess the first and last.I'd logic wasn't right as you are saying.
But I thought first.I'd=last.I'd was true when I have a single record by id
The question is: when you have multiple instances of prod_type = 'b' should that, or shouldn't that, be marked as found. The approach you used eliminated the ability to check for such multiple instances since they were deleted during the sort process.
I can have 101 a b
101 a b
Then with the sort proc comes down to one and if then
101 has nothing else but a b goes to group found
But if let's say I also had 101 m n then it is a different story( does not qualify). So 101 a b
101 a b is OK
But 102 a b
102 a b
102 n m is not. The client must have only a value of b to be in the group found. I hope I make sense. So all of the codes provided here worked but mine did not
Since duplicate values of 'b' are ok, then you could simplify my code to:
proc sql;
create table want as
select *, case
when min(prod_code) eq 'b' and
max(prod_code) eq 'b'
then 'found'
else ''
end as value
from one
group by id
;
quit;
Similarly, your sort and data step would work as follows:
proc sort data=one nodupkey;
by id prod_type prod_code;
run;
data want;
set one;
by id;
if first.id and last.id and prod_code='b' then value='found';
run;
So first.I'd=last.I'd was not right in my code?
But if its only one rec by id then first and last are always=1 which means they are equal as I had it in my code. Any way thanks for your lessons and the codes sir
Your code will fail with data like the following:
data one;
input id prod_type $ prod_code $ ;
cards;
105 a a
105 a b
105 a c
;
Your code will show the second record to be 'found' because first.id=0 and last.id=0 (thus first.id=last.id) and prod_code equals 'b'.
However, your code's logic doesn't match your requirements.
right right i see it now.Thank you!
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.