turn on suggestions

Auto-suggest helps you quickly narrow down your search results by suggesting possible matches as you type.

Showing results for

Find a Community

- Home
- /
- SAS Programming
- /
- SAS Procedures
- /
- simple datastep

Topic Options

- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Printer Friendly Page

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

12-20-2014 06:16 PM

hi all,

I need to find all the records where prod_code='b' and only 'b' which means the record will be single with prod_code='b' and then assign a new var value='found' to that record.

prod code=b is always associated with type=a so i don't need to worry about it

data one;

input id prod_type $ prod_code $ ;

cards;

100 a b

101 a b

101 e s

102 n v

102 k l

103 a b

104 a j

105 a b

106 v p

;

the output table should be something like :

id prod_type prod_code value

100 a b found

101 a b

101 e s

102 n v

102 k l

103 a b found

104 a j

105 a b found

106 v p

I can do this by creating a new dataset and merging it back to the original but i was wondering if this can be solved in one step (using maybe the order by.... having count(*)=1,........)

Any thoughts please?

101 has a code=b but also he has a code=s so he does not qualify

Accepted Solutions

Solution

12-20-2014
06:49 PM

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

12-20-2014 06:49 PM

If I correctly understand your requirements, the following might suffice:

proc sql;

create table want as

select *, case

when min(prod_code) eq 'b' and

max(prod_code) eq 'b' and

count(prod_code eq 'b') eq 1

then 'found'

else ''

end as value

from one

group by id

;

quit;

All Replies

Solution

12-20-2014
06:49 PM

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

12-20-2014 06:49 PM

If I correctly understand your requirements, the following might suffice:

proc sql;

create table want as

select *, case

when min(prod_code) eq 'b' and

max(prod_code) eq 'b' and

count(prod_code eq 'b') eq 1

then 'found'

else ''

end as value

from one

group by id

;

quit;

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

12-20-2014 06:59 PM

One way to go:

data one;

input id prod_type $ prod_code $ ;

cards;

100 a b

100 a b

101 a b

101 e s

102 n v

102 k l

103 a b

104 a j

105 a b

106 v p

;

run;

proc sql;

create table want as

select *,

case

when count(distinct prod_code)=1 and prod_code='b' then 'found'

else ''

end as value

from one

group by id

;

quit;

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

Posted in reply to Patrick

12-20-2014 08:16 PM

they both work

Thank you guys

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

Posted in reply to Patrick

12-21-2014 01:30 AM

" which means the record will be **single** with prod_code='b' "

OP said want only single obs, not sure if your code fit his demand .

data one; input id prod_type $ prod_code $ ; cards; 100 a b 101 a b 101 e s 102 n v 102 k l 103 a b 104 a j 105 a b 106 v p ; run; data want; set one; by id; found=ifc( first.id and last.id and prod_code='b' ,'found',' '); run;

Xia Keshan

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

Posted in reply to Ksharp

12-21-2014 03:05 PM

hey Xia ,

i forgot to mention that first i'd delete duplicates by id, prod_type, prod_code so Patrick code works too.

Thanks for your code too. Pretty simple

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

Posted in reply to Ksharp

12-21-2014 03:11 PM

this is the code i initially used :

proc sort data=one;by id prod_type proc_code nodupkey;run;

data want;

set one;

by id;

if first.id=last.id and prod_code='b' then value='found';

and it did not give me the right output but the it looks right,no?

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

12-21-2014 05:32 PM

Not the way you have it written. In your proc sort nodupkey is misplaced and you've misspelled prod_code.

Plus, your code (once corrected) will assign 'found' when there are actually multiple instances of prod_code eq 'b'.

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

Posted in reply to art297

12-21-2014 05:45 PM

Sorry I had it as prod_code and nodupkey was where it was supposed to be in my code but I guess the first and last.I'd logic wasn't right as you are saying.

But I thought first.I'd=last.I'd was true when I have a single record by id

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

12-21-2014 05:54 PM

The question is: when you have multiple instances of prod_type = 'b' should that, or shouldn't that, be marked as found. The approach you used eliminated the ability to check for such multiple instances since they were deleted during the sort process.

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

Posted in reply to art297

12-21-2014 06:53 PM

I can have 101 a b

101 a b

Then with the sort proc comes down to one and if then

101 has nothing else but a b goes to group found

But if let's say I also had 101 m n then it is a different story( does not qualify). So 101 a b

101 a b is OK

But 102 a b

102 a b

102 n m is not. The client must have only a value of b to be in the group found. I hope I make sense. So all of the codes provided here worked but mine did not

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

12-21-2014 07:03 PM

Since duplicate values of 'b' are ok, then you could simplify my code to:

proc sql;

create table want as

select *, case

when min(prod_code) eq 'b' and

max(prod_code) eq 'b'

then 'found'

else ''

end as value

from one

group by id

;

quit;

Similarly, your sort and data step would work as follows:

proc sort data=one nodupkey;

by id prod_type prod_code;

run;

data want;

set one;

by id;

if first.id and last.id and prod_code='b' then value='found';

run;

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

Posted in reply to art297

12-21-2014 07:17 PM

So first.I'd=last.I'd was not right in my code?

But if its only one rec by id then first and last are always=1 which means they are equal as I had it in my code. Any way thanks for your lessons and the codes sir

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

12-21-2014 08:27 PM

Your code will fail with data like the following:

data one;

input id prod_type $ prod_code $ ;

cards;

105 a a

105 a b

105 a c

;

Your code will show the second record to be 'found' because first.id=0 and last.id=0 (thus first.id=last.id) and prod_code equals 'b'.

However, your code's logic doesn't match your requirements.

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

Posted in reply to art297

12-21-2014 08:53 PM

right right i see it now.Thank you!