Solved: Re: counting obs. that have maximum duplicate records

jimksas · Posted 03-10-2014 07:14 PM

Hi friends, I have a SAS dataset with around half a million records, some of which are duplicates on the field SKU. I want to know which SKU has the most duplicate records and which SKU has the fewest.

Can someone please tell me how I can do this?

Thanks!

stataddict · Posted 03-11-2014 03:01 AM

proc sql;
  /* one row per SKU that occurs more than once, with its record count */
  create table max_min as
  select SKU, count(SKU) as duplicates
  from your_table
  group by SKU
  having duplicates > 1
  order by duplicates desc;
quit;

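This returns only the SKUs that occur more than once; unique SKUs will not show up.

Because the table is sorted in descending order, the first record is the SKU with the most duplicates and the last record is the duplicated SKU with the fewest.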
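
If only the extreme SKUs are wanted rather than the whole sorted list, the query above could be followed by a second step that filters on the largest and smallest counts. The following is a minimal sketch, not part of the original reply. It assumes "minimum" means the smallest count among SKUs that occur more than once, and the demo data and the table names WORK.SKU_COUNTS and WORK.SKU_EXTREMES are hypothetical. It uses count(*), which also counts rows where SKU is missing, whereas count(SKU) ignores them.

```sas
/* Hypothetical demo data standing in for your_table */
data work.your_table;
  input SKU $;
  datalines;
A100
A100
A100
A100
B200
B200
B200
C300
D400
D400
;
run;

proc sql;
  /* Record count per SKU, keeping only SKUs that occur more than once */
  create table work.sku_counts as
  select SKU, count(*) as duplicates
  from work.your_table
  group by SKU
  having count(*) > 1;

  /* Keep the SKUs whose count equals the overall maximum or minimum */
  create table work.sku_extremes as
  select SKU, duplicates
  from work.sku_counts
  where duplicates = (select max(duplicates) from work.sku_counts)
     or duplicates = (select min(duplicates) from work.sku_counts)
  order by duplicates desc;
quit;
```

Ties are kept, so if several SKUs share the highest or lowest count they all appear in WORK.SKU_EXTREMES.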

RW9 · Posted 03-11-2014 05:02 AM

A slight update: with this you can select any number of min/max values. In this example I want the bottom two and top two records:

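proc sql;
  /* count records per TYPE and sort ascending by count */
  create table WORK.WANT as
  select *
  from (
    select TYPE,
           COUNT(TYPE) as COUNT_OF_TYPE
    from SASHELP.CARS
    group by TYPE
  )
  order by COUNT_OF_TYPE;

  /* store the number of rows in WORK.WANT in macro variable NUM_OBS */
  select COUNT(TYPE)
  into :NUM_OBS
  from WORK.WANT;
quit;

/* keep the first 2 rows (lowest counts) and the last 2 rows (highest counts) */
data want;
  set want;
  if _n_ <= 2 or _n_ > (&NUM_OBS. - 2) then output;
run;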
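
As a possible variation on the step above, the row count can be read with the NOBS= option of the SET statement instead of a macro variable, which removes the second SELECT. This is only a sketch under the same assumptions as the reply (SASHELP.CARS as sample data, two rows from each end); the output name WORK.WANT_EXTREMES and the variable name num_obs are hypothetical.

```sas
proc sql;
  /* Same per-TYPE counts as above, sorted ascending by count */
  create table work.want as
  select TYPE, count(TYPE) as COUNT_OF_TYPE
  from sashelp.cars
  group by TYPE
  order by COUNT_OF_TYPE;
quit;

/* NOBS= puts the number of rows in WORK.WANT into a temporary variable,
   which is not written to the output data set */
data work.want_extremes;
  set work.want nobs=num_obs;
  /* subsetting IF: keep the bottom 2 and the top 2 rows */
  if _n_ <= 2 or _n_ > num_obs - 2;
run;
```

Changing the two 2s to another number selects a different count of bottom and top rows. If the table has fewer rows than twice that number, the two conditions overlap and every row is kept.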

TomKari · Posted 03-11-2014 02:21 PM

The query in the first response can also be built in the "Query Builder" in exactly the same way. You can then post-process the result with additional queries, the SORT task, and the RANK task.

Tom
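
For readers working in code rather than with the point-and-click tasks, the SORT and RANK post-processing could look roughly like the sketch below. It is an assumption about how one might do it with PROC RANK and PROC SORT, not the code the tasks generate. It reuses SASHELP.CARS from the earlier reply, and the names WORK.TYPE_COUNTS, WORK.TYPE_RANKED and COUNT_RANK are hypothetical.

```sas
proc sql;
  /* Record count per TYPE */
  create table work.type_counts as
  select TYPE, count(TYPE) as COUNT_OF_TYPE
  from sashelp.cars
  group by TYPE;
quit;

/* DESCENDING gives rank 1 to the largest count;
   TIES=LOW gives tied counts the lowest rank of their group */
proc rank data=work.type_counts out=work.type_ranked descending ties=low;
  var COUNT_OF_TYPE;
  ranks COUNT_RANK;
run;

/* Order the result so the most frequent TYPE comes first */
proc sort data=work.type_ranked;
  by COUNT_RANK;
run;
```

After this, a filter such as COUNT_RANK = 1 picks the most frequent value, ties included.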

jimksas · Posted 03-12-2014 01:16 AM

Thanks Stataddict, RW9 and Tom.

Everything works great!

I also validated the number of observations using PROC REPORT.
