I want to identify the duplicate %age as per month Starting from Jan until Dec.
I could find the duplicates and put in separate data set by the following code:
proc sort data = pankaj.UM_yield_detail_mm
dupout= Duplicate_NODUPS
nodupkey ;
by batch_id ;
run ;Basically, I want to have duplicate count and %age of duplicates based on the original data set which contains Variable "Batch_ID". if in case I don't want to create another other data set. Is it possible or I have to create another data set.
| Batch_ID |
| 8809510976 |
| 8807226488 |
| 8809932795 |
| 8810161136 |
| 8810161136 |
| 8810172306 |
| 8809510752 |
| 8809436613 |
| 8809510734 |
| 8809510734 |
| 8810207111 |
| 8809510734 |
| 8810160901 |
| 8810207111 |
| 8809724861 |
| 8807221194 |
| 8807221200 |
| 8807221205 |
| 8807221217 |
| 8807221217 |
| 8807221217 |
| 8807221217 |
| 8807221200 |
| 8807221217 |
| 8807221200 |
| 8810160901 |
| 8810160901 |
Hi - here's a simple head-start to get a count of each batch_id.
Question - why the aversion to creating an additional data set?
data dsn;
infile cards;
input Batch_ID;
cards;
8809510976
8807226488
8809932795
8810161136
8810161136
8810172306
8809510752
8809436613
8809510734
8809510734
8810207111
8809510734
8810160901
8810207111
8809724861
8807221194
8807221200
8807221205
8807221217
8807221217
8807221217
8807221217
8807221200
8807221217
8807221200
8810160901
8810160901
run;
proc means nway;
class batch_id;
output out=work.stats;
run;
Yes, I didn't create a another data set now. I was trying to not to overwrite the results.
But now, its working and I have the %age of duplicates as well.
proc sql;
create table duplicates_&p2 as
select (count(batch_id)- count(distinct(batch_id)))/count(batch_id)
as Duplicate format percent8.2,
month(datepart(Ptime_build)) as Month,
year(datepart(Ptime_Build)) as Year
from TABLE NAME
where "&Date_Range_Min."dt <= PTIME_BUILD <="&Date_Range_Max"dt and PTIME_BUILD is not missing
group by Month, Year;
quit;
April 27 – 30 | Gaylord Texan | Grapevine, Texas
Walk in ready to learn. Walk out ready to deliver. This is the data and AI conference you can't afford to miss.
Register now and lock in 2025 pricing—just $495!
Check out this tutorial series to learn how to build your own steps in SAS Studio.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.