Re: Count of two separate unique variables between columns to ultimate...

JFranco2 · Posted 02-27-2025 02:22 PM

I would like to see who uses something most. I have a column of the product and I have a column of the user ID - I want to only know who is using it most. The Data Set shows every time the product is used and by whom - I want to use those two columns to calculate one or two new columns (I also have the date) so a column to show the count of the most times used. Any suggestions?

JFranco2 · Posted 02-27-2025 02:29 PM

I know I can do this in a summary table but want to remove the duplicates entries and only keep the highest frequency information

Tom · Posted 02-27-2025 02:48 PM

Perhaps the COUNT() aggregate function of PROC SQL with the DISTINCT keyword to eliminate duplicates?

proc sql;
create table want as
  select groupvar , count(distinct entryvar) as n_entries
  from have
  group by groupvar
  order by n_entries desc 
;
quit;

JFranco2 · Posted 02-27-2025 04:03 PM

Thank you!

PaigeMiller · Posted 02-27-2025 02:30 PM

... to calculate one or two new columns ...

This is pretty vague. What should those one or two new columns contain? Will they be in the same data set or a new data set? How do they relate to measuring highest frequency?

Could we measure highest frequency without creating one or two new columns?

Given your explanation above, it seems like just counting isn't the way you want to get to "highest frequency". Could you elaborate on what you mean by "highest frequency"?

--
Paige Miller

JFranco2 · Posted 02-27-2025 03:13 PM

Actually I think it is just one column.

Column: would have the occurrence ascending by date order. So for instance if I used two vehicles 101 times over the course of a year the new column (using vehicle number, my personal ID number and date columns) would show numbers going from 1-what ever the count is for each vehicle.

PaigeMiller · Posted 02-27-2025 03:22 PM

Maybe this?

proc sort data=have;
    by vehicle_number ID date;
run;

data want;
    set have;
    by vehicle_number id;
    if first.id then count=0;
    count+1;
run;

--
Paige Miller

JFranco2 · Posted 02-27-2025 04:02 PM

thank you!!!

Count of two separate unique variables between columns to ultimately gain the highest frequency

Re: Count of two separate unique variables between columns to ultimately gain the highest frequency

Re: Count of two separate unique variables between columns to ultimately gain the highest frequency

Re: Count of two separate unique variables between columns to ultimately gain the highest frequency

Re: Count of two separate unique variables between columns to ultimately gain the highest frequency

Re: Count of two separate unique variables between columns to ultimately gain the highest frequency

Re: Count of two separate unique variables between columns to ultimately gain the highest frequency

Re: Count of two separate unique variables between columns to ultimately gain the highest frequency

Registration is open

SAS Training: Just a Click Away