I would like to see who uses something most. I have a column of the product and I have a column of the user ID - I want to only know who is using it most. The Data Set shows every time the product is used and by whom - I want to use those two columns to calculate one or two new columns (I also have the date) so a column to show the count of the most times used. Any suggestions?
I know I can do this in a summary table but want to remove the duplicates entries and only keep the highest frequency information
Perhaps the COUNT() aggregate function of PROC SQL with the DISTINCT keyword to eliminate duplicates?
proc sql;
create table want as
select groupvar , count(distinct entryvar) as n_entries
from have
group by groupvar
order by n_entries desc
;
quit;
Thank you!
... to calculate one or two new columns ...
This is pretty vague. What should those one or two new columns contain? Will they be in the same data set or a new data set? How do they relate to measuring highest frequency?
Could we measure highest frequency without creating one or two new columns?
Given your explanation above, it seems like just counting isn't the way you want to get to "highest frequency". Could you elaborate on what you mean by "highest frequency"?
Actually I think it is just one column.
Column: would have the occurrence ascending by date order. So for instance if I used two vehicles 101 times over the course of a year the new column (using vehicle number, my personal ID number and date columns) would show numbers going from 1-what ever the count is for each vehicle.
Maybe this?
proc sort data=have;
by vehicle_number ID date;
run;
data want;
set have;
by vehicle_number id;
if first.id then count=0;
count+1;
run;
thank you!!!
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Check out this tutorial series to learn how to build your own steps in SAS Studio.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.