Hello Everyone , so , I have This Data Set :
User | Orders | Fam |
2563 | 1542 | 66 |
2563 | 2251 | 66 |
2563 | 1544 | 66 |
2563 | 2222 | 28 |
2563 | 2335 | 28 |
And What I want to get is The Distinct Number of Fam Per User :
So This My Code :
proc sql;
create Table Max_Fam as
select A*. count(distinct(Fam)) as Max_N_Fam
from Base_For_Counts
group by User
;
quit;
And This is What I get :
User | Orders | Fam | Num_Fam |
2563 | 1542 | 66 | 1 |
2563 | 2251 | 66 | 1 |
2563 | 1544 | 66 | 1 |
2563 | 2222 | 28 | 1 |
2563 | 2335 | 28 | 1 |
And What I want Is This Output :
User | Orders | Fam | Num_Fam |
2563 | 1542 | 66 | 1 |
2563 | 2251 | 66 | |
2563 | 1544 | 66 | |
2563 | 2222 | 28 | 1 |
2563 | 2335 | 28 |
Any Idea to get there ?
Any Help would be much appreciated , thank you
Hi @Midi wouldn't the count(distinct Fam) for user=2563 be 2?
Exactly , In this Case Yes.
So would have been nicer to post a correct sample to help folks not get confused
data have;
input User Orders Fam;
cards;
2563 1542 66
2563 2251 66
2563 1544 66
2563 2222 28
2563 2335 28
;
proc sql;
create Table Max_Fam(drop=_user) as
select *,user as _user, count(distinct(Fam)) as Max_N_Fam
from have
group by _User
;
quit;
Thank You , but i already have this output , what i want is for the cells in common to be joined and having the max that "covers" the three of them , basically not have the max for each cell , i don't know if that was clear , thank's again
data have;
input id Orders Fam;
cards;
2563 1542 66
2563 2251 66
2563 1544 66
2563 2222 28
2563 2335 28
;
data temp;
set have;
rownum+1;
run;
proc sql;
create table want(drop=c rownum) as
select *,ifn(min(rownum)=rownum,c,.) as Max_N_Fam
from
(select *,count(distinct fam) as c
from temp
group by id)
group by id,fam
order by id,rownum;
quit;
Use a data step:
data have;
input User :$4. Orders Fam;
datalines;
2563 1542 66
2563 2251 66
2563 1544 66
2563 2222 28
2563 2335 28
;
data want;
count = 0;
do until (last.user);
set have;
by user fam notsorted;
if first.fam then count + 1;
end;
do until (last.user);
set have;
by user fam notsorted;
if first.fam
then num_fam = count;
else num_fam = .;
output;
end;
drop count;
run;
proc print data=want noobs;
run;
Result:
User Orders Fam num_fam 2563 1542 66 2 2563 2251 66 . 2563 1544 66 . 2563 2222 28 2 2563 2335 28 .
SAS is good enough to get what you want in a single step, even if the input data are completely unsorted. Note that below, a few records have been added to the sample input to make it more representative with respect to more than one user and different summary counts per user.
data have ;
input User Orders Fam ;
cards ;
2500 1000 55
2500 2000 55
2500 3000 55
2563 1542 66
2563 2251 66
2563 1544 66
2563 2222 28
2563 2335 28
;
run ;
data want ;
if _n_ = 1 then do ;
dcl hash h () ;
h.definekey ("user") ;
h.definedata ("num_fam", "_n_") ;
h.definedone () ;
dcl hash u () ;
u.definekey ("user", "fam") ;
u.definedone () ;
do until (z) ;
set have end = z ;
if h.find() ne 0 then num_fam = 0 ;
if u.check() ne 0 then do ;
num_fam + 1 ;
u.add() ;
end ;
h.replace() ;
end ;
end ;
set have ;
h.find() ;
if _n_ = 0 then call missing (num_fam) ;
else h.replace (key:user, data:num_fam, data:0) ;
run ;
At first glance, it may look kind of complex but in actuality it's pretty simple:
Of course, if your input data are already sorted by [User,Fam] (or just grouped, as in your sample), it's still much simpler, as @Kurt_Bremser has already indicated (below, his double DoW-loop program is merely presented in a slightly different guise):
data want ;
do _n_ = 1 by 1 until (last.user) ;
set have ;
by user fam notsorted ;
Num_Fam = sum (num_fam, first.fam) ;
end ;
do _n_ = 1 to _n_ ;
set have ;
if _n_ > 1 then call missing (num_fam) ;
output ;
end ;
run ;
Kind regards
Paul D.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.