Hi there,
I am wondering what the best way is to get this kind of analysis?
Lets assume if my data looks like:
ClientID | Transaction
001 | A1
001 | A1
001 | A2
001 | A3
002 | A4
002 | A5
003 | A6
003 | A6
Result would be:
Client 001 has 3 unique transactions
Client 002 has 2 unique transactions
Client 003 has 1 unique transactions
Thanks!
proc sql;
create table want as
select client_id,count(distinct transactions) as tot_trans
from have
group by client_id
order by client_id;
quit;
If your data is very big then the above code will take much time as i have included count(distinct transactions)...so SQL processor has to make two passes through the data in order to eliminate the duplicates...
In that case, just try with following one...
proc sql;
create table want as
select distinct client_id, transactions
from have;
quit;
proc freq data = want noprint;
table client_id / out = want(drop = percent);
run;
Choice is yours based on the size of data...
-Urvish
data a;
input id $ Trans $;
cards;
...
Proc SQL ;
Select id, count(Distinct(Trans)) as UnqTrans from A Group by ID; Quit;
Hi Vomer,
This is another process by Proc Freq through sorting
| Data want; | |
| input id trans $; | |
| datalines; | |
| 001 A1 | |
| 001 A1 | |
| 001 A2 | |
| 001 A3 | |
| 002 A4 | |
| 002 A5 | |
| 003 A6 | |
| 003 A6 | |
| ; | |
| proc sort data=want nodup; | |
| By id trans; | |
| run; | |
| Proc freq data=want; | |
table id / nopct nocum; Run; | |
run;
proc sql;
create table want as
select client_id,count(distinct transactions) as tot_trans
from have
group by client_id
order by client_id;
quit;
If your data is very big then the above code will take much time as i have included count(distinct transactions)...so SQL processor has to make two passes through the data in order to eliminate the duplicates...
In that case, just try with following one...
proc sql;
create table want as
select distinct client_id, transactions
from have;
quit;
proc freq data = want noprint;
table client_id / out = want(drop = percent);
run;
Choice is yours based on the size of data...
-Urvish
April 27 – 30 | Gaylord Texan | Grapevine, Texas
Walk in ready to learn. Walk out ready to deliver. This is the data and AI conference you can't afford to miss.
Register now and lock in 2025 pricing—just $495!
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.