Hi.,
I have a dataset people with cust_id and landline_no variables
cust_id landline
101 0802471852
102 0405214752
101 0226510020
103 0446125420
i have another dataset state with cust_id and state_code
city std_code
bang 080
hyd 040
mum 022
chn 044
i need the count of customer with respect to std_code as result
std_code cust_cnt
080 2
- -
- -
thanks in advance
Hi,
Not sure about your title (Dataset solution) does that mean it needs to be only using datasteps? If so then you need to do each part separately and merge them back together. Alternatively you could do:
data have;
cust_id="101"; landline="0802471852"; output;
cust_id="106"; landline="0802471852"; output;
cust_id="102"; landline="0405214752"; output;
cust_id="101"; landline="0226510020"; output;
cust_id="103"; landline="0446125420"; output;
run;
data code;
city="bang"; std_code="080"; output;
city="hyd"; std_code="040"; output;
city="mum"; std_code="022"; output;
run;
proc sql;
create table WANT as
select A.CUST_ID,
A.LANDLINE,
B.STD_CODE,
C.COUNT
from HAVE A
left join CODE B
on substr(A.LANDLINE,1,3)=B.STD_CODE
left join (select substr(LANDLINE,1,3) as CDE,
count(1) as COUNT
from HAVE
group by CDE) C
on substr(A.LANDLINE,1,3)=C.CDE;
quit;
is it possible to do using merge
Yes, its just long winded. The logic above applies. Just do each bit separately. E.g. sort have, datastep to create 3 char std_code. Sort codes, then datastep merge that and codes by std_code. With that dataset, sort by std_code, then in a datastep have a retain statement for cont, or do a proc means. The means output N can then be merged back on.
Is std_code equivalent with the first three character of landline ?
data have; cust_id="101"; landline="0802471852"; output; cust_id="106"; landline="0802471852"; output; cust_id="102"; landline="0405214752"; output; cust_id="101"; landline="0226510020"; output; cust_id="103"; landline="0446125420"; output; run; data code; city="bang"; std_code="080"; output; city="hyd"; std_code="040"; output; city="mum"; std_code="022"; output; run; proc freq noprint data=have(rename=(landline=std_code)); tables std_code/out=want(drop=percent) norow nocol nopercent nocum; format std_code $3.; run;
Xia Keshan
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.