BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.

Hi Team,

 

I have PAN INDIA customer shopping table which contains online transaction details as well as store details.

I want to create a new variable for category like customer is online_only, Store_only and unclassified (if shopped by both criteria).

 

Sample dataset :

DATA C;
input CARD_NUM $5. STORE_NAME $7.;
CARDS;
12345 ONLINE
23456 STR_1
98765 ONLINE
98765 STR_NM
98765 STR_5
43214 STR_A
;
Run;

 

Desired output :

12345 ONLINE_ONLY
23456 STORE_ONLY
98765 UNCLASSIFIED
43214 STORE_ONLY
 

 

kindly note : For online transaction, i have only one indicator as 'ONLINE' but for store visit we have n number of store names.

 

Looking for your kind support.

 

Regards,

Uma Shanker Saini

1 ACCEPTED SOLUTION

Accepted Solutions
FreelanceReinh
Jade | Level 19

Hi @umashankersaini,

 

I think you don't need a CASE WHEN statement here. Just use the MAX and CHOOSEC functions:

proc sql;
create table want as
select card_num, choosec(max(store_name='ONLINE')+2*max(store_name ne 'ONLINE'),
                         'ONLINE_ONLY', 'STORE_ONLY', 'UNCLASSIFIED')
                 as categ length=12
from c
group by card_num;
quit;

And, yes, as

@Reeza wrote:

You need to explain the rules of how STORE_NAME will map to your new categories.

 

My code is based on the assumption that anything other than 'ONLINE' (what about spelling variants like 'Online'? Missing values?) indicates "Store". Obviously, you would need to modify the conditions about STORE_NAME if this assumption does not hold in your real data.

 

If you need to preserve the order of CARD_NUM values from dataset C, I'd rather recommend a DATA step solution, e.g.

data want(drop=_: store_name);
do until(last.card_num);
  set c;
  by card_num notsorted;
  if store_name='ONLINE' then _o=1;
  else _s=1;
end;
length categ $12;
categ=choosec(sum(_o,_s,_s), 'ONLINE_ONLY', 'STORE_ONLY', 'UNCLASSIFIED');
run;

Same assumption as above plus the data must be grouped by CARD_NUM.

 

View solution in original post

9 REPLIES 9
tomrvincent
Rhodochrosite | Level 12
calc min and max store name (just 1st 3 letters) per card.
if the min and max are the same, use it for output otherwise 'unclassified'.
Reeza
Super User

You need to explain the rules of how STORE_NAME will map to your new categories.

 


@umashankersaini wrote:

Hi Team,

 

I have PAN INDIA customer shopping table which contains online transaction details as well as store details.

I want to create a new variable for category like customer is online_only, Store_only and unclassified (if shopped by both criteria).

 

Sample dataset :

DATA C;
input CARD_NUM $5. STORE_NAME $7.;
CARDS;
12345 ONLINE
23456 STR_1
98765 ONLINE
98765 STR_NM
98765 STR_5
43214 STR_A
;
Run;

 

Desired output :

12345 ONLINE_ONLY
23456 STORE_ONLY
98765 UNCLASSIFIED
43214 STORE_ONLY
 

 

kindly note : For online transaction, i have only one indicator as 'ONLINE' but for store visit we have n number of store names.

 

Looking for your kind support.

 

Regards,

Uma Shanker Saini


 

 

umashankersaini
Quartz | Level 8

1. If customer is buying product online - then online only

2. If customer is buying by going to physical store - then store only

3. if customer is buying by both ways (online and store visit ) - Then unclassified

 

for online purchase - store_name would be same like 'ONLINE' across the globe

for physical store - Store_name would be any thing without pattern.

FreelanceReinh
Jade | Level 19

Hi @umashankersaini,

 

I think you don't need a CASE WHEN statement here. Just use the MAX and CHOOSEC functions:

proc sql;
create table want as
select card_num, choosec(max(store_name='ONLINE')+2*max(store_name ne 'ONLINE'),
                         'ONLINE_ONLY', 'STORE_ONLY', 'UNCLASSIFIED')
                 as categ length=12
from c
group by card_num;
quit;

And, yes, as

@Reeza wrote:

You need to explain the rules of how STORE_NAME will map to your new categories.

 

My code is based on the assumption that anything other than 'ONLINE' (what about spelling variants like 'Online'? Missing values?) indicates "Store". Obviously, you would need to modify the conditions about STORE_NAME if this assumption does not hold in your real data.

 

If you need to preserve the order of CARD_NUM values from dataset C, I'd rather recommend a DATA step solution, e.g.

data want(drop=_: store_name);
do until(last.card_num);
  set c;
  by card_num notsorted;
  if store_name='ONLINE' then _o=1;
  else _s=1;
end;
length categ $12;
categ=choosec(sum(_o,_s,_s), 'ONLINE_ONLY', 'STORE_ONLY', 'UNCLASSIFIED');
run;

Same assumption as above plus the data must be grouped by CARD_NUM.

 

ballardw
Super User

This works for the provided example data.

 

proc format library=work;
value $store_online
'ONLINE' = 'ONLINE'
other    = 'STORE'
;

DATA C;
input CARD_NUM $5. STORE_NAME $7.;
CARDS;
12345 ONLINE
23456 STR_1
98765 ONLINE
98765 STR_NM
98765 STR_5
43214 STR_A
;
Run;

proc freq data=c noprint;
   tables card_num*store_name/out=cardcount;
   format store_name $store_online.;
run;

proc sql;
   create table want as
   select distinct card_num,
      case when count(*)>1 then 'UNCLASSIFIED'
           When count(*)=1 then put(store_name,$store_online.)
     end as status
   from cardcount
   group by card_num
   ;
quit;

The Format plus Proc freq reduces the possible values of store to 2.Formatted values will be used to create groups in the proc. If the summary output from proc freq for a card has more than one row (count(*)>1) then it should come from combination of store and online.

I didn't make the format use the longer text you show. That is left as an exercise for the interested reader.

anushreebiotech
Obsidian | Level 7

Hi,

 

Please find the code below:

 

proc sql;
Create table want as select distinct Card_Num ,
case when (count(Store_Name) = 1 and lower(Store_Name) like 'str%') then 'Store_Only'
when (count(Store_Name) = 1 and lower(Store_name) ='online') then 'Online'
when (count(Store_Name) >= 2) then 'Unclassified' end as Category
from C
group by Card_Num
Order by Card_Num
;
quit;

 

Output would be :

CARD_NUM Category

12345Online
23456Store_Only
43214Store_Only
98765Unclassified

 

Regards,

Anushree

r_behata
Barite | Level 11
Proc SQl;
create table want as 

	select distinct CARD_NUM, 
	case 
		when 	count( distinct scan(STORE_NAME,1,'_')) > 1
		 then "UNCLASSIFIED"
		 else	cats(scan(STORE_NAME,1,'_'),'_',"ONLY")
	end as class
	from c
	group by CARD_NUM;
Quit;
umashankersaini
Quartz | Level 8

Hi Team, 

Thanks for your quick reply.

 

There is no pattern for store_name, if offline (store visit). we have pattern for online purchase only.

 

I tried with :

 

proc sql;
Create table want (compress=yes) as
Select distinct CARD_NUM
, case when (STORE_NAME) like '%ONLINE%' then 'ONLINE ONLY'
when STORE_NAME not like '%ONLINE%' then 'STORE ONLY'
when STORE_NAME like '%ONLINE%' and STORE_NAME not like '%ONLINE%' then 'UNCLASSIFIED'
End as CHANNEL
From A
Group by 1;
quit;

 

 

This code is not working fine for unclassified category

 

FreelanceReinh
Jade | Level 19

@umashankersaini: Thanks for clarifying the criteria. Have you tried to implement these in my suggested solution? If not, please see below:

proc sql;
create table want as
select card_num, choosec(max(store_name like '%ONLINE%')+2*max(store_name not like '%ONLINE%'),
                         'ONLINE_ONLY', 'STORE_ONLY', 'UNCLASSIFIED')
                 as channel length=12
from c
group by card_num;
quit;

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 9 replies
  • 1720 views
  • 0 likes
  • 7 in conversation