Solved: Need help with coding

vicky07 · Posted 06-15-2021 10:21 AM

Hello everyone,

I have a scenario where I need to find whether the same "reward_program_id1" OR "reward_program_id2" is used in multiple product_id. In the below data product_id is unique however, reward_program_id1 & reward_program_id2 digits can be used interchangeable to purchase product.

For example, product_id 89067 is purchased using reward_program_id2 = 567694(row #1) and the same digits are used as reward_program_id1 to purchase product_id 90873 (row #4). Likewise, product_id 34340 is purchased using reward_program_id1 = 429008(row #3) and the same digits are used as reward_program_id2 to purchase product_id 10983(row #8).

In the output I am trying to add a column "Multi_program_id" Y/N if the same reward ID is used in more than one product. I hope this makes sense, any help is appreciated!

Thanks in advance.

Data have;
input product_id product_delivered $ reward_program_id1 reward_program_id2 ;
infile datalines delimiter=',';
datalines;
89067,Y,223456,567694
58394,N,343412,
34340,Y,429008,412948
90873,N,567694,293832
91028,N,389292,902221
38034,N,459303,343033
76281,N,828292,282929
10983,Y,909033,429008
18738,N,929292,
17930,Y,343412,982932
29383,N,898903,
11283,Y,976409,829737
98203,N,839234,928302
12930,N,918833,303033
78023,Y,123233,412948
;
run;

Output:

Product_id	product_delivered	reward_program_id1	reward_program_id2	Multi_program_id
89067	Y	223456	567694	Y
58394	N	343412	.	Y
34340	Y	429008	412948	Y
90873	N	567694	293832	Y
91028	N	389292	902221	N
38034	N	459303	343033	N
76281	N	828292	282929	N
10983	Y	909033	429008	Y
18738	N	929292	.	N
17930	Y	343412	982932	Y
29383	N	898903	.	N
11283	Y	976409	829737	N
98203	N	839234	928302	N
12930	N	918833	303033	N
78023	Y	123233	412948	Y

ErikLund_Jensen · Posted 06-15-2021 01:04 PM

Hi @vicky07

This an interesting problem, because the caunt of same reward-id on a product is taken from one or other both of the columns reward_program_id1 and reward_program_id2. So it ended up a bit more complicated than I thought from start. But I think it works.

Data have;
input product_id product_delivered $ reward_program_id1 reward_program_id2 ;
infile datalines delimiter=',';
datalines;
89067,Y,223456,567694
58394,N,343412,
34340,Y,429008,412948
90873,N,567694,293832
91028,N,389292,902221
38034,N,459303,343033
76281,N,828292,282929
10983,Y,909033,429008
18738,N,929292,
17930,Y,343412,982932
29383,N,898903,
11283,Y,976409,829737
98203,N,839234,928302
12930,N,918833,303033
78023,Y,123233,412948
;
run;


* Collapse reward_program_id1 and reward_program_id2 til one column without misings;
* order on reward_program_id to use in following data step;
proc sql;
	create table rewardproduct as
		select distinct 
			product_id, 
			reward_program_id
		from (
		 	select 
				product_id, 
				reward_program_id1 as reward_program_id
			from have
			where reward_program_id1 ne .
			union
	 		select 
				product_id, 
				reward_program_id2 as reward_program_id
			from have
			where reward_program_id2 ne .
		)
		order by 
			reward_program_id, 
			product_id;
quit;

* Count products per program_id and set code for multiple products pr. reward_program_id;
data multi;
	set rewardproduct;
	by reward_program_id;
	if (first.reward_program_id and last.reward_program_id) then multi = 0;
	else multi = 1;
run;

* Set highest value of products pr. reward_program_id as multivalue on product;
proc sql;
	create table multi2 as
		select distinct 
			product_id, 
			ifc(max(multi)=1,'Y','N') as Multi_Program_id
		from multi
		group by product_id;
quit;

* Join to get final result;
proc sql;
	create table want as 
		select distinct 
			a.product_id, 
			a.product_delivered, 
			a.reward_program_id1, 
			a.reward_program_id2, 
			b.Multi_Program_id
		from have as a
		left join multi2 as b
		on a.product_id = b.product_id;
quit;

View solution in original post

ErikLund_Jensen · Posted 06-15-2021 01:04 PM

Hi @vicky07

This an interesting problem, because the caunt of same reward-id on a product is taken from one or other both of the columns reward_program_id1 and reward_program_id2. So it ended up a bit more complicated than I thought from start. But I think it works.

Data have;
input product_id product_delivered $ reward_program_id1 reward_program_id2 ;
infile datalines delimiter=',';
datalines;
89067,Y,223456,567694
58394,N,343412,
34340,Y,429008,412948
90873,N,567694,293832
91028,N,389292,902221
38034,N,459303,343033
76281,N,828292,282929
10983,Y,909033,429008
18738,N,929292,
17930,Y,343412,982932
29383,N,898903,
11283,Y,976409,829737
98203,N,839234,928302
12930,N,918833,303033
78023,Y,123233,412948
;
run;


* Collapse reward_program_id1 and reward_program_id2 til one column without misings;
* order on reward_program_id to use in following data step;
proc sql;
	create table rewardproduct as
		select distinct 
			product_id, 
			reward_program_id
		from (
		 	select 
				product_id, 
				reward_program_id1 as reward_program_id
			from have
			where reward_program_id1 ne .
			union
	 		select 
				product_id, 
				reward_program_id2 as reward_program_id
			from have
			where reward_program_id2 ne .
		)
		order by 
			reward_program_id, 
			product_id;
quit;

* Count products per program_id and set code for multiple products pr. reward_program_id;
data multi;
	set rewardproduct;
	by reward_program_id;
	if (first.reward_program_id and last.reward_program_id) then multi = 0;
	else multi = 1;
run;

* Set highest value of products pr. reward_program_id as multivalue on product;
proc sql;
	create table multi2 as
		select distinct 
			product_id, 
			ifc(max(multi)=1,'Y','N') as Multi_Program_id
		from multi
		group by product_id;
quit;

* Join to get final result;
proc sql;
	create table want as 
		select distinct 
			a.product_id, 
			a.product_delivered, 
			a.reward_program_id1, 
			a.reward_program_id2, 
			b.Multi_Program_id
		from have as a
		left join multi2 as b
		on a.product_id = b.product_id;
quit;

tarheel13 · Posted 06-15-2021 02:58 PM

I was going to post a similar approach. Is there another way to make the multi flag other than first and last variables? I didn't post my solution because I was not finished but I wanted to get a count of reward by product_id and then if it was ge 2 then it was in more than one product. Would something close to that work?

vicky07 · Posted 06-16-2021 12:12 AM

It worked. Thank you very much!!

Need help with coding

Re: Need help with coding

Re: Need help with coding

Re: Need help with coding

Re: Need help with coding

Need help with coding

Re: Need help with coding

Re: Need help with coding

Re: Need help with coding

Re: Need help with coding

Ready to join fellow brilliant minds for the SAS Hackathon?

Click image to register for webinar

Classroom Training Available!