Hello everyone,
I have a scenario where I need to find whether the same "reward_program_id1" OR "reward_program_id2" is used in multiple product_id. In the below data product_id is unique however, reward_program_id1 & reward_program_id2 digits can be used interchangeable to purchase product.
For example, product_id 89067 is purchased using reward_program_id2 = 567694(row #1) and the same digits are used as reward_program_id1 to purchase product_id 90873 (row #4). Likewise, product_id 34340 is purchased using reward_program_id1 = 429008(row #3) and the same digits are used as reward_program_id2 to purchase product_id 10983(row #8).
In the output I am trying to add a column "Multi_program_id" Y/N if the same reward ID is used in more than one product. I hope this makes sense, any help is appreciated!
Thanks in advance.
Data have;
input product_id product_delivered $ reward_program_id1 reward_program_id2 ;
infile datalines delimiter=',';
datalines;
89067,Y,223456,567694
58394,N,343412,
34340,Y,429008,412948
90873,N,567694,293832
91028,N,389292,902221
38034,N,459303,343033
76281,N,828292,282929
10983,Y,909033,429008
18738,N,929292,
17930,Y,343412,982932
29383,N,898903,
11283,Y,976409,829737
98203,N,839234,928302
12930,N,918833,303033
78023,Y,123233,412948
;
run;
Output:
Product_id | product_delivered | reward_program_id1 | reward_program_id2 | Multi_program_id |
89067 | Y | 223456 | 567694 | Y |
58394 | N | 343412 | . | Y |
34340 | Y | 429008 | 412948 | Y |
90873 | N | 567694 | 293832 | Y |
91028 | N | 389292 | 902221 | N |
38034 | N | 459303 | 343033 | N |
76281 | N | 828292 | 282929 | N |
10983 | Y | 909033 | 429008 | Y |
18738 | N | 929292 | . | N |
17930 | Y | 343412 | 982932 | Y |
29383 | N | 898903 | . | N |
11283 | Y | 976409 | 829737 | N |
98203 | N | 839234 | 928302 | N |
12930 | N | 918833 | 303033 | N |
78023 | Y | 123233 | 412948 | Y |
Hi @vicky07
This an interesting problem, because the caunt of same reward-id on a product is taken from one or other both of the columns reward_program_id1 and reward_program_id2. So it ended up a bit more complicated than I thought from start. But I think it works.
Data have;
input product_id product_delivered $ reward_program_id1 reward_program_id2 ;
infile datalines delimiter=',';
datalines;
89067,Y,223456,567694
58394,N,343412,
34340,Y,429008,412948
90873,N,567694,293832
91028,N,389292,902221
38034,N,459303,343033
76281,N,828292,282929
10983,Y,909033,429008
18738,N,929292,
17930,Y,343412,982932
29383,N,898903,
11283,Y,976409,829737
98203,N,839234,928302
12930,N,918833,303033
78023,Y,123233,412948
;
run;
* Collapse reward_program_id1 and reward_program_id2 til one column without misings;
* order on reward_program_id to use in following data step;
proc sql;
create table rewardproduct as
select distinct
product_id,
reward_program_id
from (
select
product_id,
reward_program_id1 as reward_program_id
from have
where reward_program_id1 ne .
union
select
product_id,
reward_program_id2 as reward_program_id
from have
where reward_program_id2 ne .
)
order by
reward_program_id,
product_id;
quit;
* Count products per program_id and set code for multiple products pr. reward_program_id;
data multi;
set rewardproduct;
by reward_program_id;
if (first.reward_program_id and last.reward_program_id) then multi = 0;
else multi = 1;
run;
* Set highest value of products pr. reward_program_id as multivalue on product;
proc sql;
create table multi2 as
select distinct
product_id,
ifc(max(multi)=1,'Y','N') as Multi_Program_id
from multi
group by product_id;
quit;
* Join to get final result;
proc sql;
create table want as
select distinct
a.product_id,
a.product_delivered,
a.reward_program_id1,
a.reward_program_id2,
b.Multi_Program_id
from have as a
left join multi2 as b
on a.product_id = b.product_id;
quit;
Hi @vicky07
This an interesting problem, because the caunt of same reward-id on a product is taken from one or other both of the columns reward_program_id1 and reward_program_id2. So it ended up a bit more complicated than I thought from start. But I think it works.
Data have;
input product_id product_delivered $ reward_program_id1 reward_program_id2 ;
infile datalines delimiter=',';
datalines;
89067,Y,223456,567694
58394,N,343412,
34340,Y,429008,412948
90873,N,567694,293832
91028,N,389292,902221
38034,N,459303,343033
76281,N,828292,282929
10983,Y,909033,429008
18738,N,929292,
17930,Y,343412,982932
29383,N,898903,
11283,Y,976409,829737
98203,N,839234,928302
12930,N,918833,303033
78023,Y,123233,412948
;
run;
* Collapse reward_program_id1 and reward_program_id2 til one column without misings;
* order on reward_program_id to use in following data step;
proc sql;
create table rewardproduct as
select distinct
product_id,
reward_program_id
from (
select
product_id,
reward_program_id1 as reward_program_id
from have
where reward_program_id1 ne .
union
select
product_id,
reward_program_id2 as reward_program_id
from have
where reward_program_id2 ne .
)
order by
reward_program_id,
product_id;
quit;
* Count products per program_id and set code for multiple products pr. reward_program_id;
data multi;
set rewardproduct;
by reward_program_id;
if (first.reward_program_id and last.reward_program_id) then multi = 0;
else multi = 1;
run;
* Set highest value of products pr. reward_program_id as multivalue on product;
proc sql;
create table multi2 as
select distinct
product_id,
ifc(max(multi)=1,'Y','N') as Multi_Program_id
from multi
group by product_id;
quit;
* Join to get final result;
proc sql;
create table want as
select distinct
a.product_id,
a.product_delivered,
a.reward_program_id1,
a.reward_program_id2,
b.Multi_Program_id
from have as a
left join multi2 as b
on a.product_id = b.product_id;
quit;
I was going to post a similar approach. Is there another way to make the multi flag other than first and last variables? I didn't post my solution because I was not finished but I wanted to get a count of reward by product_id and then if it was ge 2 then it was in more than one product. Would something close to that work?
Catch the best of SAS Innovate 2025 — anytime, anywhere. Stream powerful keynotes, real-world demos, and game-changing insights from the world’s leading data and AI minds.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.