HI All,
I have a dataset which has sold items in seperate lines against the same order number. I want to write a script which will look for "bundles" or groups of items and then create a variable with a 1 in it.
For example
got
Order no.-Item -Bundle A
12312 01
12312 02
In the above I would want bundle A to be 1 based on the criteria that an order contains a certain combination of items.
I know there is a way to transpose the data but I'm sure you can loop the dataset instead?
Any ideas?
Thanks guys
I have two suggestions, one with a proc sql and a subquery, the other with proc freq and a datastep merge back:
data have;
infile datalines dlm=',';
input orderid :$5. product :$20.;
datalines;
1234,1
1234,2
1232,5
;
run;
proc sql;
create table want as
select
a.*,
case
when (select count(distinct b.product) from have b where b.orderid = a.orderid) > 1
then '1'
else '0'
end as bundle
from have a
;
quit;
proc freq data=have noprint;
tables orderid /out=want1 (keep=orderid count);
run;
proc sort data=have;
by orderid;
run;
data want2;
merge
have
want1
;
by orderid;
if count > 1
then bundle = '1';
else bundle = '0';
bundle_num = count > 1;
drop count;
run;
Unsing a numeric flag variable makes later use in conditions easier (0 or . = false, everything else = true)
Please post your example data in a way that makes it recognizable as a SAS dataset.
no.-Item and -Bundle both are invalid as SAS names.
So take the macro from https://communities.sas.com/t5/SAS-Communities-Library/How-to-create-a-data-step-version-of-your-dat... to convert your dataset to a datastep and post that here according to https://communities.sas.com/t5/help/faqpage/faq-category-id/posting?nobounce, "How can I add SAS syntax to a post? "
Hi Kurt,
Thanks for your response. Here's the code of a sample dataset:
data have;
input orderid $5. product $20. bundle $2.;
datalines;
1234,1
1234,2
1232,5
;
run;
data want;
input orderid $5. product $20. bundle $2.;
datalines;
1234,1,1
1234,2,1
1232,5,0
;
run;
In the above those lines where product = 1 and product = 2 beside the same order number would have bundle = 1?
Does that make sense?
Cheers,
Jordan
I have two suggestions, one with a proc sql and a subquery, the other with proc freq and a datastep merge back:
data have;
infile datalines dlm=',';
input orderid :$5. product :$20.;
datalines;
1234,1
1234,2
1232,5
;
run;
proc sql;
create table want as
select
a.*,
case
when (select count(distinct b.product) from have b where b.orderid = a.orderid) > 1
then '1'
else '0'
end as bundle
from have a
;
quit;
proc freq data=have noprint;
tables orderid /out=want1 (keep=orderid count);
run;
proc sort data=have;
by orderid;
run;
data want2;
merge
have
want1
;
by orderid;
if count > 1
then bundle = '1';
else bundle = '0';
bundle_num = count > 1;
drop count;
run;
Unsing a numeric flag variable makes later use in conditions easier (0 or . = false, everything else = true)
Hi Kurt,
Some great examples there. I'm afraid I'm going to throw in a curve ball; what if my product ID variable is character so we can't work with the numerical logic. For example
data have;
input orderid $5. product $20. bundle $2.;
datalines;
1234 1-01
1233 2-02
1232 5
;
run;
data want;
input orderid $5. product $20. bundle $2.;
datalines;
1234 1-01 1
1233 2-02 1
1232 5-01 0
;
run;
Look at the codes again. product was already character, and both methods make no assumptions for data types.
So it does; All Hail Kurt!
Thanks for your expert help, incredibly appreciated
Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.
Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.