HI All,
I have a dataset which has sold items in seperate lines against the same order number. I want to write a script which will look for "bundles" or groups of items and then create a variable with a 1 in it.
For example
got
Order no.-Item -Bundle A
12312 01
12312 02
In the above I would want bundle A to be 1 based on the criteria that an order contains a certain combination of items.
I know there is a way to transpose the data but I'm sure you can loop the dataset instead?
Any ideas?
Thanks guys
I have two suggestions, one with a proc sql and a subquery, the other with proc freq and a datastep merge back:
data have;
infile datalines dlm=',';
input orderid :$5. product :$20.;
datalines;
1234,1
1234,2
1232,5
;
run;
proc sql;
create table want as
select
a.*,
case
when (select count(distinct b.product) from have b where b.orderid = a.orderid) > 1
then '1'
else '0'
end as bundle
from have a
;
quit;
proc freq data=have noprint;
tables orderid /out=want1 (keep=orderid count);
run;
proc sort data=have;
by orderid;
run;
data want2;
merge
have
want1
;
by orderid;
if count > 1
then bundle = '1';
else bundle = '0';
bundle_num = count > 1;
drop count;
run;
Unsing a numeric flag variable makes later use in conditions easier (0 or . = false, everything else = true)
Please post your example data in a way that makes it recognizable as a SAS dataset.
no.-Item and -Bundle both are invalid as SAS names.
So take the macro from https://communities.sas.com/t5/SAS-Communities-Library/How-to-create-a-data-step-version-of-your-dat... to convert your dataset to a datastep and post that here according to https://communities.sas.com/t5/help/faqpage/faq-category-id/posting?nobounce, "How can I add SAS syntax to a post? "
Hi Kurt,
Thanks for your response. Here's the code of a sample dataset:
data have;
input orderid $5. product $20. bundle $2.;
datalines;
1234,1
1234,2
1232,5
;
run;
data want;
input orderid $5. product $20. bundle $2.;
datalines;
1234,1,1
1234,2,1
1232,5,0
;
run;
In the above those lines where product = 1 and product = 2 beside the same order number would have bundle = 1?
Does that make sense?
Cheers,
Jordan
I have two suggestions, one with a proc sql and a subquery, the other with proc freq and a datastep merge back:
data have;
infile datalines dlm=',';
input orderid :$5. product :$20.;
datalines;
1234,1
1234,2
1232,5
;
run;
proc sql;
create table want as
select
a.*,
case
when (select count(distinct b.product) from have b where b.orderid = a.orderid) > 1
then '1'
else '0'
end as bundle
from have a
;
quit;
proc freq data=have noprint;
tables orderid /out=want1 (keep=orderid count);
run;
proc sort data=have;
by orderid;
run;
data want2;
merge
have
want1
;
by orderid;
if count > 1
then bundle = '1';
else bundle = '0';
bundle_num = count > 1;
drop count;
run;
Unsing a numeric flag variable makes later use in conditions easier (0 or . = false, everything else = true)
Hi Kurt,
Some great examples there. I'm afraid I'm going to throw in a curve ball; what if my product ID variable is character so we can't work with the numerical logic. For example
data have;
input orderid $5. product $20. bundle $2.;
datalines;
1234 1-01
1233 2-02
1232 5
;
run;
data want;
input orderid $5. product $20. bundle $2.;
datalines;
1234 1-01 1
1233 2-02 1
1232 5-01 0
;
run;
Look at the codes again. product was already character, and both methods make no assumptions for data types.
So it does; All Hail Kurt!
Thanks for your expert help, incredibly appreciated
Available on demand!
Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.