Hi,
I am looking for help or some guidance to automate my simple code using macro variables and possible the best method a macro. I am doing some basic research and would like to cut out the manual element of my sas code.
The below is all test data I have just created. The actual dataset layout I have is a carbon copy of the below. The way the data is stored for preffered sports is not ideal but what has been set.
I am looking to find the number of people and total value who likes a specific combination of sports. The combination should be based off the table 'interest. E.g from the list, football and basketball, football and darts, darts and basketball. This is then to be used in the code which I have named as 'summary'. Rather than manually entering the different combinations, is there a way for a macro to do this work for me? My actual dataset will contain more than 3 variables to be used for the combination but I have limited this to 3 in the test example below.
Thanks
Jacko
data research;
length name $10 Value 8. Preferred_Sports $100;
infile datalines dsd;
input Name $ Value Preferred_Sports $;
datalines;
Jude,32,"Football,Basketball,Tennis,Darts"
Jack,39,"Hockey,Darts,Judo"
Harry,20,"Basketball,Tennis,Hockey"
Tom,8,"Badminton,Tennis,Darts"
Boris,14,"Badminton,Hockey,Judo"
Pier,18,"Judo"
Michael,12,"Darts,Polo,Tennis"
Roger,49,"Hockey,Darts,Judo"
Felix,1,"Football,Basketball,Tennis,Darts"
;
data interest;
infile datalines dsd;
input Sport $;
datalines;
Football
Basketball
Darts
;
proc sql;
create table summary as
select
count (distinct name) as people,
sum (value) as total_value
from research
where Preferred_Sports contains 'Football'
and Preferred_Sports contains 'Basketball'
;quit;
want:
Football = &Football
Basketball = &Basketball
Darts = &Darts
automation so that the below code will go through the different options above. e.g. where preffered sports contains football and preffered sport contain Basketball, where preffered sports contains football and preffered sport contain darts, where preffered sports contains basketball and preffered sport contain darts etc
proc sql;
create table summary as
select
count (distinct name) as people,
sum (value) as total_value
from research
where Preferred_Sports contains '&Football'
and Preferred_Sports contains '&Basketball'
;quit;
Hi @JackoNewbie
You can create a macro program to do this, with two mandatory arguments (sport n°1 and sport n°2) and an optional one which is the output dataset name.
One way to avoid many macro call is two automatize that through call execute.
Best,
data research;
length name $10 Value 8. Preferred_Sports $100;
infile datalines dsd;
input Name $ Value Preferred_Sports $;
datalines;
Jude,32,"Football,Basketball,Tennis,Darts"
Jack,39,"Hockey,Darts,Judo,Football"
Harry,20,"Basketball,Tennis,Hockey"
Tom,8,"Badminton,Tennis,Darts,Basketball"
Boris,14,"Badminton,Hockey,Judo"
Pier,18,"Judo"
Michael,12,"Darts,Polo,Tennis"
Roger,49,"Hockey,Darts,Judo"
Felix,1,"Football,Basketball,Tennis,Darts"
;
data interest;
input Sport:$20.;
datalines;
Football
Basketball
Darts
;
%macro sql_couple (couple1, couple2, dataset=summary);
proc sql;
create table &dataset. as
select
count (distinct name) as people,
sum (value) as total_value
from research
where Preferred_Sports contains "&couple1."
and Preferred_Sports contains "&couple2.";
quit;
proc print;
%mend;
%sql_couple (Football, Basketball)
/***** To create a dataset for all combinations (avoid many macro calls)*/
/* Create a list of all couples */
proc transpose data=interest out=interest_tr (drop=_:) prefix=sport;
var sport;
run;
data interest_couples;
set interest_tr;
length couple $ 200;
array f(*) sport:;
do i=1 to dim(f);
do j=i to dim(f);
if f(i) ne f(j) then do;
couple1 = f(i);
couple2 = f(j);
couple = catx(", ",f(i), f(j));
count+1;
dst=cats('table',count);
if not missing(couple) then output;
end;
end;
end;
run;
proc sort data=interest_couples out=interest_couples_list (keep=couple1 couple2 dst) nodupkey;
by couple;
run;
/* Call execute */
data _null_;
set interest_couples_list;
call execute (cats('%sql_couple (',couple1,',',couple2,', dataset=',dst,')'));
run;
Hi,
how about approach with Proc Tabulate:
data research;
length name $10 Value 8. Preferred_Sports $100;
infile datalines dsd;
input Name $ Value Preferred_Sports $;
datalines;
Jude,32,"Football,Basketball,Tennis,Darts"
Jack,39,"Hockey,Darts,Judo"
Harry,20,"Basketball,Tennis,Hockey"
Tom,8,"Badminton,Tennis,Darts"
Boris,14,"Badminton,Hockey,Judo"
Pier,18,"Judo"
Michael,12,"Darts,Polo,Tennis"
Roger,49,"Hockey,Darts,Judo"
Felix,1,"Football,Basketball,Tennis,Darts"
;
run;
data interest;
infile datalines dsd;
input Sport $;
datalines;
Football
Basketball
Darts
;
run;
proc sql noprint;
select Sport
into :Sports separated by " "
from interest
;
select Sport
into :SportStar separated by "*"
from interest
;
quit;
data research2;
set research;
array SPORTS &Sports.;
Preferred_Sports = upcase(Preferred_Sports);
do over SPORTS;
SPORTS = find(Preferred_Sports, upcase(vname(SPORTS)))&1;
end;
run;
proc print;
run;
proc tabulate data = research2 out = outputResults;
class &Sports.;
table &SportStar., n=" ";
run;
?
All the best
Bart
Hi @JackoNewbie
You can create a macro program to do this, with two mandatory arguments (sport n°1 and sport n°2) and an optional one which is the output dataset name.
One way to avoid many macro call is two automatize that through call execute.
Best,
data research;
length name $10 Value 8. Preferred_Sports $100;
infile datalines dsd;
input Name $ Value Preferred_Sports $;
datalines;
Jude,32,"Football,Basketball,Tennis,Darts"
Jack,39,"Hockey,Darts,Judo,Football"
Harry,20,"Basketball,Tennis,Hockey"
Tom,8,"Badminton,Tennis,Darts,Basketball"
Boris,14,"Badminton,Hockey,Judo"
Pier,18,"Judo"
Michael,12,"Darts,Polo,Tennis"
Roger,49,"Hockey,Darts,Judo"
Felix,1,"Football,Basketball,Tennis,Darts"
;
data interest;
input Sport:$20.;
datalines;
Football
Basketball
Darts
;
%macro sql_couple (couple1, couple2, dataset=summary);
proc sql;
create table &dataset. as
select
count (distinct name) as people,
sum (value) as total_value
from research
where Preferred_Sports contains "&couple1."
and Preferred_Sports contains "&couple2.";
quit;
proc print;
%mend;
%sql_couple (Football, Basketball)
/***** To create a dataset for all combinations (avoid many macro calls)*/
/* Create a list of all couples */
proc transpose data=interest out=interest_tr (drop=_:) prefix=sport;
var sport;
run;
data interest_couples;
set interest_tr;
length couple $ 200;
array f(*) sport:;
do i=1 to dim(f);
do j=i to dim(f);
if f(i) ne f(j) then do;
couple1 = f(i);
couple2 = f(j);
couple = catx(", ",f(i), f(j));
count+1;
dst=cats('table',count);
if not missing(couple) then output;
end;
end;
end;
run;
proc sort data=interest_couples out=interest_couples_list (keep=couple1 couple2 dst) nodupkey;
by couple;
run;
/* Call execute */
data _null_;
set interest_couples_list;
call execute (cats('%sql_couple (',couple1,',',couple2,', dataset=',dst,')'));
run;
This is great. Thank you so much! 🙂
Can I cheekily ask how I could combine all the tables together within the code where it is not using proc print? If I added more than 3 variables it could lead to >50 tables etc.
Jacko
data research;
length name $10 Value 8. Preferred_Sports $100 Sport $12;
infile datalines dsd;
input Name $ Value Preferred_Sports $;
do i = 1 to countc(Preferred_Sports,',');
Sport = scan(Preferred_Sports,i,',');
output;
end;
datalines;
Jude,32,"Football,Basketball,Tennis,Darts"
Jack,39,"Hockey,Darts,Judo"
Harry,20,"Basketball,Tennis,Hockey"
Tom,8,"Badminton,Tennis,Darts"
Boris,14,"Badminton,Hockey,Judo"
Pier,18,"Judo"
Michael,12,"Darts,Polo,Tennis"
Roger,49,"Hockey,Darts,Judo"
Felix,1,"Football,Basketball,Tennis,Darts"
;
data interest;
length Sport $12;
infile datalines dsd;
input Sport $;
datalines;
Football
Basketball
Darts
;
proc sql;
create table other as
select Name,Value,a.Sport
from
research a
innerjoin
interest b
on a.sport = b.sport
order by Name Value;
data Other(keep=people total_value);
set Other end=done;
by Name Value;
retain people 0 total_value 0;
if first.Name then do;
people=people+1;
total+value=total_value+Value;
end;
if done then output;
run;
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.