I want to summarize data in the following way. I explored Proc Means but couldn't figured out an option to do this.
Data have,
Company Year Country of operations
A 2000 GBR
A 2000 GBR
A 2000 USA
A 2001 FRA
A 2002 USA
A 2002 KSA
A 2002 RUS
A 2002 KSA
A 2003
B 2000
Data Want
Company Year Number of Countries represented
A 2000 2
A 2001 1
A 2002 3
A 2003
B 2000
Thanks,
If you insist on proc means you would need some numeric dummy variable:
Data Have;
Input Company $ Year Country_of_operations $;
If not Missing (Country_of_operations) Then Flag=1;
Datalines;
A 2000 GBR
A 2000 GBR
A 2000 USA
A 2001 FRA
A 2002 USA
A 2002 KSA
A 2002 RUS
A 2002 KSA
A 2003 .
B 2000 .
;
Proc Sort Data=Have Nodupkey;
By Company Year Country_of_operations;
Run;
Proc Means Data=Have NoPrint;
Label Flag='Number of ..';
By Company Year;
Var Flag;
Output Out=Want (Drop=_: ) Sum=;
Run;
None of the standard procs do a distinct count as far as I know, two ways of accomplishin this is two proc freqs or a proc sql.
proc freq data=have;
table company*year*country/out=p1;
run;
proc freq data=p1;
table company*year/out=want1;
run;
proc sql;
create table want2 as
select company, year, count(distinct country) as count
from have
group by company, year
order by company, year;
quit;
I think this is what you want but your output doesn't match. If this is wrong explain how you came up with the numbers:
proc sql;
create table want as
select company,year,count(country) as number_of_countries
from have
group by company,year;
Steelers_In_DC - I want to count unique countries. Hence, in 2002 KSA is counted once.
If you insist on proc means you would need some numeric dummy variable:
Data Have;
Input Company $ Year Country_of_operations $;
If not Missing (Country_of_operations) Then Flag=1;
Datalines;
A 2000 GBR
A 2000 GBR
A 2000 USA
A 2001 FRA
A 2002 USA
A 2002 KSA
A 2002 RUS
A 2002 KSA
A 2003 .
B 2000 .
;
Proc Sort Data=Have Nodupkey;
By Company Year Country_of_operations;
Run;
Proc Means Data=Have NoPrint;
Label Flag='Number of ..';
By Company Year;
Var Flag;
Output Out=Want (Drop=_: ) Sum=;
Run;
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.