I want to summarize data in the following way. I explored Proc Means but couldn't figured out an option to do this.
Data have,
Company Year Country of operations
A 2000 GBR
A 2000 GBR
A 2000 USA
A 2001 FRA
A 2002 USA
A 2002 KSA
A 2002 RUS
A 2002 KSA
A 2003
B 2000
Data Want
Company Year Number of Countries represented
A 2000 2
A 2001 1
A 2002 3
A 2003
B 2000
Thanks,
If you insist on proc means you would need some numeric dummy variable:
Data Have;
Input Company $ Year Country_of_operations $;
If not Missing (Country_of_operations) Then Flag=1;
Datalines;
A 2000 GBR
A 2000 GBR
A 2000 USA
A 2001 FRA
A 2002 USA
A 2002 KSA
A 2002 RUS
A 2002 KSA
A 2003 .
B 2000 .
;
Proc Sort Data=Have Nodupkey;
By Company Year Country_of_operations;
Run;
Proc Means Data=Have NoPrint;
Label Flag='Number of ..';
By Company Year;
Var Flag;
Output Out=Want (Drop=_: ) Sum=;
Run;
None of the standard procs do a distinct count as far as I know, two ways of accomplishin this is two proc freqs or a proc sql.
proc freq data=have;
table company*year*country/out=p1;
run;
proc freq data=p1;
table company*year/out=want1;
run;
proc sql;
create table want2 as
select company, year, count(distinct country) as count
from have
group by company, year
order by company, year;
quit;
I think this is what you want but your output doesn't match. If this is wrong explain how you came up with the numbers:
proc sql;
create table want as
select company,year,count(country) as number_of_countries
from have
group by company,year;
Steelers_In_DC - I want to count unique countries. Hence, in 2002 KSA is counted once.
If you insist on proc means you would need some numeric dummy variable:
Data Have;
Input Company $ Year Country_of_operations $;
If not Missing (Country_of_operations) Then Flag=1;
Datalines;
A 2000 GBR
A 2000 GBR
A 2000 USA
A 2001 FRA
A 2002 USA
A 2002 KSA
A 2002 RUS
A 2002 KSA
A 2003 .
B 2000 .
;
Proc Sort Data=Have Nodupkey;
By Company Year Country_of_operations;
Run;
Proc Means Data=Have NoPrint;
Label Flag='Number of ..';
By Company Year;
Var Flag;
Output Out=Want (Drop=_: ) Sum=;
Run;
Registration is open! SAS is returning to Vegas for an AI and analytics experience like no other! Whether you're an executive, manager, end user or SAS partner, SAS Innovate is designed for everyone on your team. Register for just $495 by 12/31/2023.
If you are interested in speaking, there is still time to submit a session idea. More details are posted on the website.
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.