I feel like this should be really easy and obvious, but I can't figure it out. Any advice is appreciated!
It starts with just a simple crosstab. I need the output to be a list of values of x (number of events) for each level of y (regions), of which there are 13. There are 4 types of events, and I need a table to summarize each. Occasionally, there are 0 events in a region. However, I need the output to list all 13 regions. Is there some ridiculously simple solution that I've just overlooked? I tried the sparse option, and it didn't work.
proc freq data = have noprint;
table region * num_events / nopercent norow nocol
out = want;
run;
Thank you!!!!
SPARSE/LIST will not work in this case. You need to use PRELOADFMT or CLASSDATA instead.
If the data is not there, SAS cannot know it exists unless you tell it somehow. That 'somehow' is PRELOADFMT or CLASSDATA.
A quick search of those terms on here or lexjansen.com will illustrate fully worked examples that will show you how to modify your code/process.
@kjowers wrote:
Hi,
Attached are my data and syntax files. I've also pasted the syntax below. The issue is that I need the last 2 sheets on the exported .xls file to list all 13 regions, with 0 values as appropriate, rather than excluding the regions with 0 values. I can't seem to get the list/sparse combo to work - it outputs the total sample, not the subsetted samples.
/*** import current month's data ***/ proc import out = have_master datafile = 'W:\Have\have.xls' dbms = excel replace; getnames = yes; run; /*** assign region ***/ data have_1; set have_master; format region 2.; informat region 2.; if county in (2) then region = 1; if county in (30) then region = 2; if county in (3) then region = 3; if county in (16) then region = 4; if county in (15) then region = 5; if county in (13) then region = 6; if county in (1,11,21) then region = 7; if county in (5,9,14,17,20) then region = 8; if county in (19,22,23) then region = 9; if county in (4,8,18) then region = 10; if county in (7,12) then region = 11; if county in (10) then region = 12; if county in (6) then region = 13; run; /*** create output datasets ***/ /** subset variables **/ data have_2; set have_1; format lh 1.; format lc 1.; format ulh 1.; format ulc 1.; if type in ('registered') then lh = 1; if type in ('licensed') then lc = 1; if type in ('illegal') and illegal_type in ('h') then ulh = 1; if type in ('illegal') and illegal_type in ('c') then ulc = 1; run; /** tables **/ proc freq data = have_2; where type in ('registered'); tables lh * region / nopercent norow nocol out = lh; run; proc freq data = have_2; where type in ('licensed'); tables lc * region / nopercent norow nocol out = lc; run; proc freq data = have_2; where type in ('illegal') and illegal_type in ('h'); tables ulh * region / nopercent norow nocol out = ulh; run; proc freq data = have_2; where type in ('illegal') and illegal_type in ('c'); table ulc * region / nopercent norow nocol out = ulc; run; /*** export output to .xls ***/ proc export data = lh outfile = 'W:\Have\want.xls' dbms = excel replace; sheet = 'lh'; proc export data = lc outfile = 'W:\Have\want.xls' dbms = excel replace; sheet = 'lc'; proc export data = ulh outfile = 'W:\Have\want.xls' dbms = excel replace; sheet = 'ulh'; proc export data = ulc outfile = 'W:\Have\want.xls' dbms = excel replace; sheet = 'ulc'; run;
Thank you!!!!
Can you at least show what the result is expected to look like?
It actually is not "ridiculously easy" easy to display values that do not appear in your data.
Depending on what I wanted for appearance I would likely create a custom format to display your region values and then use one of the procedures that can use Preloadfmt option.
Since you are creating an output data set you could use the SPARSE option if you also use the LIST option on the tables statement.
Thanks, @ballardw - at least I don't feel quite so dumb.
Here's what I'm looking for at the minimum:
If the output dataset ends up including percent, col, and row, etc., that's cool. I have a second .xls with formulas reading in the values in the num_events column, ordered by region, summarizing 2 decades' worth of this monthly data by year.
For some reason, when I use list and sparse like so:
proc freq data = have;
where provider_type in ('Illegally-Operating');
table region * num_events / list
out = want sparse;
run;
this is my output.
What gives? 🙂
Thank you!!!
Keri
Computers are dumb. If it's not in the data at all, how is SAS supposed to know it even exists? Obviously you need to tell it in some manner, and the way to specify that is using a FORMAT or a CLASS data set with ALL the levels/combinations needed. If you search PRELOADFMT and PROC TABULATE you'll see many ways of getting this done.
http://support.sas.com/resources/papers/proceedings11/087-2011.pdf
http://support.sas.com/resources/papers/proceedings11/239-2011.pdf
Does NUM_EVENTS really represent the type of event? If not, then what is the right variable name?
Whatever the proper name, it's easier to add the missing categories in later. I can't test this right now, but it should create a shell with all the categories:
proc sql;
create table shell as
select distinct a.region, distinct b.num_events from have a, have b;
quit;
My SQL isn't super strong, but that should work. Then merge it back in with the PROC FREQ results:
proc sort data=shell;
by region num_events;
run;
data really_want;
merge want (in=in1) shell;
by region num_events;
if in1=0 then do;
count=0;
percent=0;
end;
run;
Better post some sample data to explain your question.
1)try SPARSE option
table region * num_events /
sparse ;
2) try ZERO option
weight count/zero ;
Hi @Ksharp,
Thanks for your help!
Here's a sample of my data. There are other variables, but I don't use them here.
When I run:
proc freq data = have;
where type in ('illegal');
table region * num_events_1 / sparse
out = want1;
run;
proc export data = want
outfile = '\want.xls'
dbms = excel replace;
sheet = 'want1';
run;
this is my output:
The only problem is that the values in 'count' are for the entire 'have' dataset, not just the type = 'illegal' subset. Any idea why? I'm not familiar with how 'sparse' works.
Thanks!!!
Please post your code AND data as text, otherwise you're asking us to type out your data and code.
There are instructions here on how to include your data as a data step so it can be used immediately in SAS
Thanks, @Reeza! Will do! That makes total sense, and I can see how that could be a royal pain. I'll attach my syntax and data as files to a previous reply - I'm having a hard time getting the syntax you linked to to work for me (it's definitely me, not the syntax). It looks really cool and helpful, so I'll definitely take more time later to try to figure out how it's working. Thanks again!!!
Hi,
Attached are my data and syntax files. I've also pasted the syntax below. The issue is that I need the last 2 sheets on the exported .xls file to list all 13 regions, with 0 values as appropriate, rather than excluding the regions with 0 values. I can't seem to get the list/sparse combo to work - it outputs the total sample, not the subsetted samples.
/*** import current month's data ***/
proc import out = have_master
datafile = 'W:\Have\have.xls'
dbms = excel replace;
getnames = yes;
run;
/*** assign region ***/
data have_1;
set have_master;
format region 2.;
informat region 2.;
if county in (2) then region = 1;
if county in (30) then region = 2;
if county in (3) then region = 3;
if county in (16) then region = 4;
if county in (15) then region = 5;
if county in (13) then region = 6;
if county in (1,11,21) then region = 7;
if county in (5,9,14,17,20) then region = 8;
if county in (19,22,23) then region = 9;
if county in (4,8,18) then region = 10;
if county in (7,12) then region = 11;
if county in (10) then region = 12;
if county in (6) then region = 13;
run;
/*** create output datasets ***/
/** subset variables **/
data have_2;
set have_1;
format lh 1.;
format lc 1.;
format ulh 1.;
format ulc 1.;
if type in ('registered') then lh = 1;
if type in ('licensed') then lc = 1;
if type in ('illegal') and illegal_type in ('h')
then ulh = 1;
if type in ('illegal') and illegal_type in ('c')
then ulc = 1;
run;
/** tables **/
proc freq data = have_2;
where type in ('registered');
tables lh * region / nopercent norow nocol
out = lh;
run;
proc freq data = have_2;
where type in ('licensed');
tables lc * region / nopercent norow nocol
out = lc;
run;
proc freq data = have_2;
where type in ('illegal') and illegal_type in ('h');
tables ulh * region / nopercent norow nocol
out = ulh;
run;
proc freq data = have_2;
where type in ('illegal') and illegal_type in ('c');
table ulc * region / nopercent norow nocol
out = ulc;
run;
/*** export output to .xls ***/
proc export data = lh
outfile = 'W:\Have\want.xls'
dbms = excel replace;
sheet = 'lh';
proc export data = lc
outfile = 'W:\Have\want.xls'
dbms = excel replace;
sheet = 'lc';
proc export data = ulh
outfile = 'W:\Have\want.xls'
dbms = excel replace;
sheet = 'ulh';
proc export data = ulc
outfile = 'W:\Have\want.xls'
dbms = excel replace;
sheet = 'ulc';
run;
Thank you!!!!
SPARSE/LIST will not work in this case. You need to use PRELOADFMT or CLASSDATA instead.
If the data is not there, SAS cannot know it exists unless you tell it somehow. That 'somehow' is PRELOADFMT or CLASSDATA.
A quick search of those terms on here or lexjansen.com will illustrate fully worked examples that will show you how to modify your code/process.
@kjowers wrote:
Hi,
Attached are my data and syntax files. I've also pasted the syntax below. The issue is that I need the last 2 sheets on the exported .xls file to list all 13 regions, with 0 values as appropriate, rather than excluding the regions with 0 values. I can't seem to get the list/sparse combo to work - it outputs the total sample, not the subsetted samples.
/*** import current month's data ***/ proc import out = have_master datafile = 'W:\Have\have.xls' dbms = excel replace; getnames = yes; run; /*** assign region ***/ data have_1; set have_master; format region 2.; informat region 2.; if county in (2) then region = 1; if county in (30) then region = 2; if county in (3) then region = 3; if county in (16) then region = 4; if county in (15) then region = 5; if county in (13) then region = 6; if county in (1,11,21) then region = 7; if county in (5,9,14,17,20) then region = 8; if county in (19,22,23) then region = 9; if county in (4,8,18) then region = 10; if county in (7,12) then region = 11; if county in (10) then region = 12; if county in (6) then region = 13; run; /*** create output datasets ***/ /** subset variables **/ data have_2; set have_1; format lh 1.; format lc 1.; format ulh 1.; format ulc 1.; if type in ('registered') then lh = 1; if type in ('licensed') then lc = 1; if type in ('illegal') and illegal_type in ('h') then ulh = 1; if type in ('illegal') and illegal_type in ('c') then ulc = 1; run; /** tables **/ proc freq data = have_2; where type in ('registered'); tables lh * region / nopercent norow nocol out = lh; run; proc freq data = have_2; where type in ('licensed'); tables lc * region / nopercent norow nocol out = lc; run; proc freq data = have_2; where type in ('illegal') and illegal_type in ('h'); tables ulh * region / nopercent norow nocol out = ulh; run; proc freq data = have_2; where type in ('illegal') and illegal_type in ('c'); table ulc * region / nopercent norow nocol out = ulc; run; /*** export output to .xls ***/ proc export data = lh outfile = 'W:\Have\want.xls' dbms = excel replace; sheet = 'lh'; proc export data = lc outfile = 'W:\Have\want.xls' dbms = excel replace; sheet = 'lc'; proc export data = ulh outfile = 'W:\Have\want.xls' dbms = excel replace; sheet = 'ulh'; proc export data = ulc outfile = 'W:\Have\want.xls' dbms = excel replace; sheet = 'ulc'; run;
Thank you!!!!
Awesome! Thank you so much!!! I really appreciate your help!
Registration is open! SAS is returning to Vegas for an AI and analytics experience like no other! Whether you're an executive, manager, end user or SAS partner, SAS Innovate is designed for everyone on your team. Register for just $495 by 12/31/2023.
If you are interested in speaking, there is still time to submit a session idea. More details are posted on the website.
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.