So each line of data I have has two numerical variables and one categorical variable.
I would like to have a report where the row and column to be range of the numerical variables and in each cell will have freq of categorical variables.
For example if I have data:
35000 3 Sold
10000 5 Lease
21000 1 Lease
21000 1 Sold
9000 0 Sold
18000 1 In
36000 12 Sold
25000 4 Lease
17000 2 Sold
I want the final format to look like:
Where the first does frequency for all the status and second only for one status.
This is what I have so far:
data cars;
input Price Age Status$;
cards;
35000 3 Sold
10000 5 Lease
21000 1 Lease
21000 1 Sold
9000 0 Sold
18000 1 In
36000 12 Sold
25000 4 Lease
17000 2 Sold
;
run;
proc tabulate data=cars;
class Price Age Status;
tables Price,
Age*Status*pctn<Status>={label="%"};
run;
If anything is unclear, please let me know and I'll try my best to clarify.
I won't be back on the community until tomorrow morning.
Thanks!
@Luciferene wrote:
Some formatting questions:
How would I go about removing the rows that say Status and PctN?
On those key words in the TABLE statement set them to ' '
TABLES Price,
Age*Status*(pctn<Status>=''*FORMAT=mypct.);
On the reverse, is it possible for only the SOLD column to show and others to not show even when there are values. I think that would basically get me the second table I wanted.
Yes, but that should happen automatically - for example in your table the OLD doesn't have an IN group.
Yes it is. Thanks for the guidance, applying formats gives information I wanted for the first table.
DATA cars;
INPUT Price Age Status$;
CARDS;
35000 3 Sold
10000 5 Lease
21000 1 Lease
21000 1 Sold
9000 0 Sold
18000 1 In
36000 12 Sold
25000 4 Lease
17000 2 Sold
;
RUN;
PROC FORMAT;
VALUE ageformat 0 = "New"
1 - 3 = "Mid"
4 - high = "Old";
VALUE priceformat 0 - 10000 = "Up to $10,000"
10001 - 20000 = "$10,001 - $20,000"
20001 - 30000 = "$20,001 - $30,000"
30001 - high = "$30,000+";
PICTURE mypct (ROUND) low-high='009.99%';
RUN;
PROC TABULATE DATA=cars;
CLASS Price Age Status;
FORMAT Age ageformat.;
FORMAT Price priceformat.;
TABLES Price,
Age*Status*(pctn<Status>*FORMAT=mypct.);
RUN;
Some formatting questions:
How would I go about removing the rows that say Status and PctN?
Is it possible for columns that aren't populated to show up as 0%? For example, only SOLD status shows up for NEW, how can I get IN and LEASE to show up even though there aren't any data that fit in that category.
On the reverse, is it possible for only the SOLD column to show and others to not show even when there are values. I think that would basically get me the second table I wanted.
Thanks!
@Luciferene wrote:
Some formatting questions:
How would I go about removing the rows that say Status and PctN?
On those key words in the TABLE statement set them to ' '
TABLES Price,
Age*Status*(pctn<Status>=''*FORMAT=mypct.);
On the reverse, is it possible for only the SOLD column to show and others to not show even when there are values. I think that would basically get me the second table I wanted.
Yes, but that should happen automatically - for example in your table the OLD doesn't have an IN group.
Thanks! I got it to work with the help you gave me.
SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.
Find more tutorials on the SAS Users YouTube channel.