Hi all,
The dataset is:
FY TC
2013 1
2014 5
2013 6
2015 7
2016 1
2015 5
2016 2
2014 2
2013 7
2014 4
2017 5
2018 1
2018 6
2015 4
2014 2
2015 4
The important point about the data above is that there is no data point for TC=3, but I still want a TC=3 column in my output dataset, because I need it for a calculation in a later step. The missing TC=3 is only an illustration for one particular category (e.g. commercial real estate). For other categories a different value might be missing, such as TC=4 (e.g. for residential real estate). So I need a cross table with a frequency column for each of TC=1 to TC=7, regardless of whether any data points exist for that TC value.
I am well aware of PROC REPORT, but it does not create a column for TC=3. I think it can be done using PROC SQL. Please help me here. I prefer PROC SQL or PROC REPORT, as their output can be used easily in a later step.
Not preferred: PROC TABULATE, PROC FREQ
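This can be done easily in PROC SQL with SUM. In SAS a comparison such as TC=3 evaluates to 1 or 0, so summing it counts the matching rows and returns 0 when there are none.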
data sample;
input FY TC;
datalines;
2013 1
2014 5
2013 6
2015 7
2016 1
2015 5
2016 2
2014 2
2013 7
2014 4
2017 5
2018 1
2018 6
2015 4
2014 2
2015 4
;
proc print;
run;
proc sql;
  select FY,
         sum(TC=1) as tc1,
         sum(TC=2) as tc2,
         sum(TC=3) as tc3,
         sum(TC=4) as tc4,
         sum(TC=5) as tc5,
         sum(TC=6) as tc6,
         sum(TC=7) as tc7
  from sample
  group by FY;
quit;
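Since the counts are needed in a later step, the same query can write a dataset instead of only printing a report. A minimal sketch, assuming the sample dataset above; the table name tc_counts is hypothetical:

```sas
proc sql;
  /* CREATE TABLE stores the result as a dataset instead of printing it */
  create table tc_counts as
  select FY,
         sum(TC=1) as tc1,
         sum(TC=2) as tc2,
         sum(TC=3) as tc3,   /* no TC=3 rows in the sample, so this is 0 for every FY */
         sum(TC=4) as tc4,
         sum(TC=5) as tc5,
         sum(TC=6) as tc6,
         sum(TC=7) as tc7
  from sample
  group by FY;
quit;

/* tc_counts can now be read by any later DATA step or PROC */
proc print data=tc_counts;
run;
```

Because every TC column is written out explicitly, the layout stays the same for any category, whichever TC value happens to have no data.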
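/* Formats list every FY and TC value that should appear in the table */
proc format;
  value yrfmt
    2013='2013'
    2014='2014'
    2015='2015'
    2016='2016'
    2017='2017'
    2018='2018'   /* 2018 occurs in the sample data */
  ;
  value tcfmt
    1='1'
    2='2'
    3='3'
    4='4'
    5='5'
    6='6'
    7='7'
  ;
run;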
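/* PRELOADFMT + PRINTMISS keep every formatted FY*TC combination, even with no rows */
proc tabulate data=sample out=counts;
  class FY tc / preloadfmt;
  format tc tcfmt. fy yrfmt.;
  table FY*tc / printmiss;
run;
proc sort data=counts;
  by fy;
run;
/* One row per FY, one column per TC value (tc1-tc7) */
proc transpose data=counts out=counts_t (drop=_name_) prefix=tc;
  by fy;
  var N;
  id tc;
run;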
You can also do this with PROC TABULATE, even though you said it is not your preference. PRELOADFMT gives you all possible combinations of TC and FY, so you get the 0s for TC3 even though it has no data. You can output the data from PROC TABULATE and then sort and transpose it.
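Depending on how the empty cells come through, some tc columns in counts_t may hold missing values rather than 0. A small sketch of a follow-up step that forces them to 0, assuming counts_t contains tc1 through tc7 as produced above; the dataset name counts_filled is hypothetical:

```sas
data counts_filled;
  set counts_t;
  /* Assumes the transposed columns are named tc1-tc7 */
  array tcs{*} tc1-tc7;
  do i = 1 to dim(tcs);
    /* Replace a missing count with an explicit 0 */
    if missing(tcs{i}) then tcs{i} = 0;
  end;
  drop i;
run;
```

If the counts are already 0, the step leaves them unchanged.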
A PROC REPORT solution:
data sample;
input FY TC;
datalines;
2013 1
2014 5
2013 6
2015 7
2016 1
2015 5
2016 2
2014 2
2013 7
2014 4
2017 5
2018 1
2018 6
2015 4
2014 2
2015 4
;
/* Add one dummy row for TC=3 with weight 0 so the column exists without changing the counts */
data intermediate;
  set sample end=eof;
  weight=1;
  output;
  if eof then do;
    tc=3;
    weight=0;
    output;
  end;
run;
/* Print missing cells as 0 */
options missing=0;
proc report data=intermediate;
  columns fy tc,weight;
  define fy/group;
  define tc/across;
  define weight/sum ' ';
  rbreak after/summarize;
run;
I prefer PROC SQL, PROC REPORT as their output can be used easily in a later step. Not preferred: PROC TABULATE, PROC FREQ
All output from any SAS PROC can be used in a later step.
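For example, PROC REPORT accepts an OUT= option that writes the report rows to a dataset. A sketch assuming the intermediate dataset from the PROC REPORT answer above; the name report_counts is hypothetical:

```sas
proc report data=intermediate out=report_counts;
  columns fy tc,weight;
  define fy / group;
  define tc / across;
  define weight / sum ' ';
run;

/* Inspect the dataset that PROC REPORT wrote */
proc print data=report_counts;
run;
```

The across columns in the output dataset get generated names (such as _C2_, _C3_, ...) rather than tc1 to tc7, so they may need renaming before the later step. Note that options missing=0 only changes how missing values are printed; empty cells are still stored as missing in the dataset.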