The SAS Output Delivery System and reporting techniques

PROC FREQ -- getting both unformatted and formatted values into CSV file

Reply
Contributor
Posts: 23

PROC FREQ -- getting both unformatted and formatted values into CSV file

We need to get one-way frequency output for several hundred variables into a single SAS data set, so that we can compare counts and percents across different years of data.

I looked at the data set produced by the OneWayFreqs ODS output option. It provides both unformatted and formatted values (e.g. columns for Sex and F_Sex), but the data set is hard to work with for a large number of tables, since it has a separate pair of columns for each tabulated variable.

It then occurred to me that it might be simpler to just route the PROC FREQ output to a CSV file, and it turns out that this output is much more compact and directly usable for my purposes. However, it only contains the formatted variable values, and it would be handy if I could get both formatted and unformatted.

One would think this would be possible, but after spending some time looking at the template code and PROC TEMPLATE code, I sure can't figure out how.
SAS Super FREQ
Posts: 8,742

Re: PROC FREQ -- getting both unformatted and formatted values into CSV file

Hi:
Are you using the format to "collapse" levels, like collapsing AGE=11 and AGE=12 into one category for frequency purposes or are you just formatting individual values??? Or are you just formatting character variables?? It's sort of hard to visualize what you are doing with formats and what you want from your description. What statistics do you want -- do you want the whole 9 yards (freq, percent, cum freq, cum percent as well as the formatted and unformatted variables)???

I would expect that if you were using the format to collapse that it would be virtually impossible to get both the formatted and unformatted values out of PROC FREQ. Can you mock something up with one of the the SASHELP files and a couple of variables to illustrate what you're getting and what you want?

The other issue that I see is that the define block for the Variable column is set to PRINT=OFF (because it inherits from Base.Freq.Variable which is set to PRINT=OFF) -- but when I turn PRINT=ON for the Variable column, I -DO- get the Variable and the FVariable columns in the CSV file, but they contain the same (formatted) values.

cynthia
Contributor
Posts: 23

Re: PROC FREQ -- getting both unformatted and formatted values into CSV file

Sometimes we collapse and sometimes we don't. For the ones where we just relabel without collapsing, the idea is to be able to see both the coded value in the data set, and the formatted value. When we collapse, it would be nice to have a "representative value" that we could use if necessary to get records back in the desired order, although there are ways we could work around that.

Here's a sample:

/* Do a little formatting on copy of sashelp.class for testing */
proc format;
value agef
0-12 = 'Tween'
13-19 = 'Teen'
;
value $sexf
'M' = 'Male'
'F' = 'Female'
;
run;

data tempclass;
set sashelp.class;
format age agef. sex $sexf.;
label
age = 'Age Group'
;
run;

/* Close ODS LISTING, specify ODS output data set and CSV file */
ods listing close;
ods output OneWayFreqs=MyProcFreqOutput;
filename csvout temp;
ods csv file=csvout;

/* Run the tables */
proc freq data=tempclass;
tables _all_ / missing;
run;

/* Close output destinations and reopen LISTING destination */
ods output close;
ods csv close;
ods listing;
Respected Advisor
Posts: 3,777

Re: PROC FREQ -- getting both unformatted and formatted values into CSV file

I'm not sure what you want as final output. I think you want to "normalize" the OneWayFreqs table. This does it in a way that I might want.



[pre]
data maybe;
length Variable $32 Vtype $1 _i_ nValue 8 cValue $50 Formatted $100;

if 0 then set MyProcFreqOutput;
array _c _character_;
array _n _numeric_;

if _n_ eq 1 then do;
declare hash vr();
vr.definekey('Variable','Vtype');
vr.definedata('Variable','Vtype','_i_');
vr.definedone();
do _i_ = 1 to dim(_c);
Variable = vname(_c[_i_]);
Vtype = vtype(_c[_i_]);
rc = vr.add();
end;
do _i_ = 1 to dim(_n);
Variable = vname(_n[_i_]);
Vtype = vtype(_n[_i_]);
rc = vr.add();
end;
rc = vr.output(dataset:'vr'); *not needed;
end;
set MyProcFreqOutput;
length nValue 8 cValue $50 Formatted $100;
Variable = scan(table,-1,' ');
Vtype = vtypeX(Variable);
formatted = vvalueX(Variable);
rc = vr.find();
select(Vtype);
when('N') nValue = _n[_i_];
when('C') cValue = _c[_i_];
end;
keep Variable Vtype nvalue cvalue formatted Freq: Perc: Cum:;
run;
proc print data=vr;
run;
proc print data=maybethis;
run;
[/pre]

Output

[pre]
n Cum Cum
Obs Variable Vtype Value cValue Formatted Frequency Percent Frequency Percent

1 Name C . Alfred Alfred 1 5.26 1 5.26
2 Name C . Alice Alice 1 5.26 2 10.53
3 Name C . Barbara Barbara 1 5.26 3 15.79
4 Name C . Carol Carol 1 5.26 4 21.05
5 Name C . Henry Henry 1 5.26 5 26.32
6 Name C . James James 1 5.26 6 31.58
7 Name C . Jane Jane 1 5.26 7 36.84
8 Name C . Janet Janet 1 5.26 8 42.11
9 Name C . Jeffrey Jeffrey 1 5.26 9 47.37
10 Name C . John John 1 5.26 10 52.63
11 Name C . Joyce Joyce 1 5.26 11 57.89
12 Name C . Judy Judy 1 5.26 12 63.16
13 Name C . Louise Louise 1 5.26 13 68.42
14 Name C . Mary Mary 1 5.26 14 73.68
15 Name C . Philip Philip 1 5.26 15 78.95
16 Name C . Robert Robert 1 5.26 16 84.21
17 Name C . Ronald Ronald 1 5.26 17 89.47
18 Name C . Thomas Thomas 1 5.26 18 94.74
19 Name C . William William 1 5.26 19 100.00
20 Sex C . F Female 9 47.37 9 47.37
21 Sex C . M Male 10 52.63 19 100.00
22 Age N 11.0 Tween 7 36.84 7 36.84
23 Age N 13.0 Teen 12 63.16 19 100.00
24 Height N 51.3 51.3 1 5.26 1 5.26
25 Height N 56.3 56.3 1 5.26 2 10.53
26 Height N 56.5 56.5 1 5.26 3 15.79
27 Height N 57.3 57.3 1 5.26 4 21.05
28 Height N 57.5 57.5 1 5.26 5 26.32
29 Height N 59.0 59 1 5.26 6 31.58
30 Height N 59.8 59.8 1 5.26 7 36.84
31 Height N 62.5 62.5 2 10.53 9 47.37
32 Height N 62.8 62.8 1 5.26 10 52.63
33 Height N 63.5 63.5 1 5.26 11 57.89
34 Height N 64.3 64.3 1 5.26 12 63.16
35 Height N 64.8 64.8 1 5.26 13 68.42
36 Height N 65.3 65.3 1 5.26 14 73.68
37 Height N 66.5 66.5 2 10.53 16 84.21
38 Height N 67.0 67 1 5.26 17 89.47
39 Height N 69.0 69 1 5.26 18 94.74
40 Height N 72.0 72 1 5.26 19 100.00
41 Weight N 50.5 50.5 1 5.26 1 5.26
42 Weight N 77.0 77 1 5.26 2 10.53
43 Weight N 83.0 83 1 5.26 3 15.79
44 Weight N 84.0 84 2 10.53 5 26.32
45 Weight N 84.5 84.5 1 5.26 6 31.58
46 Weight N 85.0 85 1 5.26 7 36.84
47 Weight N 90.0 90 1 5.26 8 42.11
48 Weight N 98.0 98 1 5.26 9 47.37
49 Weight N 99.5 99.5 1 5.26 10 52.63
50 Weight N 102.5 102.5 2 10.53 12 63.16
51 Weight N 112.0 112 2 10.53 14 73.68
52 Weight N 112.5 112.5 2 10.53 16 84.21
53 Weight N 128.0 128 1 5.26 17 89.47
54 Weight N 133.0 133 1 5.26 18 94.74
55 Weight N 150.0 150 1 5.26 19 100.00
[/pre]
Contributor
Posts: 23

Re: PROC FREQ -- getting both unformatted and formatted values into CSV file

Yep -- that's pretty much exactly what I'm looking for. Thanks!

I still can't quite understand why both the unformatted and formatted values are available in the output data set, but I can't get both directly in the other destinations.
SAS Super FREQ
Posts: 8,742

Re: PROC FREQ -- getting both unformatted and formatted values into CSV file

Mike:

When I create an ouput dataset from this code:
[pre]
ods listing;
proc format;
value agef 0-12 = 'tween'
13-high = 'teen';
run;

ods output onewayfreqs=work.testF;
proc freq data=sashelp.class;
tables age height;
format height 8.0 age agef.;
run;

proc print data=work.testF noobs;
run;
[/pre]

I see that the FORMATTED value is used for F_Variable and for Variable, even in the output dataset. Note how F_Age and Age are BOTH the formatted values and that F_Height and Height are BOTH using the format.
[pre]
Cum Cum
Table F_Age Age Frequency Percent Frequency Percent F_Height Height

Table Age tween tween 7 36.84 7 36.84 .
Table Age teen teen 12 63.16 19 100.00 .
Table Height . 1 5.26 1 5.26 51 51
Table Height . 1 5.26 2 10.53 56 56
Table Height . 2 10.53 4 21.05 57 57
Table Height . 1 5.26 5 26.32 58 58
Table Height . 1 5.26 6 31.58 59 59
Table Height . 1 5.26 7 36.84 60 60
Table Height . 3 15.79 10 52.63 63 63
Table Height . 2 10.53 12 63.16 64 64
Table Height . 2 10.53 14 73.68 65 65
Table Height . 3 15.79 17 89.47 67 67
Table Height . 1 5.26 18 94.74 69 69
Table Height . 1 5.26 19 100.00 72 72
[/pre]

So in my mind, it's a moot point whether you can reveal both Variable and F_Variable in the destination. I can change the TABLE template so that you can see both Variable and F_Variable in your ODS output.

But all you're going to see is the formatted value if you use the changed table template. Is that what you want???? I can send you the template-- but it's not going to produce anything close to the solution posted by data_null_.

As for WHY you can get the 2 columns with ODS OUTPUT and not otherwise without changing the table template ... I suspect, but don't know for sure that the PROC FREQ procedure writer had one of those "Surely, NOBODY would ever want to see both F_Variable and Variable in printed output" moments.

cynthia
Contributor
Posts: 23

Re: PROC FREQ -- getting both unformatted and formatted values into CSV file

Cynthia,

With the output data set, the "unformatted" variables (such as Age) actually contain the unformatted values, but with the formats associated with them. You can get the unformatted values simply by removing the format.

I did mess around with the template a bit, getting both columns into my CSV file, but with the formatted values for each. I had a certain amount of success with the following:

[pre]
/* Try modifying the table template */
proc template;
edit Base.Freq.OneWayList;
define Variable;
parent = Base.Freq.Variable;
print = ON;
format = best.;
end;
end;
run;

/* Do a little formatting on copy of sashelp.class for testing */
proc format;
value agef
0-12 = 'Tween'
13-19 = 'Teen'
;
value $sexf
'M' = 'Male'
'F' = 'Female'
;
run;

data tempclass;
set sashelp.class;
format age agef. sex $sexf.;
label
age = 'Age Group'
;
run;

/* Close ODS LISTING, specify ODS tagset */
ods listing close;
filename csvout temp;
ods csv file=csvout;

/* Run the tables */
proc freq data=tempclass;
tables _numeric_ / missing;
run;

/* Close output destinations and reopen LISTING destination */
ods csv close;
ods listing;
[/pre]

This gives me both unformatted and formatted in my CSV. But it only works if I'm only running frequencies for numeric variables. If I include character variables in my TABLES statement, I (not surprisingly) get a message from SAS -- WARNING: Wrong type of format for data type: best.

Now, if there were only a way to specify one format if the underlying column was numeric and a different one for character values ... ;-)
SAS Super FREQ
Posts: 8,742

Re: PROC FREQ -- getting both unformatted and formatted values into CSV file

Hi:
Try FORMAT=_UNDEF_; in your template code. It gets around the issue of character variables. But then it doesn't seem to treat numeric variables the way your solution treats them. If you could live with creating the output dataset and then displaying the output dataset to ODS CSV, that might be an alternate solution.

cynthia

[pre]
/* Try modifying the table template */
ods path work.tmp(update) sasuser.templat(update)
sashelp.tmplmst(read);

proc template;
edit Base.Freq.OneWayList;
define Variable;
parent = Base.Freq.Variable;
print = ON;
format = _UNDEF_;
end;
end;
run;
[/pre]
Contributor
Posts: 23

Re: PROC FREQ -- getting both unformatted and formatted values into CSV file

Yes -- so close and yet so far. _UNDEF_ obviously does "something" -- works great with character variables, and it obviously does something with previously unformatted numerics (the "unformatted" value is now truly numeric). But it still doesn't get rid of the user-written format.

Oh well. At least we have an alternative ... :-)
SAS Super FREQ
Posts: 8,742

Re: PROC FREQ -- getting both unformatted and formatted values into CSV file

Well, you could have 1 template for numeric variables (with BEST) in one item store and then a diff template for character variables (with _UNDEF_) and use a targeted ODS PATH statement to control which template was going to be used ... it would mean 2 proc freqs with an ODS PATH between them.

cynthia
Ask a Question
Discussion stats
  • 9 replies
  • 1901 views
  • 0 likes
  • 3 in conversation