Help using Base SAS procedures

Subsetting a dataset output from Proc Tabulate

Reply
Occasional Contributor
Posts: 19

Subsetting a dataset output from Proc Tabulate

I have a dataset that contains electricity output from a variety of different types of fuels, and I'm grouping them into renewable and non-renewable energy by using a format in PROC TABULATE. I also output the table to a SAS dataset (I've tried this using both the OUT option and ODS). I would like to be able to then subset the table using a data step so I keep only the non-renewable energy and perform a calculation using one of the pieces of data in the output dataset. However, when I try to use a WHERE statement with the name I used for the format "non-renewable," SAS doesn't find any records. Apparently the Tabulate procedure outputs the data using the original source names then applies the format to the output dataset. So when I was looking at it in the results, I could see the records marked non-renewable, but the underlying data still reflects the original source names. That's why I can't find any records under non-renewable, because there is no "non-renewable" source name in the dataset. So that's kind of a long-winded way of asking if there's a way to output the data in such a way that the outputted fields reflect the format labels and not the original data. Thanks in advance for your help!

Super User
Posts: 5,082

Re: Subsetting a dataset output from Proc Tabulate

The data set itself will always contain the unformatted values.  Given your description of what you are trying to accomplish, the easiest way to subset might be to subset the output as part of PROC TABULATE:

... out=my_output_dataset (where=(put(variable, fmtname.)='non-renewable'));

You will still have to figure out what values are in the data set, but at least you'll be working with the proper subset.

Good luck.

Occasional Contributor
Posts: 19

Re: Subsetting a dataset output from Proc Tabulate

Thanks for your answer. I think I may just add another variable to the dataset using a data step and an if statement that includes all of the energy sources that I want in renewable/non-renewable. Then I can use PROC TABULATE to group them by the new variable.

Ask a Question
Discussion stats
  • 2 replies
  • 303 views
  • 1 like
  • 2 in conversation