I have high and low risk buckets of fees generated from 20 form types we have used over the years to charge customers for different services. Some forms are over two years old making them high risk for auditors. Fees change so revenue changes from year to year and we can process everything fast enough so there is a backlog. Forms less than two years or 1.5 years to the present are low risk. By employing a user defined format I assign a date interval as high risk and low risk for receipts and they get put into risk buckets accordingly, these are character variables. I have a coulmn for each form with high or low risk assigned as a character string.
PROBLEM: The fee column next door to the user defined character high or low risk columns is a numeric variable: fee.
When I go to put these in buckets in proc means the risk date format gets turned into number and anyhow I can't put fees in buckets for time interval risk. I need each user defined high or low risk record to attach to the fee amount and head for the right risk bucket.
Here's what I gathered from your description. Let me know if this is accurate:
1. You have created a user defined format which displays SAS date values as "High Risk" or "Low Risk".
2. You have a series of column pairs: dates and fees. Are both these column types numeric? They probably should be.
3. You want to be able to use PROC MEANS (aka the summary statistics task) to summarize by "High Risk" and "Low Risk".
In the above case, you'll use your date column as a class variable, format it with the format you created, and use your fee column as the analysis variable. Here's some code I wrote, which hopefully is similar to what you're working with:
Thanks, sorry it took awhile to get back. But the date column is in SAS date format as a number. I use a query assigning a user defined format to maildate. This creates a date interval for high and low risk.
The result is a colum with each record assigned as high or low risk in character. When I go to filter the column I get a SAS date number for the record at the same time I get the high or low risk lable.
If I were to make them both numeric (I assume that the SAS date is not numeric) how can I get them to be the same?
Ah, it sounds like you really want to be able to filter on "High Risk" or "Low Risk". As you discovered, the filter will be processed against the unformatted date values, so the simple stuff won't work. What you could do is the following in an advanced filter:
The PUT function converts numeric to character, among its many charms. You could alternately create a brand new column, say Risk, and as the expression use put(date,bucket.). That would let you have a phyiscal column in the data for filtering, exporting, etc.