Solved: Re: Proc Means - Variable already exists on file

emikea · Posted 03-25-2014 09:36 PM

Hello. I frequently use Proc Means and the following code, which was working, is now giving me the following warning:

NOTE: Variable MARKET_EQUITY_RATIO_PD__StdDev already exists on file DSTATS.MARKET_BLOOMBERG_DAILY_C_D, using MARKET_EQUITY_RATIO_PD__StdDev2 instead.

I have several variables that begin with that prefix (MARKET_EQUITY_RATIO_PD), and each one is giving me this note. However, it is only happening with the STDDEV variables that it is naming automatically. My variables aren't more than 23 characters long (before appending _STDDEV).

Any thoughts on what might be causing this problem?

%MACRO CREATE_DISTRIBUTION_DATA(INLIB = , DATASET_NAME = , FIELDS = );

PROC MEANS NOPRINT DATA = &INLIB..&DATASET_NAME.;

CLASS TICKER;

VAR &FIELDS.;

OUTPUT OUT = DSTATS.&DATASET_NAME._D

N(&FIELDS.) =

NMISS(&FIELDS.)=

P50(&FIELDS.)=

MEAN(&FIELDS.)=

P10(&FIELDS.)=

P25(&FIELDS.)=

P75(&FIELDS.)=

P90(&FIELDS.)=

STD(&FIELDS.)=

MAX(&FIELDS.)=

MIN(&FIELDS.)=

/AUTOLABEL AUTONAME;

RUN;

%MEND CREATE_DISTRIBUTION_DATA;

emikea · Posted 03-26-2014 08:15 AM

I found that when I shrunk the name of "Market_Equity_Ratio..." variables to "Mkt_Equity_Ratio...", the code works. Even though the original fields, when appended with _STDDEV, wouldn't be 32 characters long (which I thought was the limit), the fields were getting truncated. Anyway, it works with a shorter field name now. Perhaps there is some overhead when autoname is applied that is reserved for possible duplicate field names (even though I didn't have any in this case).

Thanks for taking a look.

View solution in original post

Reeza · Posted 03-25-2014 10:34 PM

Can you post a proc contents on your original dataset?

emikea · Posted 03-25-2014 11:00 PM

Yes- here is the proc contents output.

Reeza · Posted 03-25-2014 11:03 PM

I don't have SAS at home so can't view that...

emikea · Posted 03-25-2014 11:12 PM

Sorry. Here is an Excel file.

Reeza · Posted 03-26-2014 12:52 AM

Double check the fields in your list, that will happen if the field is listed twice.

emikea · Posted 03-26-2014 07:01 AM

I don't see any duplicates, unfortunately. And if so, wouldn't that happen for all the variables that are created automatically, and not just the STDDEV ones?

data_null__ · Posted 03-26-2014 07:23 AM

Why can't you just show us &FIELDS.

data_null__ · Posted 03-26-2014 07:36 AM

Notice that the last letter of your variables are being removed before _STDDEV is appended for the new name. So it is not duplicates created by you but by PROC MEANS. Now how do we fix it.

55         data class;
56            set sashelp.class;
57            rename age=MARKET_EQUITY_RATIO_PD_D
58                   weight=MARKET_EQUITY_RATIO_PD_M
59                   height=MARKET_EQUITY_RATIO_PD_Q;
60            run;

NOTE: There were 19 observations read from the data set SASHELP.CLASS.
NOTE: The data set WORK.CLASS has 19 observations and 5 variables.
NOTE: DATA statement used (Total process time):
      real time           0.01 seconds
      cpu time            0.01 seconds
      

61         %CREATE_DISTRIBUTION_DATA(INLIB=work,DATASET_NAME=class,FIELDS=MARKET_EQUITY_RATIO_PD_D
62         MARKET_EQUITY_RATIO_PD_M
63         MARKET_EQUITY_RATIO_PD_Q);
MPRINT(CREATE_DISTRIBUTION_DATA):   PROC MEANS NOPRINT DATA = work.class;
MPRINT(CREATE_DISTRIBUTION_DATA):   *CLASS TICKER;
MPRINT(CREATE_DISTRIBUTION_DATA):   VAR MARKET_EQUITY_RATIO_PD_D MARKET_EQUITY_RATIO_PD_M MARKET_EQUITY_RATIO_PD_Q;
MPRINT(CREATE_DISTRIBUTION_DATA):   OUTPUT OUT = class_D N(MARKET_EQUITY_RATIO_PD_D MARKET_EQUITY_RATIO_PD_M 
MARKET_EQUITY_RATIO_PD_Q) = NMISS(MARKET_EQUITY_RATIO_PD_D MARKET_EQUITY_RATIO_PD_M MARKET_EQUITY_RATIO_PD_Q)= 
P50(MARKET_EQUITY_RATIO_PD_D MARKET_EQUITY_RATIO_PD_M MARKET_EQUITY_RATIO_PD_Q)= MEAN(MARKET_EQUITY_RATIO_PD_D 
MARKET_EQUITY_RATIO_PD_M MARKET_EQUITY_RATIO_PD_Q)= P10(MARKET_EQUITY_RATIO_PD_D MARKET_EQUITY_RATIO_PD_M 
MARKET_EQUITY_RATIO_PD_Q)= P25(MARKET_EQUITY_RATIO_PD_D MARKET_EQUITY_RATIO_PD_M MARKET_EQUITY_RATIO_PD_Q)= 
P75(MARKET_EQUITY_RATIO_PD_D MARKET_EQUITY_RATIO_PD_M MARKET_EQUITY_RATIO_PD_Q)= P90(MARKET_EQUITY_RATIO_PD_D 
MARKET_EQUITY_RATIO_PD_M MARKET_EQUITY_RATIO_PD_Q)= STDdev(MARKET_EQUITY_RATIO_PD_D MARKET_EQUITY_RATIO_PD_M 
MARKET_EQUITY_RATIO_PD_Q)= MAX(MARKET_EQUITY_RATIO_PD_D MARKET_EQUITY_RATIO_PD_M MARKET_EQUITY_RATIO_PD_Q)= 
MIN(MARKET_EQUITY_RATIO_PD_D MARKET_EQUITY_RATIO_PD_M MARKET_EQUITY_RATIO_PD_Q)= /AUTOLABEL AUTONAME;
MPRINT(CREATE_DISTRIBUTION_DATA):   RUN;

NOTE: Variable MARKET_EQUITY_RATIO_PD__StdDev already exists on file WORK.CLASS_D, using MARKET_EQUITY_RATIO_PD__StdDev2 instead.
NOTE: Variable MARKET_EQUITY_RATIO_PD__StdDev already exists on file WORK.CLASS_D, using MARKET_EQUITY_RATIO_PD__StdDev3 instead.
NOTE: There were 19 observations read from the data set WORK.CLASS.
NOTE: The data set WORK.CLASS_D has 1 observations and 35 variables.
NOTE: PROCEDURE MEANS used (Total process time):
      real time           0.02 seconds
      cpu time            0.02 seconds

emikea · Posted 03-26-2014 08:15 AM

I found that when I shrunk the name of "Market_Equity_Ratio..." variables to "Mkt_Equity_Ratio...", the code works. Even though the original fields, when appended with _STDDEV, wouldn't be 32 characters long (which I thought was the limit), the fields were getting truncated. Anyway, it works with a shorter field name now. Perhaps there is some overhead when autoname is applied that is reserved for possible duplicate field names (even though I didn't have any in this case).

Thanks for taking a look.

data_null__ · Posted 03-26-2014 08:16 AM

Seems to be statistics with names that are 6 characters long. I would look to another output option perhaps ODS OUTPUT with the STACKODS proc statement option..

NOTE: Variable MARKET_EQUITY_RATIO_PD__SumWgt already exists on file WORK.CLASS_D, using MARKET_EQUITY_RATIO_PD__SumWgt2 instead.
NOTE: Variable market_equity_ratio_pd__SumWgt already exists on file WORK.CLASS_D, using market_equity_ratio_pd__SumWgt3 instead.
NOTE: Variable MARKET_EQUITY_RATIO_PD__QRange already exists on file WORK.CLASS_D, using MARKET_EQUITY_RATIO_PD__QRange2 instead.
NOTE: Variable market_equity_ratio_pd__QRange already exists on file WORK.CLASS_D, using market_equity_ratio_pd__QRange3 instead.
NOTE: Variable MARKET_EQUITY_RATIO_PD__Median already exists on file WORK.CLASS_D, using MARKET_EQUITY_RATIO_PD__Median2 instead.
NOTE: Variable market_equity_ratio_pd__Median already exists on file WORK.CLASS_D, using market_equity_ratio_pd__Median3 instead.
NOTE: Variable MARKET_EQUITY_RATIO_PD__StdDev already exists on file WORK.CLASS_D, using MARKET_EQUITY_RATIO_PD__StdDev2 instead.
NOTE: Variable market_equity_ratio_pd__StdDev already exists on file WORK.CLASS_D, using market_equity_ratio_pd__StdDev3 instead.
NOTE: Variable MARKET_EQUITY_RATIO_PD__StdErr already exists on file WORK.CLASS_D, using MARKET_EQUITY_RATIO_PD__StdErr2 instead.
NOTE: Variable market_equity_ratio_pd__StdErr already exists on file WORK.CLASS_D, using market_equity_ratio_pd__StdErr3 instead.

Reeza · Posted 03-26-2014 10:21 AM

I've seen this before, where the proc takes only 20 or so characters for the name and then truncates it. There's an option to fix it but I can't remember what it is

Peter_C · Posted 03-27-2014 04:48 PM

Myra first pointed out this difficulty of /AUTONAME creating non-unique names when the stat-name is appended.

There are two SASGF presentations on %BETTER_MEANS macro approaches.

1 used a OUTPUT statement for each statistic and "merged" them after.

2 renamed with VARNUM which is short enough to suffer the widest statistic name appended.

Should be quick to find at www.lexjansen.com

Registration is open

SAS Training: Just a Click Away