That is a NOTE not an ERROR so your import will still work. Please post your complete SAS log to confirm.
PROC IMPORT DATAFILE= "/home/u63997444/ECON 348/Master_use_edited_again2.csv"
OUT=Master_d
DBMS=csv
REPLACE;
GETNAMES=Yes;
RUN;
proc contents data=Master_d;
run;
proc means data=Master_d;
run;
data capsizedata;
set Master_d;
/* Assigning capsize based on Firm */
if 1 <= Firm <= 10 then capsize = "mega";
else if 11 <= Firm <= 20 then capsize = "large";
else if 21 <= Firm <= 30 then capsize = "mid";
else if 31 <= Firm <= 40 then capsize = "small";
else if 41 <= Firm <= 50 then capsize = "micro";
else capsize = "unknown"; /* Default case for capsize if Firm is not in expected range */
/* Convert capsize into numeric dummy variables for regression */
if capsize = "mega" then capsize_mega = 1; else capsize_mega = 0;
if capsize = "large" then capsize_large = 1; else capsize_large = 0;
if capsize = "mid" then capsize_mid = 1; else capsize_mid = 0;
if capsize = "small" then capsize_small = 1; else capsize_small = 0;
if capsize = "micro" then capsize_micro = 1; else capsize_micro = 0;
data capsizedata;
set Master_d;
/* Assigning capsizeadjust based on MarketCap */
if MarketCap >= 2 then capsizeadjust = "mega_ad";
else if 10000000000 <= MarketCap <= 200000000000 then capsizeadjust = "large_ad";
else if 2000000000 <= MarketCap <= 10000000000 then capsizeadjust = "mid_ad";
else if 300000000 <= MarketCap <= 2000000000 then capsizeadjust = "small_ad";
else if 50000000 <= MarketCap <= 300000000 then capsizeadjust = "micro_ad";
else capsizeadjust = "unknown"; /* Default case for capsizeadjust if MarketCap is not in expected range */
run;
proc reg data=Master_d;
Model Price= MarketCap;
run;
proc reg data=capsizedata;
model Price =capsize_mega capsize_large capsize_mid capsize_small capsize_micro;
run;
proc print data=Master_d (obs=10);
run;
proc print data=capsizedata (obs=10);
run;
Your errors start at this step:
143
144  proc reg data=Master_d;
145  Model Price= MarketCap;
ERROR: Variable PRICE not found.
ERROR: Variable MARKETCAP not found.
NOTE: The previous statement has been deleted.
146  run;
My suspicion is that the variables are _Price and _MarketCap because there were leading spaces on the imported column headings you are using as variable names. What does the PROC CONTENTS of the dataset report?
I agree, PROC CONTENTS looks OK. Does a rerun give you the same error?
One thing you will learn to watch for is leading blanks in unexpected places. Most output destinations, HTML, PDF, and RTF especially, may not display them. Please run this code as an example:
data junk;
  var = "  abc"; output;  /* two leading spaces */
  var = " abc"; output;   /* one leading space */
  var = "abc"; output;    /* no leading space */
run;
proc freq data=junk;
  tables var;
run;
What does your output look like?
Mine:
var | Frequency | Percent | Cumulative Frequency | Cumulative Percent |
---|---|---|---|---|
abc | 1 | 33.33 | 1 | 33.33 |
abc | 1 | 33.33 | 2 | 66.67 |
abc | 1 | 33.33 | 3 | 100.00 |
NONE of the leading spaces in the values show, but PROC FREQ (and other procedures) count the different values separately.
The same happens with Proc Contents and the name literals.
data stupidnames;
  ' name1'n='Boy';
run;
proc contents data=stupidnames;
run;
with output in part again not showing the leading spaces.
Alphabetic List of Variables and Attributes
# | Variable | Type | Len |
---|---|---|---|
1 | name1 | Char | 3 |
Name literals were added to SAS to make it easier to reference variables from other data sources, such as external databases, that allow characters in names beyond the letters, digits, and underscore that SAS uses natively.
That highlights one of the biggest problems with ODS output: it "eats" leading spaces.
If you run the same code and send the results to the normal OUTPUT (what they now call the LISTING destination) then you will see the leading spaces.
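A minimal sketch of that comparison, assuming your SAS interface lets you open the LISTING destination (some web-based interfaces may handle it differently). It rebuilds the small test data set and sends the PROC FREQ table to plain-text output:

```sas
/* Rebuild the test data: three values that differ only in leading blanks */
data junk;
  var = "  abc"; output;  /* two leading spaces */
  var = " abc"; output;   /* one leading space */
  var = "abc"; output;    /* no leading space */
run;

ods listing;              /* open the plain-text LISTING destination */
proc freq data=junk;
  tables var;
run;
ods listing close;        /* close it again when done */
```

In the plain-text results the three values should appear with different indentation, which the HTML table hides.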
Your real problem is that you are referencing values or variables not in your data set:
133  data capsizedata;
134  set Master_d;
135  /* Assigning capsizeadjust based on MarketCap */
136  if MarketCap >= 2 then capsizeadjust = "mega_ad";
137  else if 10000000000 <= MarketCap <= 200000000000 then capsizeadjust = "large_ad";
138  else if 2000000000 <= MarketCap <= 10000000000 then capsizeadjust = "mid_ad";
139  else if 300000000 <= MarketCap <= 2000000000 then capsizeadjust = "small_ad";
140  else if 50000000 <= MarketCap <= 300000000 then capsizeadjust = "micro_ad";
141  else capsizeadjust = "unknown"; /* Default case for capsizeadjust if MarketCap is not in expected range */
142  run;
NOTE: Variable MarketCap is uninitialized.
Uninitialized means the variable has no values assigned.
144  proc reg data=Master_d;
145  Model Price= MarketCap;
ERROR: Variable PRICE not found.
ERROR: Variable MARKETCAP not found.
NOTE: The previous statement has been deleted.
146  run;
Reading your log more closely, there is this bit from PROC IMPORT:
83  data WORK.MASTER_D ;
84  %let _EFIERR_ = 0; /* set the ERROR detection macro variable */
85  infile '/home/u63997444/ECON 348/Master_use_edited_again2.csv' delimiter = ',' MISSOVER DSD lrecl=32767 firstobs=2 ;
86  informat Date mmddyy10. ;
87  informat " Price"N best32. ;
88  informat " MarketCap"N best32. ;
Proc import has read in the leading space in the source file as part of the name of the variable.
Which means to use that data set you must use " price"n or " marketcap"n to use those variables.
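As a rough sketch, assuming the variable names are exactly as shown in the generated INFORMAT statements above (one leading space each), the regression step would reference them with name literals:

```sas
/* Name literals: quoted name followed by n, leading space included */
proc reg data=Master_d;
  model " Price"n = " MarketCap"n;
run;
quit;
```

This works but is awkward to type everywhere, which is why rereading the file or renaming the variables is usually the nicer fix.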
This is the result of the option VALIDVARNAME=ANY, which I suspect is the default for your install, combined with the way the file is read.
If you want to use the variables as Price and Marketcap you can edit the generated code from proc import to avoid that easily and reread the file:
data WORK.MASTER_D ;
  infile '/home/u63997444/ECON 348/Master_use_edited_again2.csv' delimiter = ',' MISSOVER DSD lrecl=32767 firstobs=2 ;
  informat Date mmddyy10. ;
  informat Price best32. ;
  informat MarketCap best32. ;
  informat Shares best32. ;
  informat Firm best32. ;
  informat FirmName $5. ;
  format Date mmddyy10. ;
  input Date Price MarketCap Shares Firm FirmName $ ;
run;
Or RENAME the variables.
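A sketch of the rename route, assuming only the two names seen in the log carry a leading space (check PROC CONTENTS or the generated import code for any others):

```sas
/* Rename in place without rewriting the data */
proc datasets library=work nolist;
  modify Master_d;
  rename " Price"n     = Price
         " MarketCap"n = MarketCap;
quit;
```

After this, the original `Model Price= MarketCap;` statement should find both variables.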
Once the code above is fixed, you still will not get the values you expect for some variables. A character variable created by assignment takes its length from the first value assigned unless a LENGTH statement sets the length first. So Capsize will have a length of 4 characters (the number of letters in "mega"), and the other values will actually be "larg", "mid", "smal", "micr", and "unkn".
119 if 1 <= Firm <= 10 then capsize = "mega";
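One way to avoid the truncation, sketched here on the assumption that Firm is read correctly as a numeric variable, is to declare the length before the first assignment. The length 7 is chosen to fit "unknown", the longest value:

```sas
data capsizedata;
  set Master_d;
  length capsize $ 7;   /* must come before the first assignment to capsize */
  if 1 <= Firm <= 10 then capsize = "mega";
  else if 11 <= Firm <= 20 then capsize = "large";
  else if 21 <= Firm <= 30 then capsize = "mid";
  else if 31 <= Firm <= 40 then capsize = "small";
  else if 41 <= Firm <= 50 then capsize = "micro";
  else capsize = "unknown";
run;
```

With the full values stored, the later comparisons such as `capsize = "large"` can actually match.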
Later you will find other procedures such as Proc GLM that support CLASS variables so you don't have to create dummy variables as that can be done with your existing variables and custom formats.
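A hedged sketch of that idea, assuming Price and Firm are available under those plain names after the import is fixed. The format name `capfmt` is made up for illustration:

```sas
/* Hypothetical format that groups Firm numbers into cap-size classes */
proc format;
  value capfmt
    1-10  = 'mega'
    11-20 = 'large'
    21-30 = 'mid'
    31-40 = 'small'
    41-50 = 'micro'
    other = 'unknown';
run;

proc glm data=Master_d;
  class Firm;            /* class levels follow the formatted values */
  format Firm capfmt.;
  model Price = Firm / solution;
run;
quit;
```

The procedure builds the indicator columns itself, so no separate dummy variables are needed in the data step.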
The NOTE about the SASUSER.PARMS has nothing to do with the other problems in your code. That sort of note typically happens when there is some process attempting to use the catalog. Having two SAS sessions for the same user on a shared network might do it. I do know I can cause such with a stand-alone SAS foundation install by having two or more interactive sessions active at the same time on the computer.
Got it thanks so much. Will try again! Appreciate it!
BIG (oil-tanker-sized) hint for the future: never use PROC IMPORT for text files; write the DATA step yourself, according to the documentation/description of the file. Even if no such documentation exists, you are much better at making guesses about the data structure than the IMPORT procedure.
Don't use PROC IMPORT to read a CSV for anything important. You can use it to help you understand what is in the file, but once you know, just write a data step to read it.
So replace the PROC IMPORT call with this data step instead.
data MASTER_D ;
infile '/home/u63997444/ECON 348/Master_use_edited_again2.csv' dsd truncover firstobs=2 ;
input Date :mmddyy. Price MarketCap Shares Firm FirmName :$8. ;
format date yymmdd10.;
run;