RebeccaFaye
Calcite | Level 5

Hi there, 

 

First of all, my main objective is to perform a dummy variable regression for data that has a continuous outcome and both continuous and categorical predictor variables. I keep getting this error: "ERROR: No valid observations are found." I get it for my dummy variable regression as well as for simpler multivariate linear regressions where I only have continuous variables. Below is my code and attached are my data sets. 

 

CODE: 

data seven.final_table7dummy;
set seven.final_table5;
if gender= "F" then female=1; else female=0;
if gender= "M" then male=1; else male=0;
if gas= "C3F8" then C3F8=1; else C3F8=0;
if gas= "HLPC" then HLPC=1; else HLPC=0;
if gas= "SB" then SB=1; else SB=0;
if gas= "SB/cryo" then SBcryo=1; else SBcryo=0;
if gas= "SF6" then SF6=1; else SF6=0;
if gas= "SO" then SO=1; else SO=0;
if lens_status= "Phakic" then phakic=1; else phakic=0;
if lens_status= "Pseudophakic" then pseudophakic=1; else pseudophakic=0;
if race_eth= "American Indian" then American_Indian=1; else american_indian=0;
if race_eth= "Asian" then asian=1; else asian=0;
if race_eth= "Black" then black=1; else black=0;
if race_eth= "Caucasian" then caucasian=1;else caucasian=0;
if race_eth= "Hispanic" then hispanic=1; else hispanic=0;
if race_eth= "Unknown" then unknown=1; else unknown=0;
run;
*Table 7: multivariable dummy variable regression;
proc reg data=seven.final_table7dummy;
model BSCVA_final= days_between_mac_off_and_surg age female male c3f8 hlpc sb sbcryo sf6 so phakic pseudophakic american_indian asian black hispanic caucasian unknown;
run;
quit; 

 

*trying a multivariate linear regression on non-dummy data set only using continuous variables; 

 

proc reg data=seven.final_table5;
model BSCVA_final= days_between_mac_off_and_surg age;
run;
quit;

 

5 REPLIES
pink_poodle
Barite | Level 11

You have so many variables that the procedure is not finding any records where all of them are non-missing. It excludes a case if any of the variables is missing. If you would like to include cases with missing values, you can fill them in like this (here, replacing missing values with -1; note that this is a constant fill rather than true multiple imputation, and it will distort the estimates)*:
* alternative is to analyze in univariate format (one variable at a time);
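data imp;
set have; /* "have" is a placeholder for your input data set */
array c{*} days_between_mac_off_and_surg age female male c3f8 hlpc sb sbcryo sf6 so phakic pseudophakic american_indian asian black hispanic caucasian;
/* explicit index loop; DO OVER is no longer documented */
do i = 1 to dim(c);
if missing(c{i}) then c{i} = -1;
end;
drop i;
run;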

ballardw
Super User

When you get an error, copy from the LOG the entire procedure or data step that generated it, along with all notes, warnings, and errors, and paste it on the forum into a code box opened with the </> icon to preserve formatting.

 

Why the log? It is amazing how often we see "code" that is not what was actually submitted. There may also be other clues, such as "data set has 0 records".

 

Plus you have two different Proc Reg calls and you don't indicate which one generated the error.

Any record that is missing any value for any of the variables on the MODEL statement will be excluded from the calculations. If every record has at least one of the variables missing something then that is one cause of that specific error.
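
To see which variable is causing the exclusions, a quick count of missing values per variable can help. This is a minimal sketch, assuming the data set and variable names from the original post and that the three model variables are numeric (as PROC REG requires):

```sas
/* N = non-missing count, NMISS = missing count, per numeric variable */
proc means data=seven.final_table5 n nmiss;
  var BSCVA_final days_between_mac_off_and_surg age;
run;

/* For the categorical predictors, the MISSING option lists
   missing values as their own level in each table */
proc freq data=seven.final_table5;
  tables gender gas lens_status race_eth / missing;
run;
```

A variable whose N is 0 in the PROC MEANS output is missing on every record, which on its own is enough to produce this error.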

 

Proc reg is really not the place to analyze lots of 2 level variables. What is your analysis question for this data?

 

I might suspect that Proc Logistic with all of those recoded 0/1 variables as CLASS variables might be more appropriate.

PaigeMiller
Diamond | Level 26

Please look at the data set you are creating, seven.final_table7dummy, with your own eyes and see if there are valid observations (with no missing values).
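
If the data set is too large to scan by eye, the number of complete cases can also be counted directly. The following is only a sketch, assuming the variables from the simpler two-predictor model in the original post; the counter name `complete` is made up for illustration:

```sas
/* Count records where none of the model variables is missing.
   CMISS returns the number of missing arguments (numeric or character). */
data _null_;
  set seven.final_table5 end=last;
  if cmiss(BSCVA_final, days_between_mac_off_and_surg, age) = 0 then complete + 1;
  if last then put "Complete cases: " complete "of " _n_ "records";
run;
```

The result is written to the log. A count of 0 complete cases corresponds to the "No valid observations" error.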

 

Also, there is no need to produce your own dummy variables. SAS has programmed this for you, so you don't have to. You can use PROC GLM with a CLASS statement to do a regression with some categorical variables and some continuous variables.
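
As a rough sketch of that approach, assuming the original variable names and the undummied data set seven.final_table5, the model could look something like this:

```sas
/* CLASS tells PROC GLM to build the indicator columns internally */
proc glm data=seven.final_table5;
  class gender gas lens_status race_eth;
  model BSCVA_final = days_between_mac_off_and_surg age
                      gender gas lens_status race_eth / solution;
run;
quit;
```

The SOLUTION option asks for the parameter estimates. PROC GLM sets the last level of each CLASS variable to zero, so the other levels are estimated relative to it.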

--
Paige Miller
Doc_Duke
Rhodochrosite | Level 12

"ERROR: No valid observations are found." means that every observation has a missing value for one or more of the predictor or outcome variables. Since it also occurs for the model with only continuous variables, that's the place to focus.

Your dummy variable regression is likely overparameterized. Unless you have a lot of missing gender values, you don't need indicators for both male and female. Ditto gas and race.
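
For illustration, here is one way the MODEL statement might look with one indicator dropped per category. The choice of reference levels (male, SO, pseudophakic, caucasian) is an arbitrary assumption, as is the idea that the indicators in the original data step cover every level of each variable:

```sas
/* One indicator per category is left out to serve as the reference level */
proc reg data=seven.final_table7dummy;
  model BSCVA_final = days_between_mac_off_and_surg age
                      female                           /* reference: male */
                      c3f8 hlpc sb sbcryo sf6          /* reference: SO */
                      phakic                           /* reference: pseudophakic */
                      american_indian asian black hispanic unknown; /* reference: caucasian */
run;
quit;
```

Note that with the IF/ELSE coding in the original data step, a record with a missing gender gets female=0 and male=0, so with this model it would be grouped with the reference level.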

You may want to try PROC GLM unless you ultimately want to use some of the stepwise parameters of PROC REG.
RebeccaFaye
Calcite | Level 5

I actually figured it out. It was simple: I was using the wrong variable for the mac off date, which had unknown values. I'll probably run some fit diagnostics to see if the model is overparameterized, but I'm not sure this error would be an indicator of that. It is usually caused by missing values in your data. Thanks though! 
