u58469353
Calcite | Level 5

For an assignment, our class is required to run PROC MEANS on a dataset to produce descriptive statistics. However, some of the variables I want descriptive statistics for do not work. I get this error message:

 

ERROR: Variable gender in list does not match type prescribed for this list.
ERROR: Variable region in list does not match type prescribed for this list.
 
Here is an attached link showing the dataset that I am using. 
 
[Screenshots: Screen Shot 2021-07-26 at 1.05.48 PM.png, Screen Shot 2021-07-26 at 1.05.57 PM.png]
I'm not sure whether the fact that some variables only take the values 0, 1, 2, 3 is what keeps PROC MEANS from running correctly. Any help would be appreciated. Thank you.
 
Also, when I run PROC MEANS without specifying which variables to analyze, the output leaves off 3 of the variables that I need statistics for.
23 REPLIES
Reeza
Super User
Those are screenshots from a different application, not SAS (Numbers?)

Either way, what this means is that when you imported the data, certain variables were imported as character instead of numeric. It doesn't make sense to compute summary statistics on character variables. So you need to go back to your data import step and fix your import code so that it imports your data as numeric.

If you used PROC IMPORT, see the code that is generated in the log and modify it accordingly, or write your own data step to import the data correctly. You can convert it after the fact, but that's more work and it's not a good idea. PROC IMPORT is a guessing procedure, so if you had to repeat this process you cannot guarantee that the same types will be assigned the next time you import the data, which would make this an ongoing problem.
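As a rough illustration of the "write your own data step" option, here is a minimal sketch. It assumes the source is a comma-separated file with a header row; the file path, the output name work.survey, and the column order are all hypothetical and must be adjusted to the real file.

```sas
/* Sketch only: the path, dataset name, and column list are assumptions. */
data work.survey;
    /* DSD reads comma-delimited fields; FIRSTOBS=2 skips the header row */
    infile "/path/to/data.csv" dsd firstobs=2 truncover;
    /* In list input, a name with no $ after it is read as numeric */
    input gender region marital;
run;

/* Check the Type column to confirm each variable came in as Num */
proc contents data=work.survey;
run;
```

Because the types are stated in the INPUT statement, they stay the same every time the step is rerun. Any field that cannot be read as a number is set to missing, and SAS writes an "Invalid data" note to the log.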
u58469353
Calcite | Level 5

I'm not sure how to change certain variables to numeric, or how to import the dataset again and specify that I want all variables to be numeric.

Reeza
Super User
How did you import your data?
u58469353
Calcite | Level 5

I imported the data using proc import. But I also tried to manually import the data.

 

My variables already appear in the graphical output displayed after the import with values such as "0" or "1", and sometimes "3", "4", and so on.

However, when I looked at that output, certain variables had been imported as character rather than numeric.

PaigeMiller
Diamond | Level 26

Show us a screen capture of what you are seeing. Otherwise, these words are not helping.

 

To show us a screen capture, please click on the "Insert Photos" icon. Do not provide file attachments.

--
Paige Miller
PaigeMiller
Diamond | Level 26

There is no point in importing again, you can work with the existing SAS data set.

 

data new;
    set old;
    gender_numeric = input(gender,2.);
run;

This converts gender to numeric 0s and 1s. If you compute the mean, this is actually the proportion of records that are 1.
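A short sketch of that check, assuming the data step above has been run so that the dataset new and the variable gender_numeric exist:

```sas
/* With 0/1 coding, the Mean column equals the share of records coded 1 */
proc means data=new n mean;
    var gender_numeric;
run;
```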

 

It makes no logical sense to convert region to numeric and then run PROC MEANS on it. So don't do that.

u58469353
Calcite | Level 5

So I got Paige Miller's suggestion to work. However, when I use this code, it only produces one of the new variables at a time. I can see gender_new in the graphical output that is displayed, but I need gender_numeric and region_numeric to both appear at the same time in the same table.

PaigeMiller
Diamond | Level 26

You seem to have missed the point. Computing means of region_numeric is meaningless. A waste of time. Don't bother.

 

As stated above by @Tom, use PROC FREQ on region.

Reeza
Super User
Actually, if you look closely at the file, note that the values in the Gender and Region fields are aligned to the left, whereas numeric values are aligned to the right. This indicates that the source application is, for some reason, treating Region and Gender as character variables. Those values should not be numerically summarized, since the average region or average gender doesn't convey much useful information. Average gender would tell you the percentage of males, but PROC FREQ does that as well.
Tom
Super User

The two variables you mention as being character are not variables you would want to run PROC MEANS on anyway.

To see the distribution of GENDER and REGION you should probably just use PROC FREQ instead.
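A minimal sketch of that suggestion. The dataset name work.import is an assumption taken from code posted elsewhere in this thread; substitute the actual dataset name.

```sas
/* One-way frequency tables; PROC FREQ accepts character variables as-is */
proc freq data=work.import;
    tables gender region;
run;
```

Each table lists the count and percent for every distinct value, so no conversion to numeric is needed.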

u58469353
Calcite | Level 5

OK, thanks for all of your help. I really appreciate it. I'm completely new to SAS, so I may be misunderstanding the instructions of the assignment.

u58469353
Calcite | Level 5

So I need these variables to be in numeric form so that I can perform multiple linear regression using these variables. I'm still encountering the problem when I use this code to change a variable to numeric form.

 

data new;
    set work.import;
    gender_numeric = input(gender,2.);
data new;
	set work.import;
	region_numeric = input(region,4.);
data new;
	set work.import;
	marital_numeric = input(marital,2.);

The issue I run into is that only the last variable I enter shows up in the dataset as a numeric variable. In this case, the marital_numeric variable appears. But the region_numeric and the gender_numeric variables are not showing up. However, if I only do one at a time I am able to use them for graphs and regressions. But I need all 3 of them together to perform a multiple linear regression model. 
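The behavior described here is what SAS does when several DATA steps share one output name: each `data new; set work.import;` step reads the original dataset again and replaces new, so only the variable created in the last step survives. A sketch of one way to keep all three, using the same names and informat widths as the code above, is to create them in a single step:

```sas
/* One DATA step, so all three new variables land in the same dataset */
data new;
    set work.import;
    gender_numeric  = input(gender, 2.);
    region_numeric  = input(region, 4.);
    marital_numeric = input(marital, 2.);
run;
```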

Reeza
Super User
No, you do not need to convert to numeric; they're categorical variables, so they go in the CLASS statement in PROC GLM.
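A minimal sketch of the CLASS approach. The response variable y is a hypothetical placeholder, since the thread does not name the outcome; work.import is the dataset name used in the code above.

```sas
/* y is a placeholder for the real response variable (an assumption) */
proc glm data=work.import;
    /* CLASS variables may be character; GLM builds the indicator columns */
    class gender region marital;
    model y = gender region marital / solution;
run;
quit;
```

The SOLUTION option asks for the parameter estimates for each level of the CLASS variables.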

If you're using PROC REG, then you need to manually create new dummy variables. For gender you could convert it directly, but for region you will need N-1 variables, where N is the number of distinct regions, so a direct conversion isn't useful. I reiterate: go back to your import step and ensure your data types are read in correctly.
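For the PROC REG route, a sketch of the N-1 dummy variables might look like the following. It assumes region is a character variable with the four values "0" through "3", which the thread does not confirm; the names work.dummies, region_1, region_2, and region_3 are hypothetical.

```sas
/* Assumes region is character with levels "0".."3"; "0" is the reference */
data work.dummies;
    set work.import;
    /* A comparison in SAS evaluates to 1 when true and 0 when false */
    region_1 = (region = "1");
    region_2 = (region = "2");
    region_3 = (region = "3");
run;
```

Note that a missing region would be coded 0 in all three dummies under this sketch, so missing values need separate handling.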

u58469353
Calcite | Level 5

Thanks for your reply. I am completely new to SAS coding, so honestly I'm not sure how to carry out your suggestions. When I use the code I just posted, it does change the variables to numeric, and I have no issues completing the multiple linear regression. The dataset I originally imported contained 3 variables that showed up as character rather than numeric, and those are the ones causing me problems.
