For an assignment I am completing, our class is required to run PROC MEANS on a dataset in order to produce descriptive statistics. However, the procedure fails for some of the variables I want descriptive statistics on. I get an error message that says:
Either way, this means that when you imported the data, certain variables were read as character instead of numeric. It doesn't make sense to compute summary statistics on character variables, so you need to go back to your data import step and fix the code so that it imports those variables as numeric.
If you used PROC IMPORT, look at the code it generates in the log and modify it accordingly, or write your own DATA step to import the data correctly. You can convert the variables after the fact, but that's more work and not a good idea. PROC IMPORT guesses variable types from the data, so if you have to repeat this process, there is no guarantee that the same types will be assigned the next time, which would make this an ongoing problem.
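As a rough sketch of the "write your own DATA step" option, assuming the raw file is a comma-delimited text file with one header row. The file path, the output dataset name `work.survey`, and the variables `id` and `income` are hypothetical placeholders, not taken from the original data:

```sas
/* Hypothetical path and variable list; replace with your own. */
data work.survey;
    /* DSD treats commas as delimiters; FIRSTOBS=2 skips the header row. */
    infile "/path/to/your_file.csv" dsd firstobs=2 truncover;
    /* A name without a trailing $ is read as numeric. */
    input id gender region marital income;
run;

/* Confirm the variable types that were actually assigned. */
proc contents data=work.survey;
run;
```

Because the types are stated explicitly in the INPUT statement, they stay the same every time the step is rerun. If a field contains text that cannot be read as a number, SAS sets the value to missing and writes a note to the log.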
I'm not sure how to change certain variables to numeric. Or how to import the dataset again and specify that I want all variables to be numeric.
I imported the data using proc import. But I also tried to manually import the data.
My variables are already listed in the graphical output that was displayed after I imported the dataset as "0" or "1" and sometimes even "3" and "4" and so on.
However when I looked at the graph, certain variables were imported as characters rather than numeric
@u58469353 wrote:
I imported the data using proc import. But I also tried to manually import the data.
My variables are already listed in the graphical output that was displayed after I imported the dataset as "0" or "1" and sometimes even "3" and "4" and so on.
However when I looked at the graph, certain variables were imported as characters rather than numeric
Show us a screen capture of what you are seeing. Otherwise, these words are not helping.
To show us a screen capture, please click on the "Insert Photos" icon. Do not provide file attachments.
Paige Miller
There is no point in importing again, you can work with the existing SAS data set.
data new;
set old;
gender_numeric = input(gender,2.);
run;
This converts gender to numeric 0s and 1s. If you compute the mean, it is actually the proportion of records that are 1 (multiply by 100 for a percent).
It makes no logical sense to convert region to numeric and then run PROC MEANS on it. So don't do that.
Paige Miller
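To illustrate the point about the mean of a 0/1 variable, here is a small self-contained sketch. The dataset `demo` and its five made-up rows are purely illustrative and are not the poster's data:

```sas
/* Toy data: gender is read as character, as in the imported dataset. */
data demo;
    input gender $;
    /* Convert the character value to a numeric 0 or 1. */
    gender_numeric = input(gender, 2.);
    datalines;
0
1
1
0
1
;
run;

/* Three of the five rows are 1, so the mean is 3/5 = 0.6. */
proc means data=demo n mean;
    var gender_numeric;
run;
```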
So I got Paige Miller's suggestion to work. However when I use this code it only produces one of the new variables at a time. I can see gender_new on the graphical output that is displayed. However, I need gender_numeric and region_numeric to both be displayed at the same time and on the same graphical output where they are showing on the table.
You seem to have missed the point. Computing means of region_numeric is meaningless. A waste of time. Don't bother.
As stated above by @Tom, use PROC FREQ on region.
Paige Miller
The two variables you mention as being character are not variables you would want to run PROC MEANS on anyway.
To see the distribution of GENDER and REGION you should probably just use PROC FREQ instead.
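A minimal sketch of that PROC FREQ call, assuming the imported dataset is `work.import` (the name used elsewhere in this thread) and that the variables are named `gender` and `region`:

```sas
/* One-way frequency tables; PROC FREQ works on character variables as-is. */
proc freq data=work.import;
    tables gender region;
run;
```

Each table lists the count and percent of records for every distinct value, so no numeric conversion is needed for this step.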
Ok, thanks for all of your help. I really appreciate it. I'm completely new to SAS, so I may be misunderstanding the instructions of the assignment.
So I need these variables to be in numeric form so that I can perform multiple linear regression using these variables. I'm still encountering the problem when I use this code to change a variable to numeric form.
data new;
set work.import;
gender_numeric = input(gender,2.);
data new;
set work.import;
region_numeric = input(region,4.);
data new;
set work.import;
marital_numeric = input(marital,2.);
The issue I run into is that only the last variable I enter shows up in the dataset as a numeric variable. In this case, the marital_numeric variable appears. But the region_numeric and the gender_numeric variables are not showing up. However, if I only do one at a time I am able to use them for graphs and regressions. But I need all 3 of them together to perform a multiple linear regression model.
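The symptom described above is what you would expect from the posted code: each `data new;` statement starts a separate DATA step that rebuilds `new` from `work.import`, so every step overwrites the previous one and only the last conversion survives. A sketch of one way around it, using the same dataset and variable names as the post, is to do all three conversions in a single step:

```sas
/* One DATA step, so all three new variables end up in the same dataset. */
data new;
    set work.import;
    gender_numeric  = input(gender, 2.);
    region_numeric  = input(region, 4.);
    marital_numeric = input(marital, 2.);
run;

/* Check that all three numeric variables are present. */
proc contents data=new;
run;
```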
If you're using PROC REG, then you need to manually create new dummy variables. For gender you could convert it directly, but for region you will need N-1 dummy variables, where N is the number of distinct regions, so a direct numeric conversion isn't useful. I reiterate: go back to your import step and ensure your data types are read in correctly.
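A hedged sketch of the N-1 dummy variables described above. It assumes, purely for illustration, that `region` takes the four character values '1' to '4' and that the response variable is called `y`; neither is stated in the thread, and the names `reg_ready` and `region_1` to `region_3` are hypothetical:

```sas
data reg_ready;
    set work.import;
    gender_numeric = input(gender, 2.);
    /* Assumed 4 regions, so N-1 = 3 indicators; region '4' is the reference level. */
    /* A comparison evaluates to 1 when true and 0 when false. */
    region_1 = (strip(region) = '1');
    region_2 = (strip(region) = '2');
    region_3 = (strip(region) = '3');
run;

/* y is a hypothetical response variable; replace it with your own. */
proc reg data=reg_ready;
    model y = gender_numeric region_1 region_2 region_3;
run;
quit;
```

Note that a record with a missing region gets 0 for all three indicators and so is treated as the reference level; check for missing values first. An alternative is a procedure that accepts a CLASS statement, such as PROC GLM, which builds the indicators internally from the categorical variable.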
Thanks for your reply. I am completely new to SAS coding, so honestly I'm not sure how to carry out the suggestions you are giving me. When I use the code I just posted, it does change the variables to numeric, and I have no issues completing the multiple linear regression. At first, the dataset I imported contained 3 variables that showed up as character rather than numeric, and those are the ones causing me problems.