Hello, I set up a variable called "presence" which is 0 if there is no university and 1 if there is a university present. This data is merged by FIPS with crime data, and I am wondering how I can use proc means to create meaningful descriptive statistics. Unfortunately, I keep meeting the error message "variable presence in list does not match type prescribed for this list." Do you have any suggestions? I am new to sas and appreciate the help! Here's my progress so far:
data paper.arrests_schools;
set paper.arrests_schools;
presence=instnm;
if instnm=. then presence=1;
if instnm=Z then presence=0;
run;
/*not working*/
PROC MEANS data=paper.arrests_schools n mean median std cv min max skew maxdec=2;
var presence;
run;
Is INSTNM a numeric or character variable? This statement will create PRESENCE as the same type of variable that INSTNM is.
presence=instnm;
This line will treat INSTNM as if it is a numeric variable since you are comparing it to a numeric missing value.
if instnm=. then presence=1;
This line is comparing INSTNM to a variable named Z. What type of variable is Z?
if instnm=Z then presence=0;
If you just want to create a binary variable to indicate whether INSTNM is missing or not, why not just use the MISSING() function?
presence = not missing(instnm) ;
That way you don't care whether INSTNM is numeric or character.
Or if you want to treat INSTNM=' ' and INSTNM='Z' as both indications of missing values then perhaps you want this.
presence = not (instnm in (' ','Z')) ;
If you want stats on this binary variable you can use MIN to indicate if it is ever false, MAX to indicate if it is ever true, and MEAN to give the proportion that are true (multiply by 100 for a percentage).
Otherwise perhaps you want to use it as a class variable and instead calculate statistics on some OTHER variables so you can compare if they are different between cases where PRESENCE is true or false?
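To make the MISSING() suggestion concrete, here is a minimal sketch. The data set names WORK.DEMO and WORK.DEMO2, the FIPS codes, and the institution names are all made up for illustration; only INSTNM and PRESENCE come from the question.

```sas
/* Toy data (hypothetical values): instnm is character, blank = no institution. */
data work.demo;
   infile datalines dsd truncover;
   length fips $5 instnm $40;
   input fips $ instnm $;
   datalines;
01001,Example State University
01003,
01005,Sample College
01007,
;
run;

data work.demo2;
   set work.demo;
   /* A Boolean expression returns numeric 1 or 0, so PRESENCE is numeric. */
   presence = not missing(instnm);
run;

/* MIN = 0 if any row lacks an institution, MAX = 1 if any row has one,
   MEAN = proportion of rows with an institution. */
proc means data=work.demo2 n min max mean maxdec=2;
   var presence;
run;
```

Because PRESENCE is created directly from a Boolean expression rather than copied from INSTNM, it is numeric and PROC MEANS accepts it in the VAR statement.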
Yeah, you're definitely mixing types here.
Is your instnm variable a character or numeric variable? I suspect it's a character based on what you're seeing.
See my comments in your code below:
@sastuck wrote:
data paper.arrests_schools;
set paper.arrests_schools;
presence=instnm; <- if instnm is character, then presence is created as a character variable.
if instnm=. then presence=1; <- numeric missing is a ., character missing is a space, or use if missing(var) which handles num and characters;
if instnm=Z then presence=0; <- without quotes, Z is read as a variable name, not the letter Z. Is it supposed to be the special missing value (.Z), the letter 'Z', or another variable named Z? As written it compares INSTNM to a variable Z, and you should see a note in the log about an uninitialized variable.
run;/*not working*/
PROC MEANS data=paper.arrests_schools n mean median std cv min max skew maxdec=2;
var presence; <-requires a numeric variable;
run;
I'm not sure what you expect when you take the median, CV, std, etc. of a variable that is only 0/1. I suspect that's not what you're looking for, but I don't know what you're looking for either.
I appreciate your feedback--let me be a bit more specific (and yes, I am quite lost in terms of creating the dummy variable as well). The variable "instnm" is a character variable. I would like to be able to compare the crime rates of FIPS with institutions (1) and without institutions (0). Here is the code again after deleting my edits:
data paper.arrests_schools;
set paper.arrests_schools;
presence=instnm;
if instnm= then presence=1;
if instnm= then presence=0;
run;
Any further suggestions? I really appreciate the help
That doesn't incorporate any of the suggestions or comments I made. Or answer any of the questions.
At any rate, you don't want presence in the VAR statement; you want it in a BY or CLASS statement, where the type doesn't matter.
You really, really shouldn't use the same data set name in the DATA and SET statements either. This overwrites your original data set, so a mistake in the step destroys your input and makes it hard to debug. You will first need to recreate your source data before any of this will work, assuming your source data is arrests_schools.
data paper.arrests_schools;
set paper.arrests_schools;
Add a suffix at the minimum so you can distinguish the data sets.
data paper.arrests_schools2;
set paper.arrests_schools;
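Putting the two points together, a rough sketch of the comparison might look like this. It assumes the PAPER libref is already assigned and that the source data set has been recreated without the earlier character PRESENCE variable. CRIME_RATE is a hypothetical name; substitute whatever numeric crime variable your data actually contains.

```sas
/* Write to a new data set so the source stays intact. */
data paper.arrests_schools2;
   set paper.arrests_schools;
   presence = not missing(instnm);   /* 1 = institution present, 0 = none */
run;

/* Summarize the crime variable separately for presence=0 and presence=1. */
proc means data=paper.arrests_schools2 n mean median std min max maxdec=2;
   class presence;
   var crime_rate;   /* hypothetical name: replace with your numeric crime variable */
run;
```

With PRESENCE in the CLASS statement, PROC MEANS prints one row of statistics per group, which is the comparison between FIPS with and without institutions.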
Thank you.