Actually, you have a typo: you're missing an S in your variable name. Short variable names are your friend. I copied and pasted the variable name because I'm lazy and have made that mistake too many times.
age_first_pres_dayS_num
@Reeza Well that was hours of my life. 😞 Thank you for your help.
@kristiepauly the data set you are using has lots of age variables, and all of them are character. I don't know why they are character, but ideally they should be numeric. Most likely something in the process of creating and storing this data has gone awry, and it will cause you lots of extra work. It is better to put that work into fixing the process so the variables are numeric, which will be much less work in the long run.
Allow me to guess why you are getting character values for these age variables.
if _N_ < 2 then delete;
My guess is that you are getting this data from Excel, and that the first row holds the variable names (which are character), so each variable is read as character. If so, it is easy to fix so that the first row read as data is actual data and the variables are numeric. Let me know if that is the case and I can explain how to fix it. But I'm not going to bother to explain unless that is the problem.
@kristiepauly thank you for marking my message correct, but I think @Reeza was referring to this message: https://communities.sas.com/t5/SAS-Programming/Proc-Format-issues/m-p/826225#M326347
Yes, please! Any help is much appreciated. The first row is variable names, and the second row is a variable explanation, which was showing up as an observation when I ran PROC CONTENTS. I found the _N_<2 solution on this community board and assumed I would need to change it to _N_<3, given that my real data started there. However, _N_<2 fixed that immediate problem, but I'm assuming it is causing this new problem you mentioned?
Thanks again for your help. 💛
Is the original file Excel or CSV? If it's CSV, this is easily fixed; if it's Excel, it's a bit harder.
It is definitely better to read and format the data correctly from the start: apply formats, and ensure numeric variables are numeric and character variables are character. But cleaning the data is not the part anyone wants to start with.
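For the CSV case, a minimal sketch of reading the file with a DATA step might look like the following. The file name, data set name, and column list are hypothetical placeholders; the real file will have its own columns. The sketch assumes the layout described in this thread: variable names in row 1, an explanation in row 2, and data from row 3.

```sas
/* Sketch only: the file name, data set name, and columns are hypothetical. */
data work.ages;
    /* DSD handles quoted values and empty fields between commas.       */
    /* FIRSTOBS=3 skips the name row and the explanation row.           */
    /* TRUNCOVER stops INPUT from reading past the end of a short line. */
    infile "yourCsvFileName.csv" dsd dlm=',' firstobs=3 truncover;
    /* Declaring the informats here makes the age variable numeric from the start. */
    input id :$20. age_first_pres_days :32.;
run;
```

With this approach, any value in the age column that is not a number is read as missing, and SAS writes an invalid-data note to the log, so bad values are easy to spot.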
@kristiepauly wrote:
Yes, please! Any help is much appreciated. The first row is variables, the second row was variable explanation which was showing as an observation when I ran proc contents. I found the _N_<2 solution on this community board and assumed I would need to change it _N_<3, given my real data started there. However, _N_<2 fixed that immediate problem but I'm assuming it is causing this new problem you mentioned? 💛
If you are using PROC IMPORT and the first row in Excel holds the variable names, you need to include the GETNAMES statement in PROC IMPORT.
proc import datafile="yourExcelFileName.xlsx" dbms=excel replace out=sasdatasetname;
getnames=YES;
run;
With GETNAMES=YES, the first row is treated as the variable names rather than as data, and all the other rows are treated as data. A variable will then be numeric if all of those other rows contain numbers or missing values (but not text such as NA).
If the variable names are in the first row of an XLSX file, PROC IMPORT can read them.
To have it skip the second row and read the data from the third row, use the DATAROW statement.
proc import datafile='c:\downloads\extra_row.xlsx' dbms=xlsx out=test replace;
getnames=Y;
datarow=3;
run;
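To confirm the result, one option is to check the variable types after the import. This is a small sketch that assumes the output data set is named TEST, as in the step above.

```sas
/* Lists each variable with its type (Num or Char), length, and format. */
proc contents data=test;
run;

/* Prints the first few rows to confirm the explanation row is gone. */
proc print data=test(obs=5);
run;
```

If an age variable still shows as Char, at least one data row in that column probably holds text rather than a number.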
@kristiepauly thank you for marking my message correct, but you still have not marked the correct message — I think @Reeza was referring to this message: https://communities.sas.com/t5/SAS-Programming/Proc-Format-issues/m-p/826225#M326347