The result required is:
The dataset is attached, and my code is:
DATA Names;
  INFILE 'D:\names.txt' dlm=';';
  IF gender = "M" THEN gender_dummy = 1;
  ELSE IF gender = "F" THEN gender_dummy = 0;
  IF inservice < 2013 THEN bestpos = q1;
  ELSE bestpos = 1;
  worstpos = q3;
RUN;
My result is:
As I understand it, the first 6 variables are the original variables, so we can just read them in and tell SAS to separate them by the delimiter ";". The last 4 variables are newly created. The exception is the variable "serviceyrs": I can't work out where it comes from, so we can ignore that column. It is strange that the original variables are not read in correctly and that the final result is completely wrong. Does anyone know what the problem with my code is?
I can't see the INPUT statement. Did you forget something important? 😉
regards Jan.
I would use the import wizard or import task for a text file. Tell the wizard that the delimiter is ";" and it should take care of the reading.
If you want to add variables at the same time, you should be able to find the data step generated by the wizard in the log. You can copy and paste that into the editor, code node or whichever interface you use, and then add the lines you have for the additional variables.
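As a rough sketch of what the wizard route amounts to in code, the following uses PROC IMPORT and then a second data step for one of the new variables. The output name work.names_raw is made up for illustration, and the sketch assumes the header line of names.txt supplies a variable called gender; check the log for the names SAS actually assigns.

```sas
/* Sketch only: work.names_raw is a hypothetical data set name */
proc import datafile='D:\names.txt'
    out=work.names_raw
    dbms=dlm
    replace;
    delimiter=';';   /* fields are separated by semicolons */
    getnames=yes;    /* take variable names from the first line */
run;

/* Add a derived variable in a separate step.                    */
/* Assumes the imported data set has a character variable gender */
data Names;
    set work.names_raw;
    if gender = "M" then gender_dummy = 1;
    else if gender = "F" then gender_dummy = 0;
run;
```

For a delimited file, PROC IMPORT writes the data step it generated to the log, which is the code referred to above.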
The problem with your code is clear: you need an INPUT statement. Your assumption that this happens automatically because of that first line in your code is wrong. PROC IMPORT can do that; the DATA step cannot.
Remember to skip that first line. Use INFILE ... FIRSTOBS=2;
Regards, Jan.
Your code would be (not tested):
DATA Names;
length name $100 gender $1 inservice q1 q2 q3 8;
INFILE 'D:\names.txt' dlm=';' firstobs=2;
input name $ gender $ inservice q1 q2 q3;
...
Hope this helps,
- Jan.
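For reference, here is one way the complete step might look once the derived-variable lines from the question are appended to the INPUT statement above. It is an untested sketch: it assumes names.txt has one header line followed by six ";"-separated fields in the order shown, and it copies the bestpos and worstpos logic from the question unchanged, without checking it against the required result. The serviceyrs column is left out, as in the question.

```sas
/* Sketch only: field order and types are assumptions about names.txt */
data Names;
    length name $100 gender $1 inservice q1 q2 q3 8;
    infile 'D:\names.txt' dlm=';' firstobs=2;   /* skip the header line */
    input name $ gender $ inservice q1 q2 q3;

    /* Derived variables, taken as-is from the question */
    if gender = "M" then gender_dummy = 1;
    else if gender = "F" then gender_dummy = 0;

    if inservice < 2013 then bestpos = q1;
    else bestpos = 1;

    worstpos = q3;   /* assigned for every row, as in the original code */
run;
```

If the file can contain empty fields (two semicolons in a row), adding the DSD option to the INFILE statement makes SAS read them as missing values rather than skipping them.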
If you are worried about the multiple INFORMAT and FORMAT statements, that is NOT too many lines. You would only have a dozen or so of those, plus one INPUT statement spanning eight or nine lines. A 40 or 50 line program to read data is minor when YOU don't have to type all of it.
I routinely deal with this for up to 200 variables, and I end up with over 800 lines for some of these programs. Adding labels so the variables have some meaning, diagnostic code, cleaning of values, and additional analysis or reporting variables adds more. That is all routine and tedious, but the programs are not "too big".
You can also shorten the generated code in the editor. Many of the FORMAT statements, especially those for character variables, can be removed.
But once you have the program when you need to read a similar file later you just point to the new data source and give the data set a new name.
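One possible way to make that reuse explicit is to wrap the reading step in a macro that takes the file path and the output data set name as parameters. This is only a sketch: the macro name read_names and the second file path are invented for illustration, and the INPUT layout is the untested one suggested earlier in the thread.

```sas
/* Sketch only: read_names is a hypothetical macro name */
%macro read_names(path=, out=);
    data &out;
        length name $100 gender $1 inservice q1 q2 q3 8;
        infile "&path" dlm=';' firstobs=2;   /* double quotes so &path resolves */
        input name $ gender $ inservice q1 q2 q3;
    run;
%mend read_names;

/* Same program, different source and data set name */
%read_names(path=D:\names.txt, out=Names);
%read_names(path=D:\names_new.txt, out=Names_new);   /* hypothetical second file */
```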