Hi all,
I have a tab-delimited file with the column names in the second row and the data starting in the third row. How can I import this file correctly?
I have the following code, which gives the column names as the first row:
proc import datafile='C:\Users\esi15573\Documents\BOLDLF\D1C21D.tab'
out=SIM
dbms=tab
replace;
GETNAMES= Y;
datarow=3;
run;
How could I revise it to get what I want?
Thanks,
York
Hello York,
You should be fine with the solution in this thread.
So for your example, just delete the first row and then start with the second:
filename exTemp1 temp;
data _null_;
infile 'C:\Users\esi15573\Documents\BOLDLF\D1C21D.tab' firstobs=2;
file exTemp1;
input;
put _infile_;
run;
PROC IMPORT DATAFILE=exTemp1
OUT=SIM
DBMS=TAB REPLACE;
GETNAMES=YES;
RUN;
Let us know if you succeeded 🙂
Cheers, Michael
Hi mfab,
Thanks. The first part of the code works and deletes the first row, but the second part did not work: the generated data set has all the columns clustered into one column.
Any idea how to revise it?
Thanks
York
Hi York,
Honestly, I don't know why that would not work.
I tested it with a small file here myself and it seemed to work properly.
Are you sure that your file is formatted correctly and delimited with tabs?
Michael
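One way to check the delimiter question above is to dump the first few raw lines to the log. This is only a sketch: it reuses the path from York's post and relies on the LIST statement, which prints a hex representation of a record when it contains unprintable characters. A tab should show up as hex 09 between the values.

```sas
/* Diagnostic sketch: show the first three raw lines of the file in the log. */
data _null_;
    infile 'C:\Users\esi15573\Documents\BOLDLF\D1C21D.tab' obs=3;
    input;   /* load the line into the input buffer without parsing it */
    list;    /* write the buffer to the log (with hex lines if it holds unprintable characters) */
run;
```

If the log shows spaces (hex 20) or commas instead of 09 between the values, the file is not really tab-delimited, which would explain everything landing in one column.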
Hi Michael,
Thanks. It might be a problem with the tab file. Once I converted it to a CSV file and ran the code again, it worked well.
Thanks for the help.
York
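Just leaving a quick thought: can you not just modify the first data step to read the data directly?
data SIM;
/* dlm='09'x makes the tab the delimiter (dsd alone would default to a comma); */
/* firstobs=3 skips the first line and the header row, since the names are given in INPUT */
infile 'C:\Users\esi15573\Documents\BOLDLF\D1C21D.tab' dlm='09'x dsd missover firstobs=3;
input VAR1 $
VAR2 $ /* ... remaining variables ... */ ;
run;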
You have far more control over the import that way than using proc import. Do a search here as there are loads of examples of importing delimited files.
Hey RW9,
I absolutely second your posting. Proc Import leaves you little control sometimes.
With my example, I merely adapted the code from the thread I quoted 😉
Have a great time, everyone!
Michael
Hi RW9,
Thanks for the suggestion. The reason I prefer PROC IMPORT is that it's simpler, whereas an INPUT statement would require typing in all the variable names, and I have a lot of variables in the file.
Also, I have trouble defining the formats of the variables.
See the following values as an example. What is the correct format?
1.17E+00
-1.22E-01
I tried e8. and e9., but each recognizes only some columns and not others.
Thanks,
York
Hi York,
e9. looks fine to me.
I would suggest that you inspect a few lines more closely to find out whether there is anything special about the lines or numbers that are not recognized properly.
If you don't succeed, you might want to post an example of the file and your code, as well as the wrong and the desired output, so we can have a look at it.
Cheers,
Michael
P.S.: You could try adding TERMSTR=CRLF to your INFILE statement (or TERMSTR=CR or TERMSTR=LF), depending on whether you receive and process your files in Unix or Windows environments. This might help with certain lines.
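As a rough illustration of the informat question, here is a small self-contained sketch. The temporary file, the data set name sim_demo, and the variable names x1 and x2 are all made up for the demo. It assumes the values are separated by tabs and read with list input, in which case the default numeric informat already accepts E-notation and no fixed width such as e8. or e9. has to match every value.

```sas
/* Build a tiny tab-delimited test file: one junk line, a header row, one data row. */
filename demo temp;
data _null_;
    file demo;
    put 'junk line';
    put 'x1' '09'x 'x2';
    put '1.17E+00' '09'x '-1.22E-01';
run;

/* Read it back with list input. x1 and x2 are hypothetical names. */
data sim_demo;
    infile demo dlm='09'x dsd truncover firstobs=3;  /* skip the junk line and the header */
    input x1 x2;  /* no width-specific informat: values are read up to each tab */
run;

proc print data=sim_demo;
run;
```

With list input the field length is determined by the delimiter, so values of 8 and 9 characters can sit in the same column. If an explicit informat is wanted anyway, the colon modifier (for example `input x1 :32.;`) should keep the delimiter-based behaviour.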
Hi,
Yes, PROC IMPORT is simpler, but with it you forfeit most of the control over the import. You will therefore likely generate more work for yourself further down the line.
I agree with mfab: it is likely that there is some odd data within the file being imported.