import tab file with column names at second row

Occasional Contributor
Posts: 8

Hi all,

I have a tab-delimited file with the column names in the second row and the data starting in the third row. How can I import this file correctly?

I have the following code, which takes the column names from the first row:

proc import datafile='C:\Users\esi15573\Documents\BOLDLF\D1C21D.tab'

     out=SIM

     dbms=tab

     replace;

     GETNAMES= Y;

     datarow=3;

run;


How could I revise it to get what I want?


Thanks,


York


Accepted Solutions
07-02-2014 10:26 AM
Frequent Contributor
Posts: 114

Re: import tab file with column names at second row

Hello York,

You should be fine with the solution in this thread.

So for your example, just drop the first row by copying the file to a temporary file starting at the second row, then import the copy:

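filename exTemp1 temp;   /* temporary file managed by SAS */

data _null_;

infile 'C:\Users\esi15573\Documents\BOLDLF\D1C21D.tab' firstobs=2;   /* start at row 2, skipping row 1 */

file exTemp1;

input;   /* load the line into the input buffer without parsing it */

put _infile_;   /* write the line unchanged to the temporary file */

run;

/* The copy now has the column names in its first row */

PROC IMPORT DATAFILE=exTemp1

  OUT=SIM

  DBMS=TAB  REPLACE;

  GETNAMES=YES;

RUN;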

Let us know if you succeeded :-)

Cheers, Michael


All Replies

Occasional Contributor
Posts: 8

Re: import tab file with column names at second row

Hi mfab,

Thanks. The first part of the code works and deletes the first row, but the second part did not work: the generated data set has all the columns clustered into one column.

Any idea how to revise it?

Thanks

York

Frequent Contributor
Posts: 114

Re: import tab file with column names at second row

Hi York,

Honestly, I don't know why that would not work.

I tested it with a little file here myself and it seemed to be working properly.

Are you sure that your file is formatted correctly and delimited with tabs?

Michael
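One way to check whether the file really contains tab characters is to dump its first few records to the log. This is only a minimal sketch: it reuses the path from the original question and relies on the LIST statement, which prints a record in hexadecimal when it contains unprintable characters such as tabs.

```sas
/* Sketch: inspect the first three records of the file in the SAS log. */
data _null_;
  infile 'C:\Users\esi15573\Documents\BOLDLF\D1C21D.tab' obs=3;  /* read only records 1-3 */
  input;   /* load the record into the input buffer without parsing it */
  list;    /* write the record to the log; unprintable characters trigger hex output */
run;
```

In the hex lines of the log, a tab shows up as 09. If the fields are separated by 20 (blanks) or 2C (commas) instead, the file is not tab-delimited, which could explain why everything ends up in one column.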

Occasional Contributor
Posts: 8

Re: import tab file with column names at second row

Hi Michael,

Thanks. It might be a problem with the tab file. Once I converted it to a CSV file and ran the code again, it worked well.

Thanks for the help.

York

Super User
Posts: 7,392

Re: import tab file with column names at second row

I'm just leaving, so a quick thought: can you not just modify the first data step to read the data directly?

data SIM;

infile 'C:\Users\esi15573\Documents\BOLDLF\D1C21D.tab' firstobs=3 dlm='09'x dsd missover;   /* data start in row 3; dlm='09'x sets tab as the delimiter, since dsd alone assumes commas */

     input     VAR1 $

                  VAR2 $...

run;

You have far more control over the import that way than using proc import.  Do a search here as there are loads of examples of importing delimited files.
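A filled-out version of that idea might look like the sketch below. The column names SAMPLE_ID, VALUE1 and VALUE2 are hypothetical placeholders, since the thread never shows the real columns; only the path comes from the original question.

```sas
/* Sketch: read the tab-delimited file directly with a DATA step. */
data SIM;
  infile 'C:\Users\esi15573\Documents\BOLDLF\D1C21D.tab'
         dlm='09'x     /* tab is the delimiter */
         dsd           /* two tabs in a row mean a missing value */
         truncover     /* a short line yields missing values, not a jump to the next line */
         firstobs=3;   /* skip row 1 and the header in row 2 */

  /* Hypothetical columns: one text ID and two numeric measurements. */
  length SAMPLE_ID $ 20;
  input SAMPLE_ID $ VALUE1 VALUE2;
run;
```

The extra control comes from choosing each variable's name, type and length yourself instead of letting PROC IMPORT guess them from the first rows.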

Frequent Contributor
Posts: 114

Re: import tab file with column names at second row

Hey RW9,

I absolutely second your posting. Proc Import leaves you little control sometimes.

With my example I merely adapted the code from the thread I quoted ;-)

Have a great time, everyone!

Michael

Occasional Contributor
Posts: 8

Re: import tab file with column names at second row

Hi RW9,

Thanks for the suggestion. The reason I prefer PROC IMPORT is that it's simpler, whereas with INPUT I would need to type in all the variable names, and I have a lot of variables in the file.

Also, I have trouble defining the formats of the variables.

See the following values as an example. What's the correct format?

1.17E+00
-1.22E-01

I tried e8. and e9., but each recognizes only certain columns and not others.

Thanks,

York
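If typing the names is the main obstacle, one possible workaround is to let SAS pick them up from the header row. The sketch below rests on two assumptions that the thread does not confirm: every header in row 2 is already a valid SAS name, and every column is numeric. The macro variable name varlist is made up for the example.

```sas
/* Step 1: read only row 2 (the header) and turn its tabs into spaces. */
data _null_;
  infile 'C:\Users\esi15573\Documents\BOLDLF\D1C21D.tab'
         firstobs=2 obs=2 lrecl=32767;
  input;
  /* Assumption: each header is a valid SAS name (no blanks or special characters). */
  call symputx('varlist', translate(strip(_infile_), ' ', '09'x));
run;

%put NOTE: variables to read: &varlist;

/* Step 2: read the data rows, using the header text as the variable list. */
data SIM;
  infile 'C:\Users\esi15573\Documents\BOLDLF\D1C21D.tab'
         dlm='09'x dsd truncover firstobs=3 lrecl=32767;
  /* Assumption: all columns are numeric; character columns would need a $ after their names. */
  input &varlist;
run;
```

The %put line writes the generated list to the log, so it is easy to check the names before trusting the second step.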

Frequent Contributor
Posts: 114

Re: import tab file with column names at second row

Hi York,

e9. looks fine to me.

I would suggest you inspect a few lines more closely in order to find out whether there is anything special about the lines or numbers that are not recognized properly.

If you don't succeed, you might want to post an example of the file and your code, as well as the wrong and desired output, so we could have a look at it.

Cheers,

Michael

P.S.: You could try adding TERMSTR=CRLF to your INFILE statement (or TERMSTR=CR or TERMSTR=LF), depending on whether the file was created on Windows or Unix and where you process it. This might help with certain lines.
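One possible explanation for e8. and e9. each working on only some columns is the field width: 1.17E+00 is 8 characters wide and -1.22E-01 is 9, and an informat without the colon modifier reads a fixed number of columns regardless of the delimiter. The sketch below uses the colon modifier so that each value is read up to the next delimiter. It uses inline comma-separated data only to keep the example self-contained; for the real file the delimiter would be dlm='09'x.

```sas
/* Sketch: read E-notation values of different widths with list input. */
data sci_demo;
  infile datalines dlm=',' dsd truncover;
  /* The colon makes SAS stop at the delimiter; the standard numeric
     informat (32.) also accepts scientific notation. */
  input x :32. y :32.;
  format x y e10.;   /* display the values in scientific notation */
  datalines;
1.17E+00,-1.22E-01
-3.05E+02,4.00E-03
;
run;

proc print data=sci_demo;
run;
```

If some values still come out missing with this approach, the cause is more likely the content of those cells than the informat.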

Super User
Posts: 7,392

Re: import tab file with column names at second row

Hi,

Yes, proc import is simpler and with that you forfeit most of the control over it.  Therefore you will likely generate more work for yourself further down the line.

I agree with mfab: it's likely that there is some odd data within the file being imported.
