Desktop productivity for business analysts and programmers

How to handle names generated when importing a file with headers in line 1?

Accepted Solution Solved
Reply
Frequent Contributor
Posts: 79
Accepted Solution

How to handle names generated when importing a file with headers in line 1?

Hi, I am using EG6.1. I sometimes import plain txt files, that contain data from line 2 onwards. In line 1 there are names of colums. I can chose the import step to recognized these as variable names. If these names are non-compliant with SAS variable names, they will be converted during the import. My problem is how to reference these in later data steps in regular SAS programs.

For instance a column gets this name: 'Auto H2O2 data'n. I am writing it exactly as it is mentioned in the properties tab seen when right clicking the column heading in the imported data in the EG61 environment. What is the right way to access this variable?

And in general where in the documentation is this discussed?

Regards

Poul Ravn Sørensen


Accepted Solutions
Solution
‎02-12-2015 05:07 AM
SAS Employee
Posts: 340

Re: How to handle names generated when importing a file with headers in line 1?

Hi Poul,

By default EG does not convert variable names to the "V7 format". The column name is exatly the copy from the first row.

You can reference those column names in a data step with the same syntax you use in your original post.

For example:

data want;

     set have;/*result of import*/

     'Auto H2O2 data'n= 'Auto H2O2 data'n + 1;  /*using "named constant"to reference a column*/

run;

Does this code work for you? If not, what is the error message?

View solution in original post


All Replies
Grand Advisor
Posts: 17,393

Re: How to handle names generated when importing a file with headers in line 1?

Look at option validvarname in the documentation.

I recommend changing your default in EG to V7 compares to present value of ANY.

Frequent Contributor
Posts: 79

Re: How to handle names generated when importing a file with headers in line 1?

Hi Reeza, as you anticipated, the option was set to 'ANY', not V7. I tried to change it and run again an import step from an existing project. Now it gives lots of errors on account of the import step trying to name the columns according to the text (labels) in the first row of data in the flat file. (NB, it is not an Excel file, but a tab-delimited file, intended to imported into Excel originally, but for efficiency reasons now treated with SAS, which is much more powerful.)

So it comes back to the other part of the question: How to reference the variables that got names marked with 'my odd variable'n in the import step.

I tried to look in the documentation, but did not find the key for this , - yet.

Regards Poul

Esteemed Advisor
Posts: 6,685

Re: How to handle names generated when importing a file with headers in line 1?

Drop the original names and use valid standard SAS ("V7") names.

Read the file manually in a data step with firstobs=2.

That way you avoid all the hassle with the 'some shit in here'n constructs. And you get the additional benefit that the data step may/will fail if the file structure changes unexpectedly.

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
Frequent Contributor
Posts: 79

Re: How to handle names generated when importing a file with headers in line 1?

Thanks for suggesting this.

However then names well then be just F1, F2 etc. I would be nice to able to take advantage of the fact that the variable names are - almost - available.

Is there a way to read the first line only and then use the strings there as variable names, with edits when the names contain illegal characters? Of course I could just have a data step with lots of renames but still, - not quite so elegant.

Regards

Poul

Esteemed Advisor
Posts: 6,685

Re: How to handle names generated when importing a file with headers in line 1?

You can do this to automate it:

Read the file, but only line 1 (obs=1).

Use the scan() function to extract the variable names from the line.

Use the translate function to get rid of the characters.

Then use call execute to generate a data step (that starts with firstobs=2) with the corrected variable names.

But you still need to get a grip on the problem of variable types and lengths.

Correctly documented infile formats save you all this. Sadly, tools like Excel prevent users from developing good data processing habits.

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
Frequent Contributor
Posts: 79

Re: How to handle names generated when importing a file with headers in line 1?

So, do you see a need for a macro that grabs the variable names from reading line 1 for re-use when reading from line 2? I mean, reading from line 2 requires an input statement, which must specify which variables to read.

Or will what you suggest be able to do this without as macro string transfer mechanism?

Poul

Esteemed Advisor
Posts: 6,685

Re: How to handle names generated when importing a file with headers in line 1?

You write a data step that generates the data step:

data _null_;

call execute ("data want;infile 'path_to_infile' firstobs=2;input");

infile "path_to_infile" obs=1;

input;

do i = 1 to countw(_infile_,';');

  varname = scan(_infile_,i,';');

  varname = translate(varname,"_"," ");

  call execute(varname);

end;

call execute(';run;');

run;

(This is just out of my head, so completely untested, but it documents the idea)

But I still recommend against such tries at automatics. Have a solid documentation of the infile, and import according to that.

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
Solution
‎02-12-2015 05:07 AM
SAS Employee
Posts: 340

Re: How to handle names generated when importing a file with headers in line 1?

Hi Poul,

By default EG does not convert variable names to the "V7 format". The column name is exatly the copy from the first row.

You can reference those column names in a data step with the same syntax you use in your original post.

For example:

data want;

     set have;/*result of import*/

     'Auto H2O2 data'n= 'Auto H2O2 data'n + 1;  /*using "named constant"to reference a column*/

run;

Does this code work for you? If not, what is the error message?

Frequent Contributor
Posts: 79

Re: How to handle names generated when importing a file with headers in line 1?

Hi Gergely,

the suggested notation works. I can use the variable name as shown in an assignment state, where a new variable is equated with the one referenced 'xxxxx'n. So it is recognized. I think this will work in general.

Thanks

Poul

☑ This topic is SOLVED.

Need further help from the community? Please ask a new question.

Discussion stats
  • 9 replies
  • 461 views
  • 6 likes
  • 4 in conversation