BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
ap1994
Calcite | Level 5

Hi everyone,

 

I am trying to import a pipe delimited .txt file with 5000 observations and 89 variables into SAS. There are a couple of issues with this data that I am trying to work around:

  1. Between observations, there is no space or carriage return. This means that between the value for the last variable of a particular observation and the value for the first variable of the next observation, there is no space or carriage return to indicate that what follows is a new observation. SAS does import some of the observations correctly, however, there are a few cases where SAS is unable to parse the first variable and thus returns a missing observation. Is there a way I can work around this issue without going into the text file and manually adding a carriage return?
  2. There is no header line with variable names, therefore SAS imports the data with generic variable names (i.e. VAR1, VAR2, etc.). I have imported a list of variable names that I created in Excel based on a SQL query and assigned them to a macro variable. Is there a way I can use this macro variable of the variable names (with a single space between names) to assign the variable names?

Here is the code that I am trying to work with:

proc import out = data_import
	datafile = "&filepath/2018_JAN_5000.txt"
	dbms = DLM replace;
	delimiter = "|";
	getnames = NO;
	DATAROW = 1;
run;

proc sql;
	select sas_variable_name
	into: variable_names separated by " "
	from variable_order
	;
quit;
%put &variable_names.;

data data_import_2;
	set data_import;
	retain &variable_names.;
run;

Thanks in advance for your help!

 ap1994

1 ACCEPTED SOLUTION

Accepted Solutions
Tom
Super User Tom
Super User

I wouldn't bother to use PROC IMPORT to read a delimited file.

 

The shell of a program to read a delimited file can be as simple as this.

data want;
  infile "&filepath/2018_JAN_5000.txt" dsd dlm='|' truncover ;
  length var1 8 var2 $20 ..... var_last 8 ;
  informat datevar mmddyy.;
  format datevar date9.;
  input var1 -- var_last ;
run;

So just copy copy the names from your sheet and paste them into your program in the LENGTH statement. Make some guesses about whether the variable is number (length of 😎 or character (length starts with $).  If unsure just set it to something like $50 and check the values after you have it in a dataset.

View solution in original post

5 REPLIES 5
ChrisNZ
Tourmaline | Level 20

1. The example provided imports fine for me using your code

 

2. You could change the file using SAS code as a pre-processing step rather than doing it manually

 

3. An alternative is to reuse the data step created by proc import and run that instead of the proc, after adding the option termstr= .

 

Tom
Super User Tom
Super User

When I review the flle you posted using this website preview function it looks like two lines of data to me.  

Are you sure the file is not just use LF only as the end of line (as is normal on Unix) instead of CR and LF (as it normal on Windows)?

Also you still occasionally see files that use CR only as the end of line (as was used on the original Apple Mac computers before they switched MacOS to a version of Unix).

 

You can use SAS to check by reading the beginning of the file as a fixed length records.  Then the actual end of line characters will be read as part of the data and will show in the log.  This program will read the first 500 bytes of the file and dump it to the SAS log.

data _null_;
  infile ""&filepath/2018_JAN_5000.txt" recfm=50 obs=10;
  input;
  list;
run;
Tom
Super User Tom
Super User

I wouldn't bother to use PROC IMPORT to read a delimited file.

 

The shell of a program to read a delimited file can be as simple as this.

data want;
  infile "&filepath/2018_JAN_5000.txt" dsd dlm='|' truncover ;
  length var1 8 var2 $20 ..... var_last 8 ;
  informat datevar mmddyy.;
  format datevar date9.;
  input var1 -- var_last ;
run;

So just copy copy the names from your sheet and paste them into your program in the LENGTH statement. Make some guesses about whether the variable is number (length of 😎 or character (length starts with $).  If unsure just set it to something like $50 and check the values after you have it in a dataset.

andreas_lds
Jade | Level 19

Just opened the file you posted with notepad++ and the file has linux/unix line breaks, so using a data step to read the file should be possible, if you add

termstr=LF

to the infile-statement.

@ChrisNZ already suggested this 😉

 

ap1994
Calcite | Level 5

Thanks everyone for your help! I ended up importing the file using a data step with termstr = LF as an option.

hackathon24-white-horiz.png

2025 SAS Hackathon: There is still time!

Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!

Register Now

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 5 replies
  • 3989 views
  • 0 likes
  • 4 in conversation