Hi everyone,
I am trying to import a pipe delimited .txt file with 5000 observations and 89 variables into SAS. There are a couple of issues with this data that I am trying to work around:
Here is the code that I am trying to work with:
proc import out = data_import datafile = "&filepath/2018_JAN_5000.txt" dbms = DLM replace; delimiter = "|"; getnames = NO; DATAROW = 1; run; proc sql; select sas_variable_name into: variable_names separated by " " from variable_order ; quit; %put &variable_names.; data data_import_2; set data_import; retain &variable_names.; run;
Thanks in advance for your help!
ap1994
I wouldn't bother to use PROC IMPORT to read a delimited file.
The shell of a program to read a delimited file can be as simple as this.
data want;
infile "&filepath/2018_JAN_5000.txt" dsd dlm='|' truncover ;
length var1 8 var2 $20 ..... var_last 8 ;
informat datevar mmddyy.;
format datevar date9.;
input var1 -- var_last ;
run;
So just copy copy the names from your sheet and paste them into your program in the LENGTH statement. Make some guesses about whether the variable is number (length of 😎 or character (length starts with $). If unsure just set it to something like $50 and check the values after you have it in a dataset.
1. The example provided imports fine for me using your code
2. You could change the file using SAS code as a pre-processing step rather than doing it manually
3. An alternative is to reuse the data step created by proc import and run that instead of the proc, after adding the option termstr= .
When I review the flle you posted using this website preview function it looks like two lines of data to me.
Are you sure the file is not just use LF only as the end of line (as is normal on Unix) instead of CR and LF (as it normal on Windows)?
Also you still occasionally see files that use CR only as the end of line (as was used on the original Apple Mac computers before they switched MacOS to a version of Unix).
You can use SAS to check by reading the beginning of the file as a fixed length records. Then the actual end of line characters will be read as part of the data and will show in the log. This program will read the first 500 bytes of the file and dump it to the SAS log.
data _null_;
infile ""&filepath/2018_JAN_5000.txt" recfm=50 obs=10;
input;
list;
run;
I wouldn't bother to use PROC IMPORT to read a delimited file.
The shell of a program to read a delimited file can be as simple as this.
data want;
infile "&filepath/2018_JAN_5000.txt" dsd dlm='|' truncover ;
length var1 8 var2 $20 ..... var_last 8 ;
informat datevar mmddyy.;
format datevar date9.;
input var1 -- var_last ;
run;
So just copy copy the names from your sheet and paste them into your program in the LENGTH statement. Make some guesses about whether the variable is number (length of 😎 or character (length starts with $). If unsure just set it to something like $50 and check the values after you have it in a dataset.
Just opened the file you posted with notepad++ and the file has linux/unix line breaks, so using a data step to read the file should be possible, if you add
termstr=LF
to the infile-statement.
@ChrisNZ already suggested this 😉
Thanks everyone for your help! I ended up importing the file using a data step with termstr = LF as an option.
Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.