Hello
I want to import a text file that uses # as the delimiter.
proc import file='/path'
out=work.shoes
dbms='#';
run;
I get this error:
ERROR: DBMS type # not valid for import.
How do I specify that the delimiter is the # sign?
So you need to figure out WHY some of the observations are split between two records.
If you are lucky, it is just because the file was written with too short a setting for the LRECL option, so the last few fields sometimes spilled onto a new line. In that case, try reading the file with the FLOWOVER option instead of the TRUNCOVER (or MISSOVER) option, and it may automatically jump to the next line properly.
If you are unlucky, it is because there were end-of-line characters in the value of one of the character variables in the observation. That is harder for SAS to read, and you will probably need to fix the file first.
Again, if you are lucky, the actual lines are terminated with a CR+LF pair and the embedded line breaks are just single CR or single LF characters. That you can fix by using the TERMSTR=CRLF option on the INFILE statement.
The next level is when the fields that contain the end-of-line characters are quoted. Then you can use a method like the one in this macro to create a version of the file that can be parsed: https://github.com/sasutils/macros/blob/master/replace_crlf.sas
If the fields with the end-of-line characters are not quoted, then perhaps you can fix it by counting the number of delimiters. There are examples of that solution on this forum as well.
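A sketch of that delimiter-counting idea might look like the step below. The file names and the 18-field record width are assumptions for illustration, not from the original post; it also assumes no field value contains # and that the stray line break fell inside a field value.

```sas
/* Hypothetical sketch: rebuild records that were split across physical
   lines by counting '#' delimiters. A complete record is assumed to
   have 18 fields, i.e. 17 delimiters. */
data _null_;
  infile 'broken.txt' truncover;
  file 'fixed.txt';
  length line $600 part $300;
  retain line;
  input part $char300.;
  line = cats(line, part);              /* glue this physical line on */
  if countc(line, '#') >= 17 then do;   /* full record assembled      */
    put line;
    call missing(line);
  end;
run;
```

The repaired fixed.txt could then be read with an ordinary INFILE/INPUT step using DLM='#'.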
The correct way to do it is to use DBMS=DLM together with a DELIMITER statement:
proc import datafile='/path' out=work.shoes
  dbms=dlm replace;
  delimiter='#';
run;
The code you provided is not working well; I get only one column.
The code below was created with the import wizard, and it works well except for some problems:
1. If the observation is written on 2 rows (instead of 1), it doesn't read the data correctly.
Could you please explain why your code is not working well?
How can you adjust your code, please?
Thanks
DATA WORK.kelet;
obs=_n_;
LENGTH
F1 $ 55
F2 8
F3 8
F4 8
F5 8
F6 8
F7 8
F8 8
F9 8
F10 $ 26
F11 8
F12 8
F13 $ 65
F14 $ 14
F15 $ 90
F16 $ 81
F17 8
F18 $ 1 ;
FORMAT
F1 $CHAR55.
F2 BEST9.
F3 BEST9.
F4 BEST4.
F5 DDMMYY10.
F6 DDMMYY10.
F7 BEST17.
F8 BEST16.
F9 BEST3.
F10 $CHAR26.
F11 BEST4.
F12 BEST9.
F13 $CHAR65.
F14 $CHAR14.
F15 $CHAR90.
F16 $CHAR81.
F17 BEST9.
F18 $CHAR1. ;
INFORMAT
F1 $CHAR55.
F2 BEST9.
F3 BEST9.
F4 BEST4.
F5 DDMMYY10.
F6 DDMMYY10.
F7 BEST17.
F8 BEST16.
F9 BEST3.
F10 $CHAR26.
F11 BEST4.
F12 BEST9.
F13 $CHAR65.
F14 $CHAR14.
F15 $CHAR90.
F16 $CHAR81.
F17 BEST9.
F18 $CHAR1. ;
INFILE 'path'
LRECL=285
ENCODING="HEBREW"
TERMSTR=CRLF
DLM='7F'x
MISSOVER
DSD ;
INPUT
F1 : $CHAR55.
F2 : ?? BEST9.
F3 : ?? BEST9.
F4 : ?? BEST4.
F5 : ?? DDMMYY10.
F6 : ?? DDMMYY10.
F7 : ?? COMMA17.
F8 : ?? COMMA16.
F9 : ?? BEST3.
F10 : $CHAR26.
F11 : ?? BEST4.
F12 : ?? BEST9.
F13 : $CHAR65.
F14 : $CHAR14.
F15 : $CHAR90.
F16 : $CHAR81.
F17 : ?? BEST9.
F18 : $CHAR1. ;
RUN;
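Applying the FLOWOVER suggestion from the earlier reply to this program would be a one-keyword change on the INFILE statement. This is an untested sketch; it only helps if the splits come from a too-short LRECL when the file was written:

```sas
INFILE 'path'
  LRECL=285
  ENCODING="HEBREW"
  TERMSTR=CRLF
  DLM='7F'x
  FLOWOVER   /* instead of MISSOVER: keep reading fields on the next line */
  DSD ;
```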
Hello @Ronein, you can read a file with a custom delimiter using the data step's INFILE statement. See a simple example below. I uploaded a dummy text file into SAS Studio and read it in via a data step. Note that I added a header row, which I skip on the INFILE statement by setting which observation to start at.
*Read inputs from .txt file with # delimiter;
data work.example;
length cat1 cat2 $200 num1 8;
infile '/home/mydata/quick_example.txt'
delimiter='#' dsd truncover firstobs=2;
input cat1 cat2 num1;
run;
*Print example output;
proc print data=work.example;title 'Parsed Infile with # Delimiter';run;
Hope this helps answer your question.
Thanks
Harry
Hi @Ronein, please can you share a few lines of dummy data as an example of the data you're trying to import? When using the delimiter option on the INFILE statement, the data step expects the same number of columns on each row. If your data is very messy, you can alternatively read each line into a single variable and then pull out items with scan(), for example; I've done this previously where some rows contained only a subset of the expected columns.
Thanks
Harry
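A minimal sketch of that scan() approach might look like this. The file path, variable names, and the 'm' modifier choice are assumptions for illustration:

```sas
/* Hypothetical sketch: read each raw line whole, then pull fields out
   with SCAN using '#' as the delimiter. The 'm' modifier makes
   consecutive delimiters yield empty items instead of being skipped. */
data work.parsed;
  infile '/home/mydata/messy.txt' truncover;
  length raw $500 cat1 cat2 $200;
  input raw $char500.;
  cat1 = scan(raw, 1, '#', 'm');
  cat2 = scan(raw, 2, '#', 'm');
  num1 = input(scan(raw, 3, '#', 'm'), ?? best12.);
run;
```

Because each line is held in a single character variable, rows with fewer fields simply yield missing values rather than shifting later columns out of place.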
You tell the INPUT statement to go to a new line when reading a delimited file the same way you would when reading any text file.
The character / in an INPUT statement means to go to a new line. So if your text file is structured to have three fields on the first line and two on the second, then your code to read it might look like this:
data want;
  infile 'myfile.txt' dlm='#';
  length var1 var2 8 var3 $20 var4 $10 var5 8;
  input var1 var2 var3 / var4 var5;
run;