BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Ronein
Onyx | Level 15

Hello

I want to import tst file with delimiter #

proc import file='/path'
    out=work.shoes
    dbms='#';
run;

I get an error 

ERROR: DBMS type # not valid for import.

 

 

What is the way to define that the delimiter is # ( sladder)

1 ACCEPTED SOLUTION

Accepted Solutions
Tom
Super User Tom
Super User

So you need to figure out WHY some of the observations are split between two records.

 

If you are lucky it is just because when the file was written they used too short of a setting for the LRECL options and sometimes the last few fields were written to a new line.  In that case try reading the file with the FLOWOVER option instead of TRUNCOVER (or MISSOVER) option and perhaps it will automatically jump to the next line properly.

 

If you are unlucky it is because there were end of line characters in the value of one of the character variables in the observation.  That is harder for SAS to read and you will probably need to fix the file first.

 

Again if you are lucky the actual lines are terminated with CR+LF pair and the embedded line breaks are just single CR or single LF characters.  That you can fix by using the TERMSTR=CRLF option on the infile statement.

 

Next level is that the fields that contain the end of line characters are quoted.  Then you can use a method like in this macro to create a version of the file that can be parsed.  https://github.com/sasutils/macros/blob/master/replace_crlf.sas

 

If the fields with the end of line characters are not quoted then perhaps you can fix it by counting the number of delimiters.  There are examples on this forum of that solution also.

View solution in original post

8 REPLIES 8
s_lassen
Meteorite | Level 14

The correct way to do it is this:

proc import file='/path'
    out=work.shoes
    dbms=dlm;
delimiter='#';
run;
Ronein
Onyx | Level 15

The code you provided is not working well, I get one column only

This code was created by using import wizard and it works well except of some problems:

1-IF the observation is written on 2 rows (instead of 1 ) then It doesnt take the data correctly.

May you please explain why your code is not working  well?

How can you adjust your code please?

thanks

 

DATA WORK.kelet;
obs=_n_;
    LENGTH
        F1               $ 55
        F2                 8
        F3                 8
        F4                 8
        F5                 8
        F6                 8
        F7                 8
        F8                 8
        F9                 8
        F10              $ 26
        F11                8
        F12                8
        F13              $ 65
        F14              $ 14
        F15              $ 90
        F16              $ 81
        F17                8
        F18              $ 1 ;
    FORMAT
        F1               $CHAR55.
        F2               BEST9.
        F3               BEST9.
        F4               BEST4.
        F5               DDMMYY10.
        F6               DDMMYY10.
        F7               BEST17.
        F8               BEST16.
        F9               BEST3.
        F10              $CHAR26.
        F11              BEST4.
        F12              BEST9.
        F13              $CHAR65.
        F14              $CHAR14.
        F15              $CHAR90.
        F16              $CHAR81.
        F17              BEST9.
        F18              $CHAR1. ;
    INFORMAT
        F1               $CHAR55.
        F2               BEST9.
        F3               BEST9.
        F4               BEST4.
        F5               DDMMYY10.
        F6               DDMMYY10.
        F7               BEST17.
        F8               BEST16.
        F9               BEST3.
        F10              $CHAR26.
        F11              BEST4.
        F12              BEST9.
        F13              $CHAR65.
        F14              $CHAR14.
        F15              $CHAR90.
        F16              $CHAR81.
        F17              BEST9.
        F18              $CHAR1. ;
    INFILE 'path'
        LRECL=285
        ENCODING="HEBREW"
        TERMSTR=CRLF
        DLM='7F'x
        MISSOVER
        DSD ;
    INPUT
        F1               : $CHAR55.
        F2               : ?? BEST9.
        F3               : ?? BEST9.
        F4               : ?? BEST4.
        F5               : ?? DDMMYY10.
        F6               : ?? DDMMYY10.
        F7               : ?? COMMA17.
        F8               : ?? COMMA16.
        F9               : ?? BEST3.
        F10              : $CHAR26.
        F11              : ?? BEST4.
        F12              : ?? BEST9.
        F13              : $CHAR65.
        F14              : $CHAR14.
        F15              : $CHAR90.
        F16              : $CHAR81.
        F17              : ?? BEST9.
        F18              : $CHAR1. ;
RUN;

HarrySnart
SAS Employee

Hello @Ronein, you can read a file with custom delimiter using the data step's infile statement. See below a simple example. I uploaded a dummy text file into SAS Studio and read in via data step. Note I added a header which I ignore in the infile statement by setting which observation to start at. 

 

HarrySnart_0-1636970623047.png

 

*Read inputs from .txt file with # delimiter;
data work.example;
length cat1 cat2 $200 num1 3. ;
infile '/home/mydata/quick_example.txt'
delimiter='#' dsd truncover firstobs=2;
input cat1 cat2 num1;
run;

*Print example output;
proc print data=work.example;title 'Parsed Infile with # Delimiter';run;

HarrySnart_1-1636970822187.png

Hope this helps answer your question.

 

Thanks

Harry

Ronein
Onyx | Level 15
Thanks,
I found that in some observations it happended that each observations is in 2 rows (instead of 1).
What is the way to tell SAS to continue add data to the observation?
I know the using of @@ ut may you show how to use it in the following :
/****A*****/
data work.example;
length cat1 cat2 $200 num1 3. ;
infile '/home/mydata/quick_example.txt'
delimiter='#' dsd truncover firstobs=2;
input cat1 cat2 num1;
run;

/****B***************/
proc import file='/path'
out=work.shoes
dbms=dlm;
delimiter='#';
run;
HarrySnart
SAS Employee

Hi @Ronein please can you share a few lines of dummy data as an example to the data you're trying to import? Using the delimiter option in the infile the data step expects the same number of columns defined on each row. If your data is very messy you can alternatively read the lines into a single row and then subset items using scan() for example, I've done this previously where some rows only contain a subset of the total expected columns.

 

Thanks

Harry

Ronein
Onyx | Level 15
Thanks.
The source data file (txt with # delimiter) has 2.5 million observations.
In most of cases each observation is located in 1 row.
But, unfortunately sometimes the observation is located in 2 rows .
In such cases using proc report wizard was not good because sas created 2 observations instead of 1( please see before the code was generated by import wizard).
My question is how to solve this problem via the code I provided before.thanks
Tom
Super User Tom
Super User

So you need to figure out WHY some of the observations are split between two records.

 

If you are lucky it is just because when the file was written they used too short of a setting for the LRECL options and sometimes the last few fields were written to a new line.  In that case try reading the file with the FLOWOVER option instead of TRUNCOVER (or MISSOVER) option and perhaps it will automatically jump to the next line properly.

 

If you are unlucky it is because there were end of line characters in the value of one of the character variables in the observation.  That is harder for SAS to read and you will probably need to fix the file first.

 

Again if you are lucky the actual lines are terminated with CR+LF pair and the embedded line breaks are just single CR or single LF characters.  That you can fix by using the TERMSTR=CRLF option on the infile statement.

 

Next level is that the fields that contain the end of line characters are quoted.  Then you can use a method like in this macro to create a version of the file that can be parsed.  https://github.com/sasutils/macros/blob/master/replace_crlf.sas

 

If the fields with the end of line characters are not quoted then perhaps you can fix it by counting the number of delimiters.  There are examples on this forum of that solution also.

Tom
Super User Tom
Super User

You tell the INPUT statement to go to a new line when reading a delimited file the same way you would when reading any text file. 

 

The character / in an input statement means to go to a new line.  So if you text file is structured to have three fields on the first line and two on the second then your code to read it might look like this:

data want;
   infile 'myfile.txt' dlm='#' ;
   length var1 var2 8 var3 $20 var4 $10 var5 8 var6 $30 ;
   input var1 var2 var3 / var4 var 5;
run;

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 8 replies
  • 2353 views
  • 2 likes
  • 4 in conversation