BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
Ata62439
Calcite | Level 5
Hi All,
Currently I'm working to import data that contain (",") as a delimiter.
Since the source data is coming from txt format then I'm using infile and setting the format, and it works.
But apparently, some data has a "fake" delimiter which it should be not counted as a delimiter value.
Can you please to help me to solve the issue?

Here's the sample data:
Data test;
Delimiter=' , ' ;
Input No Company Type;
Datalines;
1,Barata,PT,A
2,Riley,A
3,,B
;
Run;

The expectation Barata,PT shouldn't being divided into 2 cells.

1 ACCEPTED SOLUTION

Accepted Solutions
Tom
Super User Tom
Super User

Your code as written cannot work since all you did was define a variable named DELIMITER that has ',' as the value.

 

To change the delimiter used when reading lines you need to use an INFILE statement.  You can use INFILE DATALINES to read from the in-line data.

 

But for the example lines you showed that will still not work because the line does not have quotes around the embedded delimiter. (What you called a fake delimiter).

 

If you use the DSD option then SAS will allow quotes around a value to hide the "fake" delimiters and automatically remove those quotes from the value.   The DSD option will also make comma the default delimiter.  And it will also treat adjacent delimiters (like in your last example line) as indicating a missing value.

 

So if you can fix your source data to be properly formatted then code like this will work.

data test;
  infile datalines dsd truncover;
  Input No Company :$20. Type :$1.;
Datalines;
1,"Barata,PT",A
2,Riley,A
3,,B
;

If you cannot fix source data to properly quote the embedded delimiters it will be extremely difficult to parse the lines into the fields you want to read.  If , as in your example, there is only ONE field that could possibly have the embedded delimiter then you should be able to read it with some extra logic.

 

Here is one way based on knowing that you expect three fields on each line:

data test2;
  infile datalines dsd truncover;
  length No 8 Company $20 Type $1 ;
  input No Company :$20. @ ;
  do counter=1 to countw(_infile_,',','q')-3 ;
    input extra :$20. @;
    company=catx(',',company,extra);
  end;
  input Type :$1.;
  drop counter extra;
datalines;
1,Barata,PT,A
2,Riley,A
3,,B
;

View solution in original post

5 REPLIES 5
Patrick
Opal | Level 21

Ideally such strings are in quotes. For a .csv that's "standard".

Here two options - if neither of them work then you need to tells us in detail how we could identify "fake" delimiters

data demo;
  infile datalines dsd truncover;
  Input No Company :$20. Type :$2.;
Datalines;
1,"Barata,PT",A
2,Riley,A
3,,B
;

data demo;
  infile datalines dsd truncover;
  Input @;
  if countc(_infile_,',')=2 then
    do;
      input No Company :$20. Type :$2.;
    end;
  else
    do;
      input No _c1 :$20. _c2 :$20. Type :$2.;
      Company=catx(',',_c1,_c2);
      drop _c:;
    end;
Datalines;
1,Barata,PT,A
2,Riley,A
3,,B
; 

 

Ata62439
Calcite | Level 5
Yes agree, ideally there's should be a quote string. And this quote string has been in place on typing the address. As for company or other field name (which I'm not identified it yet) doesn't have any quote string at all.
But I get your point, thank you for helping and giving me an insight
Tom
Super User Tom
Super User

Your code as written cannot work since all you did was define a variable named DELIMITER that has ',' as the value.

 

To change the delimiter used when reading lines you need to use an INFILE statement.  You can use INFILE DATALINES to read from the in-line data.

 

But for the example lines you showed that will still not work because the line does not have quotes around the embedded delimiter. (What you called a fake delimiter).

 

If you use the DSD option then SAS will allow quotes around a value to hide the "fake" delimiters and automatically remove those quotes from the value.   The DSD option will also make comma the default delimiter.  And it will also treat adjacent delimiters (like in your last example line) as indicating a missing value.

 

So if you can fix your source data to be properly formatted then code like this will work.

data test;
  infile datalines dsd truncover;
  Input No Company :$20. Type :$1.;
Datalines;
1,"Barata,PT",A
2,Riley,A
3,,B
;

If you cannot fix source data to properly quote the embedded delimiters it will be extremely difficult to parse the lines into the fields you want to read.  If , as in your example, there is only ONE field that could possibly have the embedded delimiter then you should be able to read it with some extra logic.

 

Here is one way based on knowing that you expect three fields on each line:

data test2;
  infile datalines dsd truncover;
  length No 8 Company $20 Type $1 ;
  input No Company :$20. @ ;
  do counter=1 to countw(_infile_,',','q')-3 ;
    input extra :$20. @;
    company=catx(',',company,extra);
  end;
  input Type :$1.;
  drop counter extra;
datalines;
1,Barata,PT,A
2,Riley,A
3,,B
;
Ata62439
Calcite | Level 5
I think it will never been solved either until the source data has been fixed by putting the quote string.
I have been searching for an article related to this and none can fix it thru the syntax.
Thanks Tom for your explanation, it really helps
Ksharp
Super User
/*
Your question is very complicated,
You need to know which variable has this extra delimiter .
*/
data demo;
input;
call scan(_infile_,1,p1,l1,',','mq');
call scan(_infile_,-1,p2,l2,',','mq');
length no company type $ 200;
no=substrn(_infile_,p1,l1);
company=substrn(_infile_,p1+l1+1,p2-p1-l1-2);
type=substrn(_infile_,p2);
keep no company type;
Datalines;
1,Barata,PT,A
2,Riley,A
3,,B
;
Run;

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

SAS Enterprise Guide vs. SAS Studio

What’s the difference between SAS Enterprise Guide and SAS Studio? How are they similar? Just ask SAS’ Danny Modlin.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 5 replies
  • 778 views
  • 0 likes
  • 4 in conversation