DATA Step, Macro, Functions and more

Read comma separated values with extra commas

Reply
Regular Contributor
Posts: 151

Read comma separated values with extra commas

Hello SAS-users.
I need to read comma delimited file into SAS dataset.
The second character variable "account_name" can contain extra commas
and the next variable "var1" always have values "0" and "1".
My solution is below. But it seems to me a bit awkward.
Could you please suggest something else?
Thanks in advance.

[pre]

data test1 ;
infile datalines truncover dsd;
length account $20. account_name $20. var1 $10. ss_ostd 8. ss_ostc 8. ost_ccy 8. client $10.;
input ;
account=left(scan(_infile_,1,','));
if length(account)=20 then do;
account_name=scan(_infile_,2,',');
var1=trim(left(scan(_infile_,3,',')));
k=4;
do while (trim(left(var1)) not in ('0' '1'));
account_name=trim(account_name)||var1;
var1=trim(left(scan(_infile_,k,',')));
k=k+1;
end;
ss_ostd=input(left(scan(_infile_,k+4,',')),17.);
ss_ostc=input(left(scan(_infile_,k+5,',')),17.);
ost_ccy=input(left(scan(_infile_,k+6,',')),17.);
client=left(scan(_infile_,k+7,','));
end; else delete;
drop k ;
datalines;
20308810700000020292,some text 5,0,6991.53,0.00,0.00,0.00,6991.53,0.00,0,client 1,
20308810000000020293,some,t,ex,t 2,0,6991.53,0.00,0.00,0.00,6991.53,0.00,0,client2,
20308810300000020294,some text 3,1,6991.53,0.00,0.00,0.00,6991.53,0.00,0,client 3,
20308810600000020936,so,me,text,4,0,13440.68,0.00,0.00,0.00,13440.68,0.00,0,client 4,
;

[/pre]
Contributor
Posts: 32

Re: Read comma separated values with extra commas

Thats interesting.. Let me work on it.. Message was edited by: Ravi Mantripragada
Respected Advisor
Posts: 3,777

Re: Read comma separated values with extra commas

IFF you KNOW how many fields there are following the pesky field.

[pre]
data a;
length dlm $1;
dlm = ',';
infile cards dsd dlm=dlm;
input @;
call scan(_infile_,2,start,_n_,',');
call scan(_infile_,-10,end,_n_,',');
length = end-start-1;
put start= end=;

input id:$25. field2 $varying50. length +1 (field3-field11)(:$10.);
list;
*CALL SCAN(, count, position, length <, <, );
datalines;
20308810700000020292,some text 5,0,6991.53,0.00,0.00,0.00,6991.53,0.00,0,client 1,
20308810000000020293,some,t,ex,t 2,0,6991.53,0.00,0.00,0.00,6991.53,0.00,0,client2,
20308810300000020294,some text 3,1,6991.53,0.00,0.00,0.00,6991.53,0.00,0,client 3,
20308810600000020936,so,me,text,4,0,13440.68,0.00,0.00,0.00,13440.68,0.00,0,client 4,
;;;;
run;
proc print;
run;
[/pre]

[pre]
The SAS System 09:02 Tuesday, March 1, 2011 11

f f
l f f f f f f f f i i
s e i i i i i i i i e e
t n e e e e e e e e l l
O a e g l l l l l l l l d d
b r n t i d d d d d d d d 1 1
s t d h d 2 3 4 5 6 7 8 9 0 1

1 22 34 11 20308810700000020292 some text 5 0 6991.53 0.00 0.00 0.00 6991.53 0.00 0 client 1
2 22 36 13 20308810000000020293 some,t,ex,t 2 0 6991.53 0.00 0.00 0.00 6991.53 0.00 0 client2
3 22 34 11 20308810300000020294 some text 3 1 6991.53 0.00 0.00 0.00 6991.53 0.00 0 client 3
4 22 35 12 20308810600000020936 so,me,text,4 0 13440.68 0.00 0.00 0.00 13440.68 0.00 0 client 4
[/pre]
Regular Contributor
Posts: 151

Re: Read comma separated values with extra commas

Thank you _null_.

I always learn a lot from your posts here.

Oleg.
Regular Contributor
Posts: 151

Re: Read comma separated values with extra commas

There are some difficulties with call scan and my actual data.

If two or more contiguous delimiters exist, CALL SCAN treats them as one.

Is it possible to solve this?
Respected Advisor
Posts: 3,777

Re: Read comma separated values with extra commas

Yes, there is an option. M

[pre]
data a;
retain dlm ',';
infile cards dsd dlm=dlm;
input @;
call scan(_infile_,2,start,_n_,dlm,'M');
call scan(_infile_,-10,end,_n_,dlm,'M');
length = end-start-1;
put start= end=;

input id:$25. field2 $varying50. length +1 (field3-field10)(Smiley Happy field11:$10.;
list;
*CALL SCAN(, count, position, length <, <, );
datalines;
20308810700000020292,some text 5,0,6991.53,0.00,0.00,0.00,6991.53,0.00,0,client 1,
20308810000000020293,some,t,ex,t 2,0,6991.53,,0.00,0.00,6991.53,0.00,0,client2,
20308810300000020294,some text 3,1,6991.53,0.00,0.00,,,0.00,0,client 3,
20308810600000020936,so,me,text,4,0,13440.68,0.00,0.00,0.00,13440.68,0.00,0,client 4,
;;;;
run;
proc print;
run;
[/pre] Message was edited by: data _null_;
Regular Contributor
Posts: 151

Re: Read comma separated values with extra commas

Thank you _null_.
I use SAS 9.1.3. There are no options.
Respected Advisor
Posts: 3,777

Re: Read comma separated values with extra commas

There are always options.

[pre]
data a;
retain dlm ',';
infile cards dsd dlm=dlm;
input @;
end = length(_infile_)+1;
do j = 1 to 10;
end = find(_infile_,',',-(end-1));
end;
start = find(_infile_,dlm,1);
length = end-start-1;
input id:$25. field2 $varying50. length +1 (field3-field10)(Smiley Happy field11:$10.;
list;
*CALL SCAN(, count, position, length <, <, );
datalines;
20308810700000020292,some text 5,0,6991.53,0.00,0.00,0.00,6991.53,0.00,0,client 1,
20308810000000020293,some,t,ex,t 2,0,6991.53,,0.00,0.00,6991.53,0.00,0,client2,
20308810300000020294,some text 3,1,6991.53,0.00,0.00,,,0.00,0,client 3,
20308810600000020936,so,me,text,4,0,13440.68,0.00,0.00,0.00,13440.68,0.00,0,client 4,
;;;;
run;
proc print;
run;
[/pre]
Regular Contributor
Posts: 151

Re: Read comma separated values with extra commas

Thank you _null_.
It works.
I admire your coding skill last three years.

Oleg.
Super User
Posts: 9,687

Re: Read comma separated values with extra commas

I also think a while.



[pre]
data test1 (drop=zero one dif first_pos end_pos);
infile datalines length=len;
input whole $varying200. len;
zero=find(whole,',0,');
one=find(whole,',1,');
if zero and one then end_pos=min(zero,one);
else end_pos=max(zero,one);
first_pos=findc(whole,',')+1;
dif=end_pos - first_pos;
account=scan(whole,1,',');
account_name=substr(whole,first_pos,dif);
client=scan(strip(whole),-1,',');

datalines;
20308810700000020292,some text 5,0,6991.53,0.00,0.00,0.00,6991.53,0.00,0,client 1,
20308810000000020293,some,t,ex,t 2,0,6991.53,0.00,0.00,0.00,6991.53,0.00,0,client2,
20308810300000020294,some text 3,1,6991.53,0.00,0.00,0.00,6991.53,0.00,0,client 3,
20308810600000020936,so,me,text,4,0,13440.68,0.00,0.00,0.00,13440.68,0.00,0,client 4,
;
run;
[/pre]




Ksharp
Regular Contributor
Posts: 151

Re: Read comma separated values with extra commas

Thank you for the new approach to my problem.

SAS soft supports our lifes.

Oleg.
N/A
Posts: 0

Re: Read comma separated values with extra commas

Hello,

Another solution which takes advantage of the reverse function and format:

[pre]
data test1 ;
infile datalines truncover dsd;

input account :$20. @;
call scan(_infile_,2,position,length,',');

_infile_=reverse(trim(substr(_infile_,position)));

input @2 client :$10. ost_ccy_ $ ss_ostc_ $ ss_ostd_ $ d1 d2 d3 d4 var1 account_name & :$20. ;

ost_ccy=input(put(ost_ccy_,$revers8.),8.);
ss_ostc=input(put(ss_ostc_,$revers8.),8.);
ss_ostd=input(put(ss_ostd_,$revers8.),8.);

format client $revers10. account_name $revers20.;
drop d: position length ost_ccy_ ss_ostc_ ss_ostd_;

datalines;
20308810700000020292,some text 5,0,6991.53,0.00,0.00,0.00,6991.53,0.00,0,client 1,
20308810000000020293,some,t,ex,t 2,0,6991.53,0.00,0.00,0.00,6991.53,0.00,0,client2,
20308810300000020294,some text 3,1,6991.53,0.00,0.00,0.00,6991.53,0.00,0,client 3,
20308810600000020936,so,me,text,4,0,13440.68,0.00,0.00,0.00,13440.68,0.00,0,client 4,
;
[/pre]

Marius
Regular Contributor
Posts: 151

Re: Read comma separated values with extra commas

Thank you Marius.
Your solution works on my actual data except few cases when the string contains embedded blank before extra comma. But this imperfection is not important and I can continue to work on my task.
See record 2 below for example.

[pre]

data test1 ;
infile datalines truncover dsd;

input account :$20. @;
call scan(_infile_,2,position,length,',');

_infile_=reverse(trim(substr(_infile_,position)));

input @2 client :$10. ost_ccy_ $ ss_ostc_ $ ss_ostd_ $ d1 :$1. d2 :$1. d3 :$1. d4 :$1. var1 :$1. account_name & :$20. ;

ost_ccy=input(put(ost_ccy_,$revers8.),8.);
ss_ostc=input(put(ss_ostc_,$revers8.),8.);
ss_ostd=input(put(ss_ostd_,$revers8.),8.);

format client $revers10. account_name $revers20.;
drop d: position length ost_ccy_ ss_ostc_ ss_ostd_;

datalines;
20308810700000020292,some text 5,0,6991.53,0.00,0.00,0.00,6991.53,0.00,0,,
20308810000000020293,some,t,ex ,t 2,0,6991.53,,0.00,0.00,6991.53,0.00,0,client2,
20308810300000020294,some text 3,1,6991.53,,0.00,0.00,6991.53,0.00,0,client 3,
20308810600000020936,so,me,text,4,0,13440.68,0.00,0.00,0.00,13440.68,0.00,0,,
;

[/pre]
N/A
Posts: 0

Re: Read comma separated values with extra commas

Hello,



ok, then switch the way account_name is read from list input to formatted input and it should be ok:



account_name & :$20. switched to account_name $20.

Marius

Regular Contributor
Posts: 151

Re: Read comma separated values with extra commas

Thank you very much.
It's perfect now.

Oleg
Ask a Question
Discussion stats
  • 14 replies
  • 843 views
  • 0 likes
  • 5 in conversation