BookmarkSubscribeRSS Feed
Oleg_L
Obsidian | Level 7
Hello SAS-users.
I need to read comma delimited file into SAS dataset.
The second character variable "account_name" can contain extra commas
and the next variable "var1" always have values "0" and "1".
My solution is below. But it seems to me a bit awkward.
Could you please suggest something else?
Thanks in advance.

[pre]

data test1 ;
infile datalines truncover dsd;
length account $20. account_name $20. var1 $10. ss_ostd 8. ss_ostc 8. ost_ccy 8. client $10.;
input ;
account=left(scan(_infile_,1,','));
if length(account)=20 then do;
account_name=scan(_infile_,2,',');
var1=trim(left(scan(_infile_,3,',')));
k=4;
do while (trim(left(var1)) not in ('0' '1'));
account_name=trim(account_name)||var1;
var1=trim(left(scan(_infile_,k,',')));
k=k+1;
end;
ss_ostd=input(left(scan(_infile_,k+4,',')),17.);
ss_ostc=input(left(scan(_infile_,k+5,',')),17.);
ost_ccy=input(left(scan(_infile_,k+6,',')),17.);
client=left(scan(_infile_,k+7,','));
end; else delete;
drop k ;
datalines;
20308810700000020292,some text 5,0,6991.53,0.00,0.00,0.00,6991.53,0.00,0,client 1,
20308810000000020293,some,t,ex,t 2,0,6991.53,0.00,0.00,0.00,6991.53,0.00,0,client2,
20308810300000020294,some text 3,1,6991.53,0.00,0.00,0.00,6991.53,0.00,0,client 3,
20308810600000020936,so,me,text,4,0,13440.68,0.00,0.00,0.00,13440.68,0.00,0,client 4,
;

[/pre]
14 REPLIES 14
SAS83
Fluorite | Level 6
Thats interesting.. Let me work on it.. Message was edited by: Ravi Mantripragada
data_null__
Jade | Level 19
IFF you KNOW how many fields there are following the pesky field.

[pre]
data a;
length dlm $1;
dlm = ',';
infile cards dsd dlm=dlm;
input @;
call scan(_infile_,2,start,_n_,',');
call scan(_infile_,-10,end,_n_,',');
length = end-start-1;
put start= end=;

input id:$25. field2 $varying50. length +1 (field3-field11)(:$10.);
list;
*CALL SCAN(, count, position, length <, <, );
datalines;
20308810700000020292,some text 5,0,6991.53,0.00,0.00,0.00,6991.53,0.00,0,client 1,
20308810000000020293,some,t,ex,t 2,0,6991.53,0.00,0.00,0.00,6991.53,0.00,0,client2,
20308810300000020294,some text 3,1,6991.53,0.00,0.00,0.00,6991.53,0.00,0,client 3,
20308810600000020936,so,me,text,4,0,13440.68,0.00,0.00,0.00,13440.68,0.00,0,client 4,
;;;;
run;
proc print;
run;
[/pre]

[pre]
The SAS System 09:02 Tuesday, March 1, 2011 11

f f
l f f f f f f f f i i
s e i i i i i i i i e e
t n e e e e e e e e l l
O a e g l l l l l l l l d d
b r n t i d d d d d d d d 1 1
s t d h d 2 3 4 5 6 7 8 9 0 1

1 22 34 11 20308810700000020292 some text 5 0 6991.53 0.00 0.00 0.00 6991.53 0.00 0 client 1
2 22 36 13 20308810000000020293 some,t,ex,t 2 0 6991.53 0.00 0.00 0.00 6991.53 0.00 0 client2
3 22 34 11 20308810300000020294 some text 3 1 6991.53 0.00 0.00 0.00 6991.53 0.00 0 client 3
4 22 35 12 20308810600000020936 so,me,text,4 0 13440.68 0.00 0.00 0.00 13440.68 0.00 0 client 4
[/pre]
Oleg_L
Obsidian | Level 7
Thank you _null_.

I always learn a lot from your posts here.

Oleg.
Oleg_L
Obsidian | Level 7
There are some difficulties with call scan and my actual data.

If two or more contiguous delimiters exist, CALL SCAN treats them as one.

Is it possible to solve this?
data_null__
Jade | Level 19
Yes, there is an option. M

[pre]
data a;
retain dlm ',';
infile cards dsd dlm=dlm;
input @;
call scan(_infile_,2,start,_n_,dlm,'M');
call scan(_infile_,-10,end,_n_,dlm,'M');
length = end-start-1;
put start= end=;

input id:$25. field2 $varying50. length +1 (field3-field10)(:) field11:$10.;
list;
*CALL SCAN(, count, position, length <, <, );
datalines;
20308810700000020292,some text 5,0,6991.53,0.00,0.00,0.00,6991.53,0.00,0,client 1,
20308810000000020293,some,t,ex,t 2,0,6991.53,,0.00,0.00,6991.53,0.00,0,client2,
20308810300000020294,some text 3,1,6991.53,0.00,0.00,,,0.00,0,client 3,
20308810600000020936,so,me,text,4,0,13440.68,0.00,0.00,0.00,13440.68,0.00,0,client 4,
;;;;
run;
proc print;
run;
[/pre] Message was edited by: data _null_;
Oleg_L
Obsidian | Level 7
Thank you _null_.
I use SAS 9.1.3. There are no options.
data_null__
Jade | Level 19
There are always options.

[pre]
data a;
retain dlm ',';
infile cards dsd dlm=dlm;
input @;
end = length(_infile_)+1;
do j = 1 to 10;
end = find(_infile_,',',-(end-1));
end;
start = find(_infile_,dlm,1);
length = end-start-1;
input id:$25. field2 $varying50. length +1 (field3-field10)(:) field11:$10.;
list;
*CALL SCAN(, count, position, length <, <, );
datalines;
20308810700000020292,some text 5,0,6991.53,0.00,0.00,0.00,6991.53,0.00,0,client 1,
20308810000000020293,some,t,ex,t 2,0,6991.53,,0.00,0.00,6991.53,0.00,0,client2,
20308810300000020294,some text 3,1,6991.53,0.00,0.00,,,0.00,0,client 3,
20308810600000020936,so,me,text,4,0,13440.68,0.00,0.00,0.00,13440.68,0.00,0,client 4,
;;;;
run;
proc print;
run;
[/pre]
Oleg_L
Obsidian | Level 7
Thank you _null_.
It works.
I admire your coding skill last three years.

Oleg.
Ksharp
Super User
I also think a while.



[pre]
data test1 (drop=zero one dif first_pos end_pos);
infile datalines length=len;
input whole $varying200. len;
zero=find(whole,',0,');
one=find(whole,',1,');
if zero and one then end_pos=min(zero,one);
else end_pos=max(zero,one);
first_pos=findc(whole,',')+1;
dif=end_pos - first_pos;
account=scan(whole,1,',');
account_name=substr(whole,first_pos,dif);
client=scan(strip(whole),-1,',');

datalines;
20308810700000020292,some text 5,0,6991.53,0.00,0.00,0.00,6991.53,0.00,0,client 1,
20308810000000020293,some,t,ex,t 2,0,6991.53,0.00,0.00,0.00,6991.53,0.00,0,client2,
20308810300000020294,some text 3,1,6991.53,0.00,0.00,0.00,6991.53,0.00,0,client 3,
20308810600000020936,so,me,text,4,0,13440.68,0.00,0.00,0.00,13440.68,0.00,0,client 4,
;
run;
[/pre]




Ksharp
Oleg_L
Obsidian | Level 7
Thank you for the new approach to my problem.

SAS soft supports our lifes.

Oleg.
deleted_user
Not applicable
Hello,

Another solution which takes advantage of the reverse function and format:

[pre]
data test1 ;
infile datalines truncover dsd;

input account :$20. @;
call scan(_infile_,2,position,length,',');

_infile_=reverse(trim(substr(_infile_,position)));

input @2 client :$10. ost_ccy_ $ ss_ostc_ $ ss_ostd_ $ d1 d2 d3 d4 var1 account_name & :$20. ;

ost_ccy=input(put(ost_ccy_,$revers8.),8.);
ss_ostc=input(put(ss_ostc_,$revers8.),8.);
ss_ostd=input(put(ss_ostd_,$revers8.),8.);

format client $revers10. account_name $revers20.;
drop d: position length ost_ccy_ ss_ostc_ ss_ostd_;

datalines;
20308810700000020292,some text 5,0,6991.53,0.00,0.00,0.00,6991.53,0.00,0,client 1,
20308810000000020293,some,t,ex,t 2,0,6991.53,0.00,0.00,0.00,6991.53,0.00,0,client2,
20308810300000020294,some text 3,1,6991.53,0.00,0.00,0.00,6991.53,0.00,0,client 3,
20308810600000020936,so,me,text,4,0,13440.68,0.00,0.00,0.00,13440.68,0.00,0,client 4,
;
[/pre]

Marius
Oleg_L
Obsidian | Level 7
Thank you Marius.
Your solution works on my actual data except few cases when the string contains embedded blank before extra comma. But this imperfection is not important and I can continue to work on my task.
See record 2 below for example.

[pre]

data test1 ;
infile datalines truncover dsd;

input account :$20. @;
call scan(_infile_,2,position,length,',');

_infile_=reverse(trim(substr(_infile_,position)));

input @2 client :$10. ost_ccy_ $ ss_ostc_ $ ss_ostd_ $ d1 :$1. d2 :$1. d3 :$1. d4 :$1. var1 :$1. account_name & :$20. ;

ost_ccy=input(put(ost_ccy_,$revers8.),8.);
ss_ostc=input(put(ss_ostc_,$revers8.),8.);
ss_ostd=input(put(ss_ostd_,$revers8.),8.);

format client $revers10. account_name $revers20.;
drop d: position length ost_ccy_ ss_ostc_ ss_ostd_;

datalines;
20308810700000020292,some text 5,0,6991.53,0.00,0.00,0.00,6991.53,0.00,0,,
20308810000000020293,some,t,ex ,t 2,0,6991.53,,0.00,0.00,6991.53,0.00,0,client2,
20308810300000020294,some text 3,1,6991.53,,0.00,0.00,6991.53,0.00,0,client 3,
20308810600000020936,so,me,text,4,0,13440.68,0.00,0.00,0.00,13440.68,0.00,0,,
;

[/pre]
deleted_user
Not applicable

Hello,



ok, then switch the way account_name is read from list input to formatted input and it should be ok:



account_name & :$20. switched to account_name $20.

Marius

Oleg_L
Obsidian | Level 7
Thank you very much.
It's perfect now.

Oleg

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 14 replies
  • 4492 views
  • 0 likes
  • 5 in conversation