Dear all,
I can't get SAS to read the following data correctly using list input. Any Ideas???
data LFC1819;
input Pos :$2.
Pname &$30.
Fee :comma10.;
datalines;
DM Fabinho 30,000,000
GK Alisson Becker 76,000,000
DM Naby Keita 55,000,000
FW Xherdan Shaqiri 12,000,000
;
Actually, it's only the the missing double blank between Pname and Fee that lets your INPUT statement fail. So, another approach would be to let SAS insert the missing blank:
data LFC1819;
input @;
_infile_=prxchange('s/( [\d,\.]+)/ $1/',1,_infile_);
input Pos :$2.
Pname &$30.
Fee :comma10.;
datalines;
DM Fabinho 30,000,000
GK Alisson Becker 76,000,000
DM Naby Keita 55,000,000
FW Xherdan Shaqiri 12,000,000
FW Cristiano Ronaldo (aka CR7) 112,000,000
?? John Doe .
;
I've added two more data lines to demonstrate that digits contained in names (not preceded by a blank, though) and missing values of Fee are handled correctly.
I would use a specific delimiter between the data values to avoid any confusion, see code below.
when using the "&" modifier you need two consecutive blanks to delimit the data values.
data LFC1819;
infile cards dlm=";";
input
Pos :$2.
Pname : $30.
Fee : comma10.
;
datalines4;
DM;Fabinho;30,000,000
GK;Alisson Becker;76,000,000
DM;Naby Keita;55,000,000
FW;Xherdan Shaqiri;12,000,000
;;;;
My preferred action:
go to the source of the data and ask for a better designed layout, with delimiters that do not appear in data fields, or with quotes around data fields, so that the dsd option can be used.
Right now, you would have to read the whole input line (_infile_), and extract the first and last "word" for pos and fee (use findc() to find the first and last delimiter), and put the remaining part into pname. You can imagine how "brittle" such a process is, and how much work you'd have maintaining that.
A simple delimiter change, and all your problems are gone:
data LFC1819;
infile datalines dlm=';';
input
Pos :$2.
Pname :$30.
Fee :comma10.
;
datalines4;
DM;Fabinho;30,000,000
GK;Alisson Becker;76,000,000
DM;Naby Keita;55,000,000
FW;Xherdan Shaqiri;12,000,000
;;;;
run;
As suggested by @BrunoMueller, it would be easier if you rewrote your infile. But assuming that you get it from someone else, that is not always possible. In which case you will have to read the data to temporary fields and then parse, e.g.:
data LFC1819;
length pos $2 Pname $30 str2 str3 $20;
infile cards truncover;
input pos--str3;
if lengthn(str3) then do;
call catx(' ',pname,str2);
Fee=input(str3,comma10.);
end;
else
Fee=input(str2,comma10.);
drop str2 str3;
datalines;
DM Fabinho 30,000,000
GK Alisson Becker 76,000,000
DM Naby Keita 55,000,000
FW Xherdan Shaqiri 12,000,000
;run;
Actually, it's only the the missing double blank between Pname and Fee that lets your INPUT statement fail. So, another approach would be to let SAS insert the missing blank:
data LFC1819;
input @;
_infile_=prxchange('s/( [\d,\.]+)/ $1/',1,_infile_);
input Pos :$2.
Pname &$30.
Fee :comma10.;
datalines;
DM Fabinho 30,000,000
GK Alisson Becker 76,000,000
DM Naby Keita 55,000,000
FW Xherdan Shaqiri 12,000,000
FW Cristiano Ronaldo (aka CR7) 112,000,000
?? John Doe .
;
I've added two more data lines to demonstrate that digits contained in names (not preceded by a blank, though) and missing values of Fee are handled correctly.
That solve the mystery!!
Available on demand!
Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.