Hi,
I'm trying to read raw data file which is .txt file
Frankfurt,Freight,Direct,$1,464,938
Frankfurt,Freight,Indirect,$198,942
Frankfurt,Freight,Other,$144,685
Frankfurt,Other,Direct,$111,193
Frankfurt,Other,Indirect,$12,057
Frankfurt,Other,Other,$10,717
I'm tried the below syntax:
data revenue;
infile "file-path/file.txt" dlm=",";
input City $ Transport $ Type $ Amount:dollar12.;
run;
my output is
City Transport Type Amount
Frankfur Freight Direct 1
Frankfur Freight Indirect 198
Frankfur Freight Other 144
Frankfur Other Direct 111
Frankfur Other Indirect 12
Frankfur Other Other 10
As you can see, in the amount variable data values are displayed wrongly.
for the value $1,464,938 the numbers after first comma are missing in my output. How to overcome this? if there is delimiter in the same variable how to write the syntax for that?
Hopefully you can contact the creator of the file and ask them to make one that is in a valid format. Values that contain delimiters need to be quoted.
Frankfurt,Freight,Direct,"$1,464,938" Frankfurt,Freight,Indirect,"$198,942"
If this is only happening in the last field then you can use FORMATTED input for that field and SAS will read right through the delimiters. (That will not work if there are other fields after it though as it will go ahead and try to read the characters from the next field also.) So remove the colon modifier in front of the in-line informat in your INPUT statement. That is telling SAS to read that field using LIST MODE instead of FORMATTED mode (even though you have specified and informat). Note there is no need to use DOLLAR as the informat name. That is just an alias for the COMMA informat. There is no need to limit the width to 12 bytes. Just go ahead and use the maximum 32 bytes. Make sure to add the TRUNCOVER option on the INFILE statement instead of the old style MISSOVER option so that SAS does not discard the last value if it is shorter than the width of the informat. While you are at make sure to give SAS some indication that your three character variables might need be longer than the default of 8 bytes.
data revenue;
infile "file-path/file.txt" dlm="," truncover;
input City :$20. Transport :$20. Type :$20. Amount comma32.;
run;
Did you open a CSV in Excel or other spread sheet? That sometimes corrupts values and may have removed quotes from around the currency amounts.
I would expect a proper CSV with dollar amounts to appear as
Frankfurt,Freight,Direct,"$1,464,938" Frankfurt,Freight,Indirect,"$198,942"
which would call for the use of the DSD (delimiter sensitive data) option on the INFILE statement. But that will expect the commas to appear inside quoted values.
data revenue; infile datalines dlm="," dsd; input City $ Transport $ Type $ Amount:dollar12.; datalines; Frankfurt,Freight,Direct,"$1,464,938" Frankfurt,Freight,Indirect,"$198,942" ; run;
Hello, thanks for the quick reply. however I used "dsd" option but it did not work out for me.
And to ans your question, no I did not open the file in any other format. I got the file as is.
Hopefully you can contact the creator of the file and ask them to make one that is in a valid format. Values that contain delimiters need to be quoted.
Frankfurt,Freight,Direct,"$1,464,938" Frankfurt,Freight,Indirect,"$198,942"
If this is only happening in the last field then you can use FORMATTED input for that field and SAS will read right through the delimiters. (That will not work if there are other fields after it though as it will go ahead and try to read the characters from the next field also.) So remove the colon modifier in front of the in-line informat in your INPUT statement. That is telling SAS to read that field using LIST MODE instead of FORMATTED mode (even though you have specified and informat). Note there is no need to use DOLLAR as the informat name. That is just an alias for the COMMA informat. There is no need to limit the width to 12 bytes. Just go ahead and use the maximum 32 bytes. Make sure to add the TRUNCOVER option on the INFILE statement instead of the old style MISSOVER option so that SAS does not discard the last value if it is shorter than the width of the informat. While you are at make sure to give SAS some indication that your three character variables might need be longer than the default of 8 bytes.
data revenue;
infile "file-path/file.txt" dlm="," truncover;
input City :$20. Transport :$20. Type :$20. Amount comma32.;
run;
thank you. The code worked out perfectly.
In the scenario where this value $1,456,568 is not the last one in the row, how do we solve that?
@vijayasri wrote:
thank you. The code worked out perfectly.
In the scenario where this value $1,456,568 is not the last one in the row, how do we solve that?
The best way is to return the file to the source and request that they send a new on in a format that can be scanned.
If you are forced to deal with a file that cannot be re-created you might be able to handle a file that has one field in the middle by counting the number of commas from both ends. So pick off the fields in front and the fields after and then whatever is left is your field with the embedded commas. If there are two such fields in the same line then that is not always possible. In that case you might need to use specific knowledge of what types of values are valid for the field and/or the fields next to it.
Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.
SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.
Find more tutorials on the SAS Users YouTube channel.