- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Hi,
Hoping someone can help and correct the logic I am using. THank you in advance 😉
Trying to import a csv file that is separated by semicolon and the field names start on row 2 and data starts row 3
csv file shoes_startrow.csv (attached):
syntax:
proc import datafile="C:\Users\Other Request\shoes_startrow.csv"
out=Shoe_File
dbms=csv replace;
GETNAMES=NO;
DELIMITER=';';
DATAROW=2;
run;
result:
want:
Region | Product | Subsidiary | Stores | Sales | Inventory | Returns |
Africa | Boot | Addis Ababa | 12 | $29,761 | $191,821 | $769 |
Africa | Men's Casual | Addis Ababa | 4 | $67,242 | $118,036 | $2,284 |
Africa | Men's Dress | Addis Ababa | 7 | $76,793 | $136,273 | $2,433 |
Africa | Sandal | Addis Ababa | 10 | $62,819 | $204,284 | $1,861 |
Africa | Slipper | Addis Ababa | 14 | $68,641 | $279,795 | $1,771 |
Africa | Sport Shoe | Addis Ababa | 4 | $1,690 | $16,634 | $79 |
Africa | Women's Casual | Addis Ababa | 2 | $51,541 | $98,641 | $940 |
Africa | Women's Dress | Addis Ababa | 12 | $108,942 | $311,017 | $3,233 |
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Try this:
proc import datafile="C:\Users\Other Request\shoes_startrow.csv"
out=Shoe_File
dbms=csv replace;
GETNAMES=YES;
DELIMITER=';';
DATAROW=3;
run;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Don't let PROC IMPORT mess up your data when reading text files. Write the data step yourself with FIRSTOBS=3 in the INFILE statement.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
This would be the code to read your csv file:
data want;
infile "C:\Users\Other Request\shoes_startrow.csv" dlm=";" dsd firstobs=3 truncover;
input
region :$25.
product :$14.
subsidiary :$12.
stores
sales :dollar12.
inventory :dollar12.
returns :dollar12.
;
format
sales :dollar12.
inventory :dollar12.
returns :dollar12.
;
run;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
PS in order to supply an example of your data which we can use for testing, either attach the csv to your post, or open it with a text editor and copy/paste a sufficient number of lines into a window opened with this button:
Do not post data as pictures, especially not as screenshots from Excel.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Hi @Kurt_Bremser - sorry I thought I did attached the csv file.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
That file is even more messed up that you originally suggested.
11 data _null_; 12 infile "c:\downloads\shoes_startrow.csv"; 13 input; 14 list; 15 run; NOTE: The infile "c:\downloads\shoes_startrow.csv" is: Filename=c:\downloads\shoes_startrow.csv, RECFM=V,LRECL=32767,File Size (bytes)=30195, Last Modified=06Jul2022:09:28:09, Create Time=06Jul2022:09:28:09 RULE: ----+----1----+----2----+----3----+----4----+----5----+----6----+----7----+----8----+----9----+----0 1 xxxxx;;;;;;,,,,,, 17 2 Region;Product;Subsidiary;Stores;Sales;Inventory;Returns,,,,,, 62 3 "Africa;Boot;Addis Ababa;12;""$29;761"";""$191;821"";$769",,,,,, 64 4 "Africa;Men's Casual;Addis Ababa;4;""$67;242"";""$118;036"";""$2;284""",,,,,, 77 5 "Africa;Men's Dress;Addis Ababa;7;""$76;793"";""$136;273"";""$2;433""",,,,,, 76 6 "Africa;Sandal;Addis Ababa;10;""$62;819"";""$204;284"";""$1;861""",,,,,, 72
In addition the the extra line at the top and the extra line at the bottom every line is corrupted. Instead of having 7 values separated by semicolons most of the lines have one value followed by bunch of commas.
Are you sure this is your ORIGINAL file? Or did you accidentally let EXCEL automatically open the CSV file and then save that modified version of the file back as a CSV file? In general you should never let EXCEL automatically open a CSV file (other than for simple exploring) as it will make decisions about how the convert the text into a spreadsheet including changing some of the "cell" values.
If the file is like this then first convert it to a real CSV file and then read that. You could remove the header and trailer lines at the same time. Since the resulting file is not using commas as the delimiter let's not use CSV as the extension on the file.
filename have 'myfile.csv';
filename want 'myfile_fixed.txt';
data _null_;
infile have dsd truncover end=eof;
file want;
input line :$1000. ;
if _n_=1 or eof then putlog _n_= _infile_;
else put line;
run;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
So your file is not a semicolon-separated pseudo-csv (as created by Excel), but a real comma separated volume with all the information in the first column, which is enclosed in double quotes, is internally separated by semicolons and also contains double quotes. And some of the numbers have semicolons instead of commas. In short, an unbelievable mess. Who threw this piece of computing crap at you?
Anyway, this code reads your file successfully, but I include it only to show what SAS is capable of. You should not try to make sense out of this file, return it to sender, with a solid nastygram attached to it.
data want;
infile "~/shoes_startrow.csv" dlm="," dsd firstobs=3 truncover;
length
region $25
product $14
subsidiary $12
stores 8
sales 8
inventory 8
returns 8
;
format
sales
inventory
returns dollar12.
;
input line :$100.;
if not index(line,"row count");
region = scan(line,1,";");
product = scan(line,2,";");
subsidiary = scan(line,3,";");
stores = input(scan(line,4,";"),best.);
_sales = compress(scan(line,5,";") !! scan(line,6,";"),'"');
sales = input(_sales,dollar20.);
_inventory = compress(scan(line,7,";") !! scan(line,8,";"),'"');
inventory = input(_inventory,dollar20.);
_returns = compress(scan(line,9,";") !! scan(line,10,";"),'"');
returns = input(_returns,dollar20.);
drop _: line;
run;