- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Hello,
I am trying to import a CSV file using this code:
proc import datafile="filepath"
out=vacc_us
dbms=csv
replace;
getnames=yes;
datarow=5;
the raw file looks like this:
So want row 4 to be the column headers and the data to start at row 5. I essentially want rows 1,2 and 3 deleted. But the code I used doesn't quite work for this. I have tried using getnames=no; datarow=4;
and some other combinations but it never works out with the right column headers. Any advice?
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Try data row = 4 and getnames = yes and let us know if that works.
@marleeakerson wrote:
Hello,
I am trying to import a CSV file using this code:
proc import datafile="filepath" out=vacc_us dbms=csv replace; getnames=yes; datarow=5;
the raw file looks like this:
So want row 4 to be the column headers and the data to start at row 5. I essentially want rows 1,2 and 3 deleted. But the code I used doesn't quite work for this. I have tried using
getnames=no; datarow=4;
and some other combinations but it never works out with the right column headers. Any advice?
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Thanks for replying!
Unfortunately, that does not work. The data looks like this after that switch:
I need the data that is in line 1 to be the column headers
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
data _null_;
infile 'path to csv' firstobs=4;
file 'path to new csv';
input;
put _infile_;
run;
Then import your new file?
The code above is just skeleton/pseudocode, untested.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
My quick and dirty approach is to make a copy of the CSV, delete those first three rows before using proc import.
Then copy the data step generated by the proc import from the log, paste into the editor and clean up the code (you are going to have some very ugly variable names that likely should be labels. I know I am not going to type a variable name like People_in_ltc_with_2_doses. I would name the variable(s) Dose1 and Dose2 and then use that text as the LABEL).
You are also likely to have values that should be numeric come in as character because of the N/A values. I would create a custom informat to read the N/A as missing to avoid the next step of fixing values.
Example:
proc format; invalue myna (upcase) 'N/A', ' ' = . other = [best12.] ; run; data example; infile datalines dlm=',' dsd; informat value1 value2 myna.; input value1 value2; datalines; 123,456 n/a,999 16,N/A ,888 ;
Modify the codes Infile statement to point to the original, use the proper Firstobs option to read data at line 5.
I am assuming you will have multiple files to read as time goes on and will need to repeat this. In which case you want to get away from proc import as quickly as possible.
Change the infile and data set name for output of different files read.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Why are you using PROC IMPORT to read a delimited text file what has only four variables?
Just write your own data step.
data vacc_us ;
infile "filepath" dsd truncover firstobs=5;
input state :$20. LTC OneDose TwoDose;
run;
Plus now you could read the timestamp from the second row if you wanted.