When I import a CSV file directly from a zipped file (WinZip), I get an extra blank row in the SAS dataset. When I unzip the file manually (WinZip) and then import the CSV file, the resulting SAS dataset is fine (no extra blank row). I don't understand why the results differ.
For example:
After importing from the zipped file, the data looks like this:
| product | substance | code | 
| A | M | 1 | 
| B | N | 2 | 
| C | O | 3 | 
| D | P | 4 | 
| E | Q | 5 | 
| F | R | 6 | 
| G | S | 7 | 
The table should actually look like this:
| product | substance | code | 
| A | M | 1 | 
| B | N | 2 | 
| C | O | 3 | 
| D | P | 4 | 
| E | Q | 5 | 
| F | R | 6 | 
| G | S | 7 | 
code:
data check;
infile inzip("data.csv")
delimiter = ',' MISSOVER DSD firstobs=2 ;
length product $20. substance $18. code 6.;
input product $ substance $ code;
run;
Thanks in Advance!
Please post the original csv file before running it through winzip, and the zip file itself.
Sorry, I am going to start with: what? Do you have a zip file that is called data.csv? If it is a zip file, why not call it .zip? File extensions are there for a reason. It's also not helpful to call a file "data".
What happens when you import the unzipped CSV file directly? Do you get the same blank obs? I would suspect so. Most likely you have special characters in your data, such as line feeds. Can't tell without the file.
Thanks for your reply!
No, the zip file is defined with: filename inzip ZIP "C:\Users\WSingh\Data\diabetes\201809\A201809_diab_ws.zip"; I forgot to include this statement in my original post.
The zip file contains a CSV data file that I need to import into SAS without unzipping it. I am able to do this, but a blank row is created for some reason, and I am trying to find out why.
I cannot upload the file because of data security restrictions. I hope you understand!
Thanks for your help!!
"In this zip file there is a data csv file which i need to import in sas without unzipping it." - But can you not extract the file and import it manually with a plain import procedure, as a test, to see whether the problem comes from the zip step or from the data itself?
"I cannot upload the file because of data security issue.hope you understand this!" - Yes, but do understand that I cannot solve theoretical problems.
The issue is most likely in your CSV data; other than that, I don't see what help we can give.
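One way to run the test suggested above is to read the manually extracted copy with the same kind of DATA step and compare the result with the dataset read through the zip fileref. This is only a sketch: the path C:\temp\data.csv and the dataset name check_unzipped are placeholders, not values from the original post, and it assumes the dataset CHECK from the zipped read already exists.

```sas
/* Hypothetical path: wherever the CSV was extracted by hand */
filename plain "C:\temp\data.csv";

/* Same layout as the DATA step used for the zipped member */
data check_unzipped;
  infile plain dsd truncover firstobs=2;
  length product $20 substance $18 code 6;
  input product substance code;
run;

/* Compare with CHECK, the dataset read through the ZIP fileref. */
/* LISTALL also reports observations found in only one dataset.  */
proc compare base=check_unzipped compare=check listall;
run;
```

If the two datasets differ only by an observation that is present in CHECK, the zipped read is adding it; if they match, the extra row is already in the CSV itself.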
The problem is in your data. Unzip the file manually and look at the CSV with a proper editor such as Notepad++. Hex display mode will let you find the culprit.
I suspect an errant end-of-line or carriage-control character in the body of the CSV, especially if any of the data were originally entered manually into a spreadsheet that was then converted to CSV. Users find the darnedest ways to mess with file formats when forcing data into "cells".
Is there only one row like that or do you get multiple observations with missing data?
You might run proc freq on your resulting data set for the first few variables and see if some of the values appear as if they were from a different column. That can be an indicator of an extra linefeed somewhere.
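A minimal sketch of that check, assuming the dataset is named CHECK as in the original post. The PROC SQL step is an addition of mine that counts fully blank observations, which also answers the question of whether there is one such row or several.

```sas
/* Frequency of every value, including missing ones */
proc freq data=check;
  tables product substance code / missing;
run;

/* Count observations where all three variables are missing */
proc sql;
  select count(*) as blank_obs
  from check
  where missing(product) and missing(substance) and missing(code);
quit;
```

Values that clearly belong to another column (for example, a substance showing up under product) would point to a line break in the middle of a record.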
Or, if the original CSV is the result of concatenating multiple other files, you might have a blank line because of something in the concatenation process when a file is missing or empty.
Basically, look at the unzipped version of the file with intense scrutiny and with tools that will reveal odd characters such as linefeed, carriage return, vertical tab and so on.
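If an external editor is not convenient, a rough equivalent can be done in SAS itself. The sketch below assumes the fileref inzip and the member name data.csv from the original post; it reports empty records and the first control character found on each record, using the ANYCNTRL function.

```sas
data _null_;
  infile inzip("data.csv");
  input;                        /* load the raw record into _INFILE_ */
  pos = anycntrl(_infile_);     /* position of first control character, 0 if none */
  if lengthn(_infile_) = 0 then
    put "Empty record: " _n_=;
  else if pos > 0 then do;
    ch = char(_infile_, pos);   /* the offending character */
    put "Control character found: " _n_= pos= ch=$hex2.;
  end;
run;
```

In the log, 0D is a carriage return, 0A a linefeed, 0B a vertical tab and 09 a tab. Note that `_n_` here counts every record, including the header line.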
A ZIP file is like a directory.
The syntax xxx('abc.csv') is how you reference the member named abc.csv in the aggregate file location pointed to by the fileref xxx.
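To confirm which member names the zip file actually contains, the fileref can be opened like a directory. This is a sketch that reuses the FILENAME statement quoted earlier in the thread; the dataset name members is a placeholder.

```sas
filename inzip zip "C:\Users\WSingh\Data\diabetes\201809\A201809_diab_ws.zip";

/* Read the member names of the zip file with the directory functions */
data members(keep=memname);
  length memname $200;
  fid = dopen("inzip");
  if fid = 0 then stop;         /* zip file could not be opened */
  do i = 1 to dnum(fid);
    memname = dread(fid, i);
    output;
  end;
  rc = dclose(fid);
run;

proc print data=members noobs;
run;
```

The name used in infile inzip("...") has to match one of the listed members, including any folder prefix stored inside the zip file.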
What does your SAS log say?
If you are getting trouble in the output at row 5 for example then try modifying your DATA step to show the raw data lines around that point.
data check;
  infile inzip("data.csv") dsd truncover firstobs=2;
  length product $20 substance $18 code 6 ;
  input product substance code;
  if 4 <= _n_ <= 6 then list;
run;
PS: Length values do not need decimal points. They are always integers.
