BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
lizzy28
Quartz | Level 8

,Hi,

When I tried to import a zipped csv file to SAS, the data was imported successfully. But the values of the date variable are all missing. Below is the sample data:

Sale_datecity_cdbrandMODEL_YEARUnits
1/1/2013501CV20131
1/1/2013501CV20131
1/1/2013CV20131
1/1/2013609A20131
1/1/2013717CV20121
1/1/2013717GM20131
1/1/2013619CV20121
1/1/2013619CV20131
1/1/2013613CV20131
1/1/2013676CV20131

I used the following statement:

filename sale SASZIPAM 'C\Sales.zip'                                                                                                   

                  member="Sales.csv";                                                                                                  

                                                                                                                                       

data newdata;                                                                                                                          

    infile sale (Sales.csv) DLM='2C0D'x dsd missover firstobs=2;                                                                       

    input Sale_date City_CD Brand $ MODEL_YEAR Units;                                                                                

run;

I also tried to set the date variable as character through the statement:

data newdata;                                                                                                                          

    infile sale (Sales.csv) DLM='2C0D'x dsd missover firstobs=2;                                                                       

    input Sale_date $ City_CD Brand $ MODEL_YEAR Units;                                                                                

run;

This way the values of the date variable are truncated.

Anyone has any suggestions?

Thank you.

1 ACCEPTED SOLUTION

Accepted Solutions
Tom
Super User Tom
Super User

Note that there is no need to do that in three steps when one will work just as well.

data newdata;                                                                                                                        

    infile sale (Sales.csv) DLM='2C0D'x dsd missover firstobs=2; 

    informat Sale_date ANYDTDTM.;                                                                  

    input Sale_date City_CD Brand $ MODEL_YEAR Units;                                                                              

    Sale_date=datepart(Sale_date);

    format Sale_date MMDDYY10.;

run;

View solution in original post

7 REPLIES 7
Tom
Super User Tom
Super User

You cannot read the string '1/1/2013' as a number.  You need to tell SAS to use an INFORMAT that will translate that string into a valid date.

Add these lines to your data step.

informat sale_date mmddyy10.;

format sale_date mmddyy10.;

The reason your attempt to read it as character ended up with truncation is because you did not tell SAS what length your variable was. So it used the default length of 8.

Use a LENGTH statement to set the length and your INPUT statement would work.

length sale_date_char $10 ;

lizzy28
Quartz | Level 8

Thanks, Tom.

I am not sure which step to add informat and format statements in the data step, but no matter where I added them, it does not work out. The length statement does not seem to help.

Generally, if it's simply to import a csv file to SAS , I do not have these issues.

Thanks.

Tom
Super User Tom
Super User

The issue of whether the file is coming from within a ZIP file has nothing to do with your issue about reading a date variable.

Let's look at this in two parts.

1) The ZIP file.  You can treat the ZIP file just like any other aggregate file location (that is a directory).  So point the FILENAME to the ZIP and then include the name of the individual file within the ZIP file you want to read in the INFILE statement.  If you have a recent version of SAS then you can use the ZIP engine instead of the unsupported SASZIPAM engine.

 

filename sale zip 'C:\Sales.zip' ;

...

infile sale("Sales.csv") ... ;

2) Reading a comma delimited file. You could use PROC IMPORT, but then SAS would need to guess how to format the variables.  You will get much better performance just specifying that yourself.  It is best to define the variables using a LENGTH or ATTRIB statement.  Apply informats and/or formats using INFORMAT, FORMAT or ATTRIB statements. Then the INPUT statement itself can be very simple.  Note that they data you posted is TAB delimited and not COMMA delimited.  To change from using comma to tab change the hex code 2C to 09 in your DLM= option.

 

data newdata;

  infile sale("Sales.csv") dsd dlm='2C0D'x truncover firstobs=2;

  length Sale_date City_CD 8 Brand $5 MODEL_YEAR Units 8;

  informat Sale_date mmddyy10.;

  format Sale_date mmddyy10.;

  input Sale_date -- Units;

run;



lizzy28
Quartz | Level 8

I see why informat did not work. The date variable was taken in as a datetime variable instead of a date variable (don't why).  Therefore, rather than using MMDDYY10., I use ANYDTDTM..I keep the majority of my original statement:

filename sale SASZIPAM 'C\Sales.zip'                                                                                                  

                  member="Sales.csv";                                                                                                 

data newdata;                                                                                                                         

    infile sale (Sales.csv) DLM='2C0D'x dsd missover firstobs=2;  

   informat Sale_date ANYDTDTM.;                                                                   

    input Sale_date City_CD Brand $ MODEL_YEAR Units;                                                                               

run;

data newdata;

    set newdata;

    format Sale_date date9.;

run;

data newdata;

    set newdata;

    Sale_date=datepart(Sale_date);

    format Sale_date MMDDYY10.;

run;

Tom
Super User Tom
Super User

Note that there is no need to do that in three steps when one will work just as well.

data newdata;                                                                                                                        

    infile sale (Sales.csv) DLM='2C0D'x dsd missover firstobs=2; 

    informat Sale_date ANYDTDTM.;                                                                  

    input Sale_date City_CD Brand $ MODEL_YEAR Units;                                                                              

    Sale_date=datepart(Sale_date);

    format Sale_date MMDDYY10.;

run;

CTorres
Quartz | Level 8

Try reading Sale_date using an informat:

  input Sale_date MMDDYY10. City_CD Brand $ MODEL_YEAR Units; 


CTorres

lizzy28
Quartz | Level 8

Thanks! The statement did not work this case, but we finally found out the way.

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 7 replies
  • 3042 views
  • 6 likes
  • 3 in conversation