BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
lizzy28
Quartz | Level 8

,Hi,

When I tried to import a zipped csv file to SAS, the data was imported successfully. But the values of the date variable are all missing. Below is the sample data:

Sale_datecity_cdbrandMODEL_YEARUnits
1/1/2013501CV20131
1/1/2013501CV20131
1/1/2013CV20131
1/1/2013609A20131
1/1/2013717CV20121
1/1/2013717GM20131
1/1/2013619CV20121
1/1/2013619CV20131
1/1/2013613CV20131
1/1/2013676CV20131

I used the following statement:

filename sale SASZIPAM 'C\Sales.zip'                                                                                                   

                  member="Sales.csv";                                                                                                  

                                                                                                                                       

data newdata;                                                                                                                          

    infile sale (Sales.csv) DLM='2C0D'x dsd missover firstobs=2;                                                                       

    input Sale_date City_CD Brand $ MODEL_YEAR Units;                                                                                

run;

I also tried to set the date variable as character through the statement:

data newdata;                                                                                                                          

    infile sale (Sales.csv) DLM='2C0D'x dsd missover firstobs=2;                                                                       

    input Sale_date $ City_CD Brand $ MODEL_YEAR Units;                                                                                

run;

This way the values of the date variable are truncated.

Anyone has any suggestions?

Thank you.

1 ACCEPTED SOLUTION

Accepted Solutions
Tom
Super User Tom
Super User

Note that there is no need to do that in three steps when one will work just as well.

data newdata;                                                                                                                        

    infile sale (Sales.csv) DLM='2C0D'x dsd missover firstobs=2; 

    informat Sale_date ANYDTDTM.;                                                                  

    input Sale_date City_CD Brand $ MODEL_YEAR Units;                                                                              

    Sale_date=datepart(Sale_date);

    format Sale_date MMDDYY10.;

run;

View solution in original post

7 REPLIES 7
Tom
Super User Tom
Super User

You cannot read the string '1/1/2013' as a number.  You need to tell SAS to use an INFORMAT that will translate that string into a valid date.

Add these lines to your data step.

informat sale_date mmddyy10.;

format sale_date mmddyy10.;

The reason your attempt to read it as character ended up with truncation is because you did not tell SAS what length your variable was. So it used the default length of 8.

Use a LENGTH statement to set the length and your INPUT statement would work.

length sale_date_char $10 ;

lizzy28
Quartz | Level 8

Thanks, Tom.

I am not sure which step to add informat and format statements in the data step, but no matter where I added them, it does not work out. The length statement does not seem to help.

Generally, if it's simply to import a csv file to SAS , I do not have these issues.

Thanks.

Tom
Super User Tom
Super User

The issue of whether the file is coming from within a ZIP file has nothing to do with your issue about reading a date variable.

Let's look at this in two parts.

1) The ZIP file.  You can treat the ZIP file just like any other aggregate file location (that is a directory).  So point the FILENAME to the ZIP and then include the name of the individual file within the ZIP file you want to read in the INFILE statement.  If you have a recent version of SAS then you can use the ZIP engine instead of the unsupported SASZIPAM engine.

 

filename sale zip 'C:\Sales.zip' ;

...

infile sale("Sales.csv") ... ;

2) Reading a comma delimited file. You could use PROC IMPORT, but then SAS would need to guess how to format the variables.  You will get much better performance just specifying that yourself.  It is best to define the variables using a LENGTH or ATTRIB statement.  Apply informats and/or formats using INFORMAT, FORMAT or ATTRIB statements. Then the INPUT statement itself can be very simple.  Note that they data you posted is TAB delimited and not COMMA delimited.  To change from using comma to tab change the hex code 2C to 09 in your DLM= option.

 

data newdata;

  infile sale("Sales.csv") dsd dlm='2C0D'x truncover firstobs=2;

  length Sale_date City_CD 8 Brand $5 MODEL_YEAR Units 8;

  informat Sale_date mmddyy10.;

  format Sale_date mmddyy10.;

  input Sale_date -- Units;

run;



lizzy28
Quartz | Level 8

I see why informat did not work. The date variable was taken in as a datetime variable instead of a date variable (don't why).  Therefore, rather than using MMDDYY10., I use ANYDTDTM..I keep the majority of my original statement:

filename sale SASZIPAM 'C\Sales.zip'                                                                                                  

                  member="Sales.csv";                                                                                                 

data newdata;                                                                                                                         

    infile sale (Sales.csv) DLM='2C0D'x dsd missover firstobs=2;  

   informat Sale_date ANYDTDTM.;                                                                   

    input Sale_date City_CD Brand $ MODEL_YEAR Units;                                                                               

run;

data newdata;

    set newdata;

    format Sale_date date9.;

run;

data newdata;

    set newdata;

    Sale_date=datepart(Sale_date);

    format Sale_date MMDDYY10.;

run;

Tom
Super User Tom
Super User

Note that there is no need to do that in three steps when one will work just as well.

data newdata;                                                                                                                        

    infile sale (Sales.csv) DLM='2C0D'x dsd missover firstobs=2; 

    informat Sale_date ANYDTDTM.;                                                                  

    input Sale_date City_CD Brand $ MODEL_YEAR Units;                                                                              

    Sale_date=datepart(Sale_date);

    format Sale_date MMDDYY10.;

run;

CTorres
Quartz | Level 8

Try reading Sale_date using an informat:

  input Sale_date MMDDYY10. City_CD Brand $ MODEL_YEAR Units; 


CTorres

lizzy28
Quartz | Level 8

Thanks! The statement did not work this case, but we finally found out the way.

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 7 replies
  • 2605 views
  • 6 likes
  • 3 in conversation