
Dates not being read properly.

Occasional Contributor
Posts: 6


I am trying to read the following file into my SAS code but it is not reading the dates correctly. Could someone please help?

Super User
Posts: 17,828

Re: Dates not being read properly.

Please post the code that 'is not working'.

Also, when you say it's not reading the dates correctly, what do you mean by that?

Occasional Contributor
Posts: 6

Re: Dates not being read properly.

I mean that when I try the line or column input, SAS omits the values for some of the observations. 

I tried the following:

 

data gappy;
infile "/home/opoudyal0/sasuser.v94/gappy.txt" MISSOVER FIRSTOBS=2;
input BRTHDTC $10. RANDDTC $9. TRTSDTC :$10. TRTEDTC :$10. RACE $25. ETHNIC $22. COMPREAS $17. DISCREAS $21. COUNTRY $3. STATE $2. INVID SCRNID SUBJID SUBJNIT $ SEX $ COMPTRF $ COMPSTF $ TRT01PN ;
RUN;

Super User
Posts: 10,500

Re: Dates not being read properly.


Since your date data looks something like:

BRTHDTC	        RANDDTC	        TRTSDTC	 
11/5/1969	1/1/2013	1/14/2013	2016	       
11/13/1968	1/1/2013			                   
8/10/1960	20-Oct-13	20-Nov-13	16-Dec-15 
10/26/1966				                         
1957-06				                            
3/20/1938	8/10/2013	10/13/2013	12/15/2016
11/13/1958	12-Nov-13	4-Dec-13	16-Jan-16    
3/26/1971	12-Jan-13	1/28/2013	2/4/2015  
9/12/1958	1/15/2013	2/15/2013	2/4/2016  
11/9/1943	5/10/2013	5/15/2013		       

Some values are missing in the input file itself. Are those the ones you mean by "SAS omits the values for some of the observations"? The data also looks tab-delimited, so you may want to try adding dlm='09'x to the INFILE statement, along with DSD so that two consecutive tabs are read as a missing value.

 

 

You might also want to read the dates with an actual date informat such as ANYDTDTE. so that they are stored as SAS date values. That makes them easier to manipulate after they are read, and you can assign one common format to all of the date variables.
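As a rough sketch of both suggestions together, the step below reads a few rows taken from the sample above with a delimiter, DSD, and ANYDTDTE. To keep it self-contained it uses in-stream data with a visible '|' delimiter as a stand-in; for the real tab-delimited file you would point INFILE at the file and use dlm='09'x instead. The data set name dates_demo and the DATESTYLE=MDY setting are assumptions, and partial values such as 1957-06 or 2016 may still not convert the way you want.

```sas
/* Assumption: ambiguous values such as 11/5/1969 are month/day/year */
options datestyle=mdy;

data dates_demo;                      /* hypothetical data set name */
   /* '|' stands in for the tab; use dlm='09'x for the real file.   */
   /* DSD makes two delimiters in a row count as a missing value.   */
   infile datalines dlm='|' dsd truncover;
   informat BRTHDTC RANDDTC TRTSDTC TRTEDTC anydtdte20.;
   format   BRTHDTC RANDDTC TRTSDTC TRTEDTC date9.;
   input BRTHDTC RANDDTC TRTSDTC TRTEDTC;
   datalines;
8/10/1960|20-Oct-13|20-Nov-13|16-Dec-15
11/13/1968|1/1/2013||
3/26/1971|12-Jan-13|1/28/2013|2/4/2015
;
run;

proc print data=dates_demo;
run;
```

Because the informat is set in an INFORMAT statement and the INPUT statement uses plain list input, each field is read up to the next delimiter rather than for a fixed number of columns.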

Occasional Contributor
Posts: 6

Re: Dates not being read properly.

With the code I was using, one of the values being omitted was actually the 3 in 12-Nov-13 in RANDDTC. I tried the tab delimiter earlier, but to no avail. I am just beginning, so I am still quite confused.

Super User
Posts: 17,828

Re: Dates not being read properly.

This is a really ugly file. The tab isn't even being used consistently as a delimiter, at least in the sample you provided.

You're going to have to do a lot of data cleaning with this data.

 

Here's what I cobbled together, and I know it's not correct. I used the standard approach: run PROC IMPORT, copy the generated code from the log, and start modifying from there. It's a starting point. I'm not even sure what to recommend to fix it, besides manually verifying the data. It's going to be very difficult to trust the data and results. I hope I'm wrong and someone else has a quick way to read this in.

 

   /**********************************************************************
   *   PRODUCT:   SAS
   *   VERSION:   9.4
   *   CREATOR:   External File Interface
   *   DATE:      17JAN17
   *   DESC:      Generated SAS Datastep Code
   *   TEMPLATE SOURCE:  (None Specified.)
   ***********************************************************************/
      data WORK.GAPPY    ;
      %let _EFIERR_ = 0; /* set the ERROR detection macro variable */
      infile 'C:\Users\fareeza.khurshed\Downloads\gappy.txt' delimiter='09'x truncover DSD firstobs=2 ;
         informat BRTHDTC $10. ;
         informat RANDDTC $10.;
         informat TRTSDTC $10. ;
         informat TRTEDTC $25. ;
         informat RACE $49. ;
         informat ETHNIC $44. ;
         informat COMPREAS $44. ;
         informat DISCREAS $25. ;
         informat COUNTRY_STATE $5. ;
         informat INVID $4. ;
         informat SCRNID $4. ;
         informat SUBJID $13. ;
         informat SUBJINIT_SEX_COMPTRF_COMPSTF $13. ;
         informat TRT01PN $7. ;

         format BRTHDTC $10. ;
         format RANDDTC $10. ;
         format TRTSDTC $10. ;
         format TRTEDTC $25. ;
         format RACE $49. ;
         format ETHNIC $44. ;
         format COMPREAS $44. ;
         format DISCREAS $25. ;
         format COUNTRY_STATE $5. ;
         format INVID $4. ;
         format SCRNID $4. ;
         format SUBJID $13. ;
         format SUBJINIT_SEX_COMPTRF_COMPSTF $13. ;
         format TRT01PN $7. ;

      input
                  BRTHDTC $
                  RANDDTC $
                  TRTSDTC $
                  TRTEDTC $
                  RACE $
                  ETHNIC $
                  COMPREAS $
                  DISCREAS $
                  COUNTRY_STATE $
                  INVID $
                  SCRNID $
                  SUBJID $
                  SUBJINIT_SEX_COMPTRF_COMPSTF $
                  TRT01PN $
      ;
      if _ERROR_ then call symputx('_EFIERR_',1);  /* set ERROR detection macro variable */
      run;
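
For reference, a PROC IMPORT call that would produce generated code like the above in the log might look roughly like this. It is only a sketch: the file path is the one from the INFILE statement above, and the DBMS=TAB, GETNAMES and GUESSINGROWS settings are assumptions, not necessarily the options that were actually used.

```sas
/* Sketch only: the options here are assumed, not taken from the post. */
proc import datafile='C:\Users\fareeza.khurshed\Downloads\gappy.txt'
            out=work.gappy
            dbms=tab        /* treat the file as tab-delimited          */
            replace;
   getnames=yes;            /* take variable names from the first row   */
   guessingrows=max;        /* scan all rows when guessing types/widths */
run;
```

After it runs, the log contains the generated DATA step, which can be copied into the editor and modified.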


Super User
Posts: 10,500

Re: Dates not being read properly.

One thing about placing informats on the INPUT statement is that the approach was originally designed for fixed-column data. When you write input VAR $25.; SAS reads all 25 characters even if the value ends after 6 or 7. So reading 1957-10 with $9. also reads the next two characters. If those happen to be tabs, the next variable may not align properly, and you get spaces at the front or values truncated at the end.
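
A small sketch of the difference, using in-stream data with a visible '|' in place of the tab (the data set and variable names are made up for illustration). The first step uses formatted input as described above; the second adds the colon modifier, which tells SAS to stop at the delimiter while still applying the informat.

```sas
/* Formatted input: $9. always consumes 9 columns and ignores delimiters */
data fixed_read;
   infile datalines dlm='|' truncover;
   input birth $9. rand $9.;
   /* birth = "1957-06|1"  rand = "/1/2013" */
   datalines;
1957-06|1/1/2013
;
run;

/* Modified list input: the colon makes each read stop at the delimiter */
data list_read;
   infile datalines dlm='|' dsd truncover;
   input birth :$9. rand :$9.;
   /* birth = "1957-06"  rand = "1/1/2013" */
   datalines;
1957-06|1/1/2013
;
run;
```

In the first step the delimiter and the leading 1 of the second value end up inside birth, which is the same kind of misalignment described above.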

 

The inconsistent presence of tabs is going to be hard to fix, though.
