BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
wlierman
Lapis Lazuli | Level 10

I have two datasets for covid cases and covid contacts

 

Both datasets have date fields with values like the following

 

                   11SEP20:09:19:55

                   11SEP20:08:48:00  and so on through

 

                    26OCT21:00:00:00

 

What I would like to do is extract the datepart and then apply a format that could be used as a BY variable in a sort of each data set and then a merge.

 

I set up two sql queries

Cases ***;

Proc sql;
Create Table SASCHART.Epi_Cases_Export AS
SELECT  CaseID,
        CaseStatus,
        County, 
		TrueCaseDate AS Date label="Date"
FROM SASCHART.EpiExport_Cases_All_Fields
Where '01MAR20'd <= datepart(Date) <= '05NOV21'd
Order by Date;
quit;


Contacts ***;

Proc sql;
Create Table SASCHART.Arias_Contacts_Trace AS
SELECT  Contact_Person_ID,
        CaseStatus,
        County_1 AS County label = "County", 
		Created_On AS Date label="Date"
FROM SASCHART.Arias_EpiCase_AriasContacts
Where '01MAR20'd <= datepart(Date) <= '05NOV21'd
And CaseStatus ne " "
Order by Date;
quit;

But the merge

Data SASCHART.EPI_Cases_ARIAS_Contacts_Merge;
Merge SASCHART.Epi_Cases_Export (in=EPI)
      SASCHART.Arias_Contacts_Trace(in=ARIAS);
by Date;
If EPI = ARIAS;
run;

did not produce any match. 

 Data SASCHART.EPI_Cases_ARIAS_Contacts_Merge;
5868  Merge SASCHART.Epi_Cases_Export (in=EPI)
5869        SASCHART.Arias_Contacts_Trace(in=ARIAS);
5870  by Date;
5871  If EPI = ARIAS;
5872  run;

WARNING: Multiple lengths were specified for the variable County by input data set(s). This can cause
         truncation of data.
NOTE: There were 362559 observations read from the data set SASCHART.EPI_CASES_EXPORT.
NOTE: There were 167054 observations read from the data set SASCHART.ARIAS_CONTACTS_TRACE.
NOTE: The data set SASCHART.EPI_CASES_ARIAS_CONTACTS_MERGE has 0 observations and 5 variables.
NOTE: DATA statement used (Total process time):
      real time           0.22 seconds
      cpu time            0.23 seconds


The log notes multiple lengths for county which I will fix. 

Since I was merging by Date I checked what that the date part looks like

 

Data SASCHART.Arias_Contacts_Trace;
  Set SASCHART.Arias_Contacts_Trace;
  Date1 = datepart(Date);
  run;

The output looks like

 

                                                                Date                         Date1

                                                           11SEP20:09:19:55        22169  (this is a representative example)

 

I would like date to show as  mmmyy.   So a format like mmmyy. would work? 

A consistent format would likely permit the merge to execute?

 

Thank you for your help in diagnosing these issues.

 

wklierman

 

 

 

 

1 ACCEPTED SOLUTION

Accepted Solutions
Tom
Super User Tom
Super User

If you want to reference a variable you are creating in the SELECT statement then use the CALCULATED keyword.  PS always use 4 digits for years.

where '01MAR2020'd <= calculated Date <= '05NOV2021'd

Or use the original variable and datetime literals instead of date literals.

where '01MAR2020:00:00'dt <= Created_On < '06NOV2021:00:00'dt

View solution in original post

6 REPLIES 6
Reeza
Super User

Formats don't matter when you're merging by dates.

However, the datepart in the filter condition but doesn't change the underlying variable, it remains a datetime variable. 

Modify the SQL so that the datetimes are converted to dates.

 

Cases ***;

Proc sql;
Create Table SASCHART.Epi_Cases_Export AS
SELECT  CaseID,
        CaseStatus,
        County, 
		datepart(TrueCaseDate) AS Date label="Date" format=date9.
FROM SASCHART.EpiExport_Cases_All_Fields
Where '01MAR20'd <= datepart(Date) <= '05NOV21'd
Order by Date;
quit;


Contacts ***;

Proc sql;
Create Table SASCHART.Arias_Contacts_Trace AS
SELECT  Contact_Person_ID,
        CaseStatus,
        County_1 AS County label = "County", 
		datepart(Created_On) as Date label="Date" format=date9.
FROM SASCHART.Arias_EpiCase_AriasContacts
Where '01MAR20'd <= datepart(Date) <= '05NOV21'd /*only filters the date, no actual date conversion on your data happens*/
And CaseStatus ne " "
Order by Date;
quit;

Then the rest of your code should work as expected.

 


@wlierman wrote:

I have two datasets for covid cases and covid contacts

 

Both datasets have date fields with values like the following

 

                   11SEP20:09:19:55

                   11SEP20:08:48:00  and so on through

 

                    26OCT21:00:00:00

 

What I would like to do is extract the datepart and then apply a format that could be used as a BY variable in a sort of each data set and then a merge.

 

I set up two sql queries

Cases ***;

Proc sql;
Create Table SASCHART.Epi_Cases_Export AS
SELECT  CaseID,
        CaseStatus,
        County, 
		TrueCaseDate AS Date label="Date"
FROM SASCHART.EpiExport_Cases_All_Fields
Where '01MAR20'd <= datepart(Date) <= '05NOV21'd
Order by Date;
quit;


Contacts ***;

Proc sql;
Create Table SASCHART.Arias_Contacts_Trace AS
SELECT  Contact_Person_ID,
        CaseStatus,
        County_1 AS County label = "County", 
		Created_On AS Date label="Date"
FROM SASCHART.Arias_EpiCase_AriasContacts
Where '01MAR20'd <= datepart(Date) <= '05NOV21'd
And CaseStatus ne " "
Order by Date;
quit;

But the merge

Data SASCHART.EPI_Cases_ARIAS_Contacts_Merge;
Merge SASCHART.Epi_Cases_Export (in=EPI)
      SASCHART.Arias_Contacts_Trace(in=ARIAS);
by Date;
If EPI = ARIAS;
run;

did not produce any match. 

 Data SASCHART.EPI_Cases_ARIAS_Contacts_Merge;
5868  Merge SASCHART.Epi_Cases_Export (in=EPI)
5869        SASCHART.Arias_Contacts_Trace(in=ARIAS);
5870  by Date;
5871  If EPI = ARIAS;
5872  run;

WARNING: Multiple lengths were specified for the variable County by input data set(s). This can cause
         truncation of data.
NOTE: There were 362559 observations read from the data set SASCHART.EPI_CASES_EXPORT.
NOTE: There were 167054 observations read from the data set SASCHART.ARIAS_CONTACTS_TRACE.
NOTE: The data set SASCHART.EPI_CASES_ARIAS_CONTACTS_MERGE has 0 observations and 5 variables.
NOTE: DATA statement used (Total process time):
      real time           0.22 seconds
      cpu time            0.23 seconds


The log notes multiple lengths for county which I will fix. 

Since I was merging by Date I checked what that the date part looks like

 

Data SASCHART.Arias_Contacts_Trace;
  Set SASCHART.Arias_Contacts_Trace;
  Date1 = datepart(Date);
  run;

The output looks like

 

                                                                Date                         Date1

                                                           11SEP20:09:19:55        22169  (this is a representative example)

 

I would like date to show as  mmmyy.   So a format like mmmyy. would work? 

A consistent format would likely permit the merge to execute?

 

Thank you for your help in diagnosing these issues.

 

wklierman

 

 

 

 


 

wlierman
Lapis Lazuli | Level 10
Data SASCHART.Arias_Contacts_Trace;
5874    Set SASCHART.Arias_Contacts_Trace;
5875    Date1 = datepart(Date);
5876    run;

NOTE: There were 167054 observations read from the data set SASCHART.ARIAS_CONTACTS_TRACE.
NOTE: The data set SASCHART.ARIAS_CONTACTS_TRACE has 167054 observations and 5 variables.
NOTE: Compressing data set SASCHART.ARIAS_CONTACTS_TRACE decreased size by 28.78 percent.
      Compressed is 146 pages; un-compressed would require 205 pages.
NOTE: DATA statement used (Total process time):
      real time           0.14 seconds
      cpu time            0.14 seconds


5877  Proc sql;
5878  Create Table SASCHART.Epi_Cases_Export AS
5879  SELECT  CaseID,
5880          CaseStatus,
5881          County,
5882                datepart(TrueCaseDate) AS Date label="Date" format=date9.
5883  FROM SASCHART.EpiExport_Cases_All_Fields
5884  Where '01MAR20'd <= datepart(Date) <= '05NOV21'd
5885  Order by Date;
ERROR: Function DATEPART requires a numeric expression as argument 1.
ERROR: Expression using less than or equal (<=) has components that are of different data types.
ERROR: Function DATEPART requires a numeric expression as argument 1.
ERROR: Expression using less than or equal (<=) has components that are of different data types.
ERROR: The following columns were not found in the contributing tables: Date.
5886  quit;
NOTE: The SAS System stopped processing this step because of errors.
NOTE: PROCEDURE SQL used (Total process time):
      real time           0.00 seconds
      cpu time            0.01 seconds


5887  DATA SASCHART.ARIAS_EPICASE_ARIASCONTACTS;
5888   Set SASCDC_2.ARIAS_CNTY_RC_ACTIVE_1A;
5889  run;

NOTE: There were 197940 observations read from the data set SASCDC_2.ARIAS_CNTY_RC_ACTIVE_1A.
NOTE: The data set SASCHART.ARIAS_EPICASE_ARIASCONTACTS has 197940 observations and 75 variables.
NOTE: Compressing data set SASCHART.ARIAS_EPICASE_ARIASCONTACTS decreased size by 84.46 percent.
      Compressed is 769 pages; un-compressed would require 4949 pages.
NOTE: DATA statement used (Total process time):
      real time           1.16 seconds
      cpu time            1.17 seconds


5890  Proc import datafile = "I:\Health
5890! Analytics\AMB_INFORMATICS\EPI_Export__10.28.2021\COVID_19_EpiExport1027.xlsx"
5891              out = SASCHART.EpiExport_Cases_All_Fields
5892              dbms = xlsx;
NOTE: The previous statement has been deleted.
5893              sheet = 'Sheet2'
5894              replace;
                  -------
                  22
                  202
ERROR 22-322: Expecting ;.

ERROR 202-322: The option or parameter is not recognized and will be ignored.

5895              getnames=yes;
5896              run;

NOTE: Import cancelled.  Output dataset SASCHART.EPIEXPORT_CASES_ALL_FIELDS already exists.  Specify
      REPLACE option to overwrite it.
NOTE: The SAS System stopped processing this step because of errors.
NOTE: PROCEDURE IMPORT used (Total process time):
      real time           0.00 seconds
      cpu time            0.00 seconds


5897  Proc sql;
5898  Create Table SASCHART.Epi_Cases_Export AS
5899  SELECT  CaseID,
5900          CaseStatus,
5901          County,
5902                datepart(TrueCaseDate) AS Date label="Date" format=date9.
5903  FROM SASCHART.EpiExport_Cases_All_Fields
5904  Where '01MAR20'd <= datepart(Date) <= '05NOV21'd
5905  Order by Date;
ERROR: Function DATEPART requires a numeric expression as argument 1.
ERROR: Expression using less than or equal (<=) has components that are of different data types.
ERROR: Function DATEPART requires a numeric expression as argument 1.
ERROR: Expression using less than or equal (<=) has components that are of different data types.
ERROR: The following columns were not found in the contributing tables: Date.
5906  quit;
NOTE: The SAS System stopped processing this step because of errors.
NOTE: PROCEDURE SQL used (Total process time):
      real time           0.00 seconds
      cpu time            0.00 seconds


5907  Proc sql;
5908  Create Table SASCHART.Arias_Contacts_Trace AS
5909  SELECT  Contact_Person_ID,
5910          CaseStatus,
5911          County_1 AS County label = "County",
5912                   datepart(Created_On) AS Date label="Date" format=date9.
5913  FROM SASCHART.Arias_EpiCase_AriasContacts
5914  Where '01MAR20'd <= datepart(Date) <= '05NOV21'd
5915  And CaseStatus ne " "
5916  Order by Date;
ERROR: Function DATEPART requires a numeric expression as argument 1.
ERROR: Expression using less than or equal (<=) has components that are of different data types.
ERROR: Function DATEPART requires a numeric expression as argument 1.
ERROR: Expression using less than or equal (<=) has components that are of different data types.
ERROR: The following columns were not found in the contributing tables: Date.
5917  quit;
NOTE: The SAS System stopped processing this step because of errors.
NOTE: PROCEDURE SQL used (Total process time):
      real time           0.00 seconds
      cpu time            0.00 seconds




I included the log but can't do much with it to clean it up so it reads better.
Above in the proc import the replace; statement appears in error but the query produced the table.
Thank you for your help.



Reeza
Super User
 Create Table SASCHART.Epi_Cases_Export AS
 SELECT  t1.CaseID,
         CaseStatus,
          County,
                datepart(TrueCaseDate) AS Date label="Date" format=date9.
  FROM SASCHART.EpiExport_Cases_All_Fields 
  Where '01MAR20'd <= Date <= '05NOV21'd
  Order by Date;

I missed that you were using the new variable in the filter, so since it's already been converted you need to remove the date parts. I'll assume you can fix the rest of your code with this pointer. 

 

Here's a great, but longer and in depth, reference for dates and times in SAS
https://communities.sas.com/t5/SAS-Communities-Library/Working-with-Dates-and-Times-in-SAS-Tutorial/...

 

wlierman
Lapis Lazuli | Level 10
Thank you. There is still just a small error - that Date was not found in the contributing table. But I created Date. I will add Date.


 Proc sql;
5939  Create Table SASCHART.Arias_Contacts_Trace AS
5940  SELECT  Contact_Person_ID,
5941          CaseStatus,
5942          County_1 AS County label = "County",
5943                   datepart(Created_On) AS Date label="Date" format=date9.
5944  FROM SASCHART.Arias_EpiCase_AriasContacts
5945  Where '01MAR20'd <= Date <= '05NOV21'd
5946  And CaseStatus ne " "
5947  Order by Date;
ERROR: The following columns were not found in the contributing tables: Date.
5948  quit;


Thank you for your help.

Tom
Super User Tom
Super User

If you want to reference a variable you are creating in the SELECT statement then use the CALCULATED keyword.  PS always use 4 digits for years.

where '01MAR2020'd <= calculated Date <= '05NOV2021'd

Or use the original variable and datetime literals instead of date literals.

where '01MAR2020:00:00'dt <= Created_On < '06NOV2021:00:00'dt
wlierman
Lapis Lazuli | Level 10

Thank you for the direction using Calculated date in the where statement.

It cleared up the problem.

 

Thank you

 

wlierman

 

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 6 replies
  • 1166 views
  • 4 likes
  • 3 in conversation