BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Stretlow
Obsidian | Level 7

Hi There.

 

I've started using a new DWH at work and they've introduced lookup tabels for practically everything.

 

I have a large dataset that contains a number (key) that links to a date that SAS is reading as character (theyre using SQL and dont know why its not recognising it as a date)

 

What I want to do conditionally search the large table by date so for example.

 

Lookup table has two variables (date is formatted as character)

 

DateKey     TranDate

1               |  2016-01-01

2               |  2016-01-02

3               |  2016-01-02

 

data set has numerous variables, one being the datekey which is formated as 6.

 

I want to merge them together but might add a condition such as:-

 

Where = (Trandate >= '01Jan2016'd);

 

is this possible? the only way I can do it is to read the entire dataset in then merge it with the lookup table but this obviously takes time due to the size of the table.

 

Any help greatly appreciated.

 

 

 

 

1 ACCEPTED SOLUTION

Accepted Solutions
Shmuel
Garnet | Level 18

To  filter records where accountkey is missing is easy, but I need to know which table holds this variable.

Let's say its origin is the B table, then the full code will be:

 

proc sql;
create table Payments as select
a.*, b.*
from work.DateKey(where=(date>'28Aug2016'd)) as A
left join DWH.vwFactSalaryTransactions(where=(account_key ne .)) as B
on (a.Datekey = b.Datekey)
; quit;

 

if accountkey is in the lookup table (A) then:

 

 

proc sql;
create table Payments as select
a.*, b.*
from work.DateKey(where=(date>'28Aug2016'd  and  account_key ne .)) as A
left join DWH.vwFactSalaryTransactions as B
on (a.Datekey = b.Datekey)
; quit;

 

View solution in original post

27 REPLIES 27
LinusH
Tourmaline | Level 20

The preferre way of using lookup tables for filtering is by using SQL inner joins. The is the usual technique when dealing with star schemas, and similar.

But since your date in the date lookup table (dimension?) is char, you can't use a SAS date constant, since that is considered as numeric. 

You could probably use the same date char format as your lookup table, even if it's not ideal. Is the data stored in SAS data sets? 

Data never sleeps
Stretlow
Obsidian | Level 7

@LinusH wrote:

The preferre way of using lookup tables for filtering is by using SQL inner joins. The is the usual technique when dealing with star schemas, and similar.

But since your date in the date lookup table (dimension?) is char, you can't use a SAS date constant, since that is considered as numeric. 

You could probably use the same date char format as your lookup table, even if it's not ideal. Is the data stored in SAS data sets? 


 

I do have a piece of code that will change the character dates to an actual date format, as the lookup table only consists of about 10000 observations I dont think this would take too long.

 

If I did this and the date was in mmddyy10. could I then use the SQL inner join?

 

If so would you able to provide an example?

 

Really apprecate the help.

 

 

Astounding
PROC Star

A few notes ...

 

Since your character date is in Y-M-D form, you can compare to it directly:

 

where = (TranDate >= '2016-01-01')

 

If you want to compare these character values to the numeric values in a larger data set, that can be done as well.  But it's not clear what that other data set contains.  Does it contain numeric dates, numeric datetimes, something else entirely?  What would it contain that lets it be joined to your lookup table in any way?  DATEKEY?

Stretlow
Obsidian | Level 7

@Astounding wrote:

A few notes ...

 

Since your character date is in Y-M-D form, you can compare to it directly:

 

where = (TranDate >= '2016-01-01')

 

If you want to compare these character values to the numeric values in a larger data set, that can be done as well.  But it's not clear what that other data set contains.  Does it contain numeric dates, numeric datetimes, something else entirely?  What would it contain that lets it be joined to your lookup table in any way?  DATEKEY?


 

Apologies if didnt make it clear.

 

Ok so both tables have a variable called DateKey which is a number and the lookupdate contains the date ( i've just converted the dates in the lookup to mmddyy10. so the character thing isnt an issue now.

 

What I would like to do is join the tabkes together by the variable datekey but only bring back obersvations from the non lookuptable where date >= 01Jan16 etc.

 

The non lookup table has millions of observations so i need to make sure that in the step im joining the tables together I am able to define some conditional criteria.

 

That make sense?

 

 

 

Astounding
PROC Star

Here are the pieces that I'm not sure about.

 

When you say you converted the lookup dates to mmddyy10 format, what does that mean?  Do you have an actual SAS date value (numeric) stored there, with a format applied for printing purposes?

 

When you have a DATE variable in the nonlookup table, what does it contain?  A date value on the SAS date scale?  A datetime value?  A character string?

 

 

Stretlow
Obsidian | Level 7

Hi. Heres the tables i have

 

Lookup Table... Datekey is a numerical format, date is now in dateformat

 

DateKey     Date

1                 01/01/2016

2                 02/01/2016

3                 03/01/2016

4                 04/01/2016

 

Transaction Table Contains multiple variables however also contains the DateKey 

 

DateKey  Var2 Var3 Var4

1

2

3

4

 

I need to join the two tables by datekey but i'd like to include the statment where the date in the joined table is = 03/01/2016. (or any other condition of my choosing)

 

I'm not 100% sure if this is possible

 

Help continues to be appreciated.

 

Astounding
PROC Star

I must be having a slow day.  Here's the part I still don't understand:  "date is now in dateformat"

 

What is actually contained in the field DATE?  A character string?  A numeric value on the SAS date scale?  A numeric value on some other scale that your data warehouse uses?  A numeric value that corresponds to a date-time rather than a date?

 

Whatever the answer, if you want to compare with date fields in the non-lookup table(s), the same questions will apply to those date fields.

Shmuel
Garnet | Level 18

Do I understand right:

 

if TRANDATE was in a SAS date (numeric) format

then you will use :  Where = (Trandate >= '01Jan2016'd);

 

As TRANDATE is a character variable you can use:

 

where = (input(trandate,yymmdd10.) >= '01JAN2016'd);

 

 

Stretlow
Obsidian | Level 7
Sorry it's my terminology.

So it's a sas date scale, a number given a format to show a proper date. This date is only in the lookup table. However both tables have a datekey this key is what the two tables need to be merged by however i only want the tables to merge data where the date is after a certain date.

As the actual date is only in the lookup table I'm not sure if this is possible.

Astounding
PROC Star

OK, I think I'm starting to understand the issues here.  You can't use a standalone WHERE statement, since the date field appears in only one data set.  This program would require the date to be in both data sets:

 

data want;

merge huge_table (in=in1) lookup_table (in=in2);

by datekey;

if in1 and in2;

where TranDate >= '01Mar2016'd;

run;

 

That would generate an error, because TranDate doesn't appear in HUGE_TABLE.

 

Instead, use the WHERE clause as a data set modifier:

 

data want;

merge huge_table (in=in1) lookup_table (in=in2 where=(TranDate >= '01Mar2016'd));

by datekey;

if in1 and in2;

run;

 

All of this assumes you're actually using MERGE to combine the data ... if you are using SQL it can be done but would look a bit different.

 

Are we finally headed in the right direction?

Shmuel
Garnet | Level 18

try:

 

proc sql;

    create table want as select

      a.*, b.*   /* change to desired variables */

     from lookup_table(where=(date='03JAN2016'd)) as a

     left  join transaction_table as b

     on (a.date_key = b.datekey)

  ; quit;

 

Is that what you want ? if not addapt it to your needs.

Stretlow
Obsidian | Level 7
I think we are heading in the right direction my friend.

I'm not at work at the moment but ill give them a shot tomorrow.

Will report back.

Thanks again
Stretlow
Obsidian | Level 7

Ok had a pop at this this morning 

 

proc sql;
create table want as select
a.DateKey, b.Date /* change to desired variables */
from DateKey(where=(date>'03JAN2016'd)) as a
left join Payments as b
on (a.Datekey = b.Datekey)
; quit;

 

but got this error merssage

 

898 proc sql;
899 create table want as select
900 a.DateKey, b.Date /* change to desired variables */
901 from DateKey(where=(date>'03JAN2016'd)) as a
902 left join Payments as b
903 on (a.Datekey = b.Datekey)
904 ;
ERROR: Column Date could not be found in the table/view identified with the correlation name B.
904! quit;
NOTE: The SAS System stopped processing this step because of errors.
NOTE: PROCEDURE SQL used (Total process time):
real time 0.00 seconds
cpu time 0.00 seconds

Stretlow
Obsidian | Level 7

Ok changed it to this

 

proc sql;
create table want as select
a.DateKey,Date, b.DateKey, TransactionAmount/* change to desired variables */
from DateKey(where=(date>'28Aug2016'd)) as a
left join Payments as b
on (a.Datekey = b.Datekey)
; quit;

 

and heres the log.

 

953 proc sql;
954 create table want as select
955 a.DateKey,Date, b.DateKey, TransactionAmount/* change to desired variables */
956 from DateKey(where=(date>'28Aug2016'd)) as a
957 left join Payments as b
958 on (a.Datekey = b.Datekey)
959 ;
WARNING: Variable DateKey already exists on file WORK.WANT.
NOTE: Table WORK.WANT created, with 223 rows and 3 columns.

959! quit;
NOTE: PROCEDURE SQL used (Total process time):
real time 0.02 seconds
cpu time 0.03 seconds

 

However while it has merged the transaction amount variable where the dates are > 28/08/2016 it has still brought back the other variables from the lookup table (i.e where the date is < 28/08/2016 its juts put the transaction amount as blank.

 

All i want to see is where the date is > 28/08/2016, think we're nearly there though 

 

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 27 replies
  • 2480 views
  • 0 likes
  • 4 in conversation