I am trying to create a report based on two datasets (Source and Lookup tables).
There is a source table that contains the source data. I need to parse out the Domain, Field Name, and Field Label from each row of string in the source table and write them out to report;
There is also a look-up table that contains exact values of Domain, Field Name and Field Label that I need to parse out from each row of the source table.
For instance:
One row of source string
CM_ING T21, , LCHGDATE ( DATETIME20 ))^n | DATE/TIME OF LAST VISIT
One row of values from the lookup table:
CM_ING(Note: Domain), LCHGDATE(Note: Field Name), DATE/TIME OF LAST VISIT(Note: Field Label) in one of three columns.
How could I best do this efficiently? Any help is greatly appreciate it!
Thank you in advance for your input / help!
If I understand correctly, the lookup table is used to filter the main table.
If so, you can do this:
proc sql;
select source.*
from SOURCE
inner join LOOKUP
on source.CM_ING = lookup.CM_ING
and source.LCHGDATE = lookup.LCHGDATE
and source.VISITDATE = lookup.VISITDATE ;
Thank you, Chris!
I did not describe very clearly what I wanted the program to do. Here may be the code to show the program logic:
DATA REPORT;
SET SOURCE;
Loop through LOOKUP table ...
TMP_DOMAIN=LOOKUP.DOMAIN;
TMP_NAME=LOOKUP.FIELDNAME;
TMP_LABLE=LOOKUP.LABEL;
IF ( INDEX(SOURCE.STRINGFIELD, TMP_DOMAIN) > 0 AND INDEX(SOURCE.STRINGFIELD, TMP_NAME) > 0)
THEN DO;
SOURCE.DOMAIN=TMP_DOMAIN;
SOURCE.FIELDNAME=LOOKUP.FIELDNAME;
SOURCE.FIELDLABEL=LOOKUP.LABEL;
END;
NEXT ROW (LOOKUP);
RUN;
In essence, if the source string contains the value in Lookup table, takes the domain, field name and field label value and add them to the Source table (parsing the domains, field names and field labels from the string of the source table).
I am not sure how to code the looping of LOOKUP table; I am also afraid the code may be very inefficient looping and compare each row in SOURCE with that in LOOKUP.
Thanks again for your input and help!
ZZ
Your pseudo is still confusing.
Since this isn't a key join (because you use the index function) the default is a Cartesian product join, which is expensive.
Is the performance of something like this acceptable?
proc sql;
create table REPORT as
select coalesce(LOOKUP.FIELDNAME, SOURCE.FIELDNAME) as FIELDNAME
,coalesce(LOOKUP.DOMAIN , SOURCE.DOMAIN ) as DOMAIN
,coalesce(LOOKUP.LABEL , SOURCE.LABEL ) as LABEL
from SOURCE
left join
LOOKUP
on index(SOURCE.FIELDNAME, LOOKUP.FIELDNAME) > 0
and index(SOURCE.DOMAIN , LOOKUP.DOMAIN ) > 0;
quit;
Hi Chris,
I think I did not make it clear that there is no SOURCE.DOMAIN, SOURCE.FIELDNAME, and SORUCE.FIELDLABEL fields in the SOURCE table. The source table has a string field, by looking in the source string (INDEX) for the domain / fieldname values from the LOOKUP table , I know what the value will be for these fields in the report - those matching LOOKUP fields.
Instead of 'parsing out' the value that matched the field values from LOOKUP, I will take the values from LOOKUP directly.
I hope the below make sense:
proc sql;
create table REPORT as
/*ZZ Will this line of code evaluate to LOOKUP.DOMAIN?*/
select coalesce( SOURCE.DATA_STRING, LOOKUP.DOMAIN ) as DOMAIN
/*ZZ Will this line of code evaluate to LOOKUP.FIELDNAME?*/
, coalesce( SOURCE.DATA_STRING, LOOKUP.FIELDNAME ) as FIELD_NAME
/*ZZ Will this line of code evaluate to LOOKUP.LABEL?*/
, coalesce( SOURCE.DATA_STRING, LOOKUP.LABEL ) as FIELD_LABEL
from SOURCE
left join LOOKUP
on index(SOURCE.DATA_STRING, LOOKUP.FIELDNAME) > 0
and index(SOURCE.DATA_STRING, LOOKUP.DOMAIN ) > 0;
quit;
Thanks again!
ZZ
I know it will be inefficient. I can live with it for now
Looking up 'coalesce', should I switch LOOKUP with SOURCE.DATA_STRING in the code, like below?
proc sql;
create table REPORT as
/*ZZ Will this line of code evaluate to LOOKUP.DOMAIN?*/
select coalesce( LOOKUP.DOMAIN, SOURCE.DATA_STRING ) as DOMAIN
/*ZZ Will this line of code evaluate to LOOKUP.FIELDNAME?*/
, coalesce( LOOKUP.FIELDNAME, SOURCE.DATA_STRING ) as FIELD_NAME
/*ZZ Will this line of code evaluate to LOOKUP.LABEL?*/
, coalesce( LOOKUP.LABEL, SOURCE.DATA_STRING ) as FIELD_LABEL
from SOURCE
left join LOOKUP
on index(SOURCE.DATA_STRING, LOOKUP.FIELDNAME) > 0
and index(SOURCE.DATA_STRING, LOOKUP.DOMAIN ) > 0;
quit;
Thanks again!
ZZ
You should.
Just give it a go and see if you get what you expect.
Hi Chris,
I tested the code. Unfortunately, it did not work. I am trying to update the code. Will let you know as soon as I get it to work. Thanks again, and a good day to you!
zz
Yes let us know.
Don't miss out on SAS Innovate - Register now for the FREE Livestream!
Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.