- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
I am trying to create a report based on two datasets (Source and Lookup tables).
There is a source table that contains the source data. I need to parse out the Domain, Field Name, and Field Label from each row of string in the source table and write them out to report;
There is also a look-up table that contains exact values of Domain, Field Name and Field Label that I need to parse out from each row of the source table.
For instance:
One row of source string
CM_ING T21, , LCHGDATE ( DATETIME20 ))^n | DATE/TIME OF LAST VISIT
One row of values from the lookup table:
CM_ING(Note: Domain), LCHGDATE(Note: Field Name), DATE/TIME OF LAST VISIT(Note: Field Label) in one of three columns.
How could I best do this efficiently? Any help is greatly appreciate it!
Thank you in advance for your input / help!
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
If I understand correctly, the lookup table is used to filter the main table.
If so, you can do this:
proc sql;
select source.*
from SOURCE
inner join LOOKUP
on source.CM_ING = lookup.CM_ING
and source.LCHGDATE = lookup.LCHGDATE
and source.VISITDATE = lookup.VISITDATE ;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Thank you, Chris!
I did not describe very clearly what I wanted the program to do. Here may be the code to show the program logic:
DATA REPORT;
SET SOURCE;
Loop through LOOKUP table ...
TMP_DOMAIN=LOOKUP.DOMAIN;
TMP_NAME=LOOKUP.FIELDNAME;
TMP_LABLE=LOOKUP.LABEL;
IF ( INDEX(SOURCE.STRINGFIELD, TMP_DOMAIN) > 0 AND INDEX(SOURCE.STRINGFIELD, TMP_NAME) > 0)
THEN DO;
SOURCE.DOMAIN=TMP_DOMAIN;
SOURCE.FIELDNAME=LOOKUP.FIELDNAME;
SOURCE.FIELDLABEL=LOOKUP.LABEL;
END;
NEXT ROW (LOOKUP);
RUN;
In essence, if the source string contains the value in Lookup table, takes the domain, field name and field label value and add them to the Source table (parsing the domains, field names and field labels from the string of the source table).
I am not sure how to code the looping of LOOKUP table; I am also afraid the code may be very inefficient looping and compare each row in SOURCE with that in LOOKUP.
Thanks again for your input and help!
ZZ
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Your pseudo is still confusing.
Since this isn't a key join (because you use the index function) the default is a Cartesian product join, which is expensive.
Is the performance of something like this acceptable?
proc sql;
create table REPORT as
select coalesce(LOOKUP.FIELDNAME, SOURCE.FIELDNAME) as FIELDNAME
,coalesce(LOOKUP.DOMAIN , SOURCE.DOMAIN ) as DOMAIN
,coalesce(LOOKUP.LABEL , SOURCE.LABEL ) as LABEL
from SOURCE
left join
LOOKUP
on index(SOURCE.FIELDNAME, LOOKUP.FIELDNAME) > 0
and index(SOURCE.DOMAIN , LOOKUP.DOMAIN ) > 0;
quit;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Hi Chris,
I think I did not make it clear that there is no SOURCE.DOMAIN, SOURCE.FIELDNAME, and SORUCE.FIELDLABEL fields in the SOURCE table. The source table has a string field, by looking in the source string (INDEX) for the domain / fieldname values from the LOOKUP table , I know what the value will be for these fields in the report - those matching LOOKUP fields.
Instead of 'parsing out' the value that matched the field values from LOOKUP, I will take the values from LOOKUP directly.
I hope the below make sense:
proc sql;
create table REPORT as
/*ZZ Will this line of code evaluate to LOOKUP.DOMAIN?*/
select coalesce( SOURCE.DATA_STRING, LOOKUP.DOMAIN ) as DOMAIN
/*ZZ Will this line of code evaluate to LOOKUP.FIELDNAME?*/
, coalesce( SOURCE.DATA_STRING, LOOKUP.FIELDNAME ) as FIELD_NAME
/*ZZ Will this line of code evaluate to LOOKUP.LABEL?*/
, coalesce( SOURCE.DATA_STRING, LOOKUP.LABEL ) as FIELD_LABEL
from SOURCE
left join LOOKUP
on index(SOURCE.DATA_STRING, LOOKUP.FIELDNAME) > 0
and index(SOURCE.DATA_STRING, LOOKUP.DOMAIN ) > 0;
quit;
Thanks again!
ZZ
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
I know it will be inefficient. I can live with it for now
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Looking up 'coalesce', should I switch LOOKUP with SOURCE.DATA_STRING in the code, like below?
proc sql;
create table REPORT as
/*ZZ Will this line of code evaluate to LOOKUP.DOMAIN?*/
select coalesce( LOOKUP.DOMAIN, SOURCE.DATA_STRING ) as DOMAIN
/*ZZ Will this line of code evaluate to LOOKUP.FIELDNAME?*/
, coalesce( LOOKUP.FIELDNAME, SOURCE.DATA_STRING ) as FIELD_NAME
/*ZZ Will this line of code evaluate to LOOKUP.LABEL?*/
, coalesce( LOOKUP.LABEL, SOURCE.DATA_STRING ) as FIELD_LABEL
from SOURCE
left join LOOKUP
on index(SOURCE.DATA_STRING, LOOKUP.FIELDNAME) > 0
and index(SOURCE.DATA_STRING, LOOKUP.DOMAIN ) > 0;
quit;
Thanks again!
ZZ
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
You should.
Just give it a go and see if you get what you expect.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Hi Chris,
I tested the code. Unfortunately, it did not work. I am trying to update the code. Will let you know as soon as I get it to work. Thanks again, and a good day to you!
zz
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Yes let us know.