BookmarkSubscribeRSS Feed
keen_sas
Quartz | Level 8

This is an SQL condition for joining two datasets separated in a variable text as below.


text="AB.name = dsn.pname and AB.ID = dsn.varID and AB.custid=dsn.customerid" ;

 

I need to pick the variables from dataset dsn only and store all the dsn variables from this condition into another variable. I have tried the following code using array and do loop , but something is missing from the second iteration. Variables are not created
from second iteration, as a result values are also not created. May i know what is wrong in this or is their any other short cut method to
generate this . I want the final output like this.(all variables in DSN dataset presnet in the text condition are separated) 

Output=pname varid customerid

 

data a;
set text ;
text="AB.name = dsn.pname and AB.ID = dsn.varID and AB.custid=dsn.customerid";
mod='i';
domain="dsn";
cnt=count(text,strip(domain));
array str {3} $40. ;
array fdf {3};
do i = 1 to 3 ;
fdf[i]=find(text,strip(domain),mod,fdf[i]+i) ;
str{i}=scan(substr(text,fdf(i)+1),2,". ","M") ;
end ;
outvar=catx(' ',str:) ;
run ;

 

Any help would be appreciated 

3 REPLIES 3
s_lassen
Meteorite | Level 14

Pearl Regular Expressions are probably the easiest, e.g.:

data test;
  text="AB.name = dsn.pname and AB.ID = dsn.varID and AB.custid=dsn.customerid";
  prxid=prxparse('/(?<=\bdsn\.)\S*/i');
  start=1;
  pos=0;
  len=0;
  do until(0);
    call prxnext(prxid,start,-1,text,pos,len);
    if len=0 then leave;
    outvar=substr(text,pos,len);
    output;
    end;
run;

An explanation of the search string in PRXPARSE: "(?<=\bdsn\.)" is a look-behind assertion: it has to be there but is not included in the output. The meaning is a word boundary (\b) plus the text "dsn." ("." has to be escaped with a "\"). After that comes "\S*", meaning zero or more non whitespace characters, which is the string you are looking for. the final "i" just means that the search is case independent, the search will find "DSN." as well as "dsn.".

keen_sas
Quartz | Level 8
Thank you S_Lassen for quick response. I have one query with respect to PRXPARSE. If i am not sure what exactly is the name of the dataset , the name of the dataset anything between A- Z and sometimes it is inclusive of _ also like AB_C..In general if i want to include all the alphabets along with _ how do i parse these in PRXPARSE. When i try to include all of the alphabets its not working. Can you please suggest.
s_lassen
Meteorite | Level 14

I assumed that you could just look for non-whitespace, but of course your SQL expression may look like e.g. "a.col1=dsn.col4&a.col3...", without whitespace. So you may want to use "\w" (word character) instead of "\S" (non whitespace). You could also check for the first character being non-numeric, but is that really necessary, if your input is valid SQL code?

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 3 replies
  • 523 views
  • 0 likes
  • 2 in conversation