DATA Step, Macro, Functions and more

Break up string using count function and how to loop through it

Accepted Solution Solved
Reply
Super Contributor
Posts: 405
Accepted Solution

Break up string using count function and how to loop through it

Hello,

 

I'm trying to take this sql script string and break it up into column rows to start.  We are scripting out the sql from oracle but we want to create scripts that are formatted a specific way.

 

CREATE TABLE "TEMP" ("TEMPID" NUMBER(7,0), "TESTID" NUMBER(6,0), "TYPE" VARCHAR2(5), "STARTDT" DATE, "SOURCE" VARCHAR2(5), "TITLE" VARCHAR2(6), CONSTRAINT "TEMP_PK" PRIMARY KEY ("TEMPID") ENABLE,  CONSTRAINT "TEMP_FK" FOREIGN KEY ("KEYID") REFERENCES "KEY" ("ID") ENABLE) ;

 

Output:

TABNm COLNm COLType

TEMP TEMPID NUMBER(7,0)

TEMP TESTID NUMBER(6,0)

TEMP TYPE VARCHAR2(5)

TEMP STARTDT DATE

TEMP SOURCE VARCHAR2(5)

TEMP TITLE VARCHAR2(6)

 

This is what I have so far.  I try to isolate just the variables and then take a count of double quotes and divide it by 2 to get the number of variables, but I'm not sure how to move each variable into a separate row.

 

data x ;
	set raw ;
	if f1=1 then do ;
		tabname=scan(line,2,'"') ;
		ParenPos=find(line,'(') ;		
		x=substr(line,ParenPos) ;
		EndPos=find(x,'CONSTRAINT') ;
		if EndPos = 0 then EndPos=find(x,';') ;
		y=substr(x,1,EndPos-1) ;
		VarCnt=count(y,'"')/2 ;
	end ;	
run ;

 


Accepted Solutions
Solution
‎04-18-2018 10:07 AM
Contributor
Posts: 47

Re: Break up string using count function and how to loop through it

Posted in reply to jerry898969

Hello Jerry

The following code splits the create statement and extracts Column names and types. The Table name is extracted as the third word using SCAN, and the Column definitions are extracted using a regular expression, that looks for  a word in quotes followed by one space followed by an unquoted word optionally followed by a block containing only digits, parantheses or comma. The expression also finds the keys, and they are omitted because the  words PRIMARY and FOREIGN are not valid Column types.

 


data have;
	line = 'CREATE TABLE "TEMP" ("TEMPID" NUMBER(7,0), "TESTID" NUMBER(6,0), "TYPE" VARCHAR2(5), "STARTDT" DATE, "SOURCE" VARCHAR2(5), "TITLE" VARCHAR2(6), CONSTRAINT "TEMP_PK" PRIMARY KEY ("TEMPID") ENABLE,  CONSTRAINT "TEMP_FK" FOREIGN KEY ("KEYID") REFERENCES "KEY" ("ID") ENABLE) ;';
run;

data want (keep=TABNm COLNm COLType); set have;
	length w $200 TABNm COLNm COLType $60;
	TABNm = dequote(scan(line,3,' '));

	xpr = prxparse('/"\w+"\s{1}\w+[\d,\(\)]*/');
	start = 1;
	stop = length(trim(line));
	call prxnext(xpr, start, stop, line, pos, len);
	do while (pos > 0);
		w = substr(line, pos, len-1);
		COLNm = dequote(scan(w,1,' '));
		COLType = dequote(scan(w,2,' '));
		if substr(COLType,1,4) not in ('PRIM','FORE') then output;
		call prxnext(xpr, start, stop, line, pos, len);
	end;
run;

The result is:

TEMP       TEMPID       NUMBER(7,0)
TEMP       TESTID       NUMBER(6,0)
TEMP       TYPE         VARCHAR2(5)
TEMP       STARTDT      DATE
TEMP       SOURCE       VARCHAR2(5)
TEMP       TITLE        VARCHAR2(6)

View solution in original post


All Replies
Super User
Super User
Posts: 9,441

Re: Break up string using count function and how to loop through it

[ Edited ]
Posted in reply to jerry898969

Not sure why you would be wanting to parse a string?  Why not just extract the metadata from the database?  That would be more robust.

E.g.

select * from dictionary.columns where ...;
Super Contributor
Posts: 405

Re: Break up string using count function and how to loop through it

I would love to do that, but I have to follow what my supervisor is requesting.  I'm not sure his thinking on this.  He may think that since it is an oracle DB that we should use Oracle export process to include everything.  He may not think SAS can return everything we may need.

 

Thank you

Super User
Super User
Posts: 9,441

Re: Break up string using count function and how to loop through it

Posted in reply to jerry898969

It doesn't matter if you use the database or SAS, any structured language has metadata about its tables.  In the DB these are held under dictionary tables, in SAS they are under sashelp.vtable and vcolumn (although you can refer to dictionary also in SQL through SAS).  So in either circumstance you should be able to pull metadata without resorting to code parsing.  If you have to use the text, and that is a real pain, then a simpler suggestion is to run it in SAS or DB then extract metadata, that takes all the parsing away.

Solution
‎04-18-2018 10:07 AM
Contributor
Posts: 47

Re: Break up string using count function and how to loop through it

Posted in reply to jerry898969

Hello Jerry

The following code splits the create statement and extracts Column names and types. The Table name is extracted as the third word using SCAN, and the Column definitions are extracted using a regular expression, that looks for  a word in quotes followed by one space followed by an unquoted word optionally followed by a block containing only digits, parantheses or comma. The expression also finds the keys, and they are omitted because the  words PRIMARY and FOREIGN are not valid Column types.

 


data have;
	line = 'CREATE TABLE "TEMP" ("TEMPID" NUMBER(7,0), "TESTID" NUMBER(6,0), "TYPE" VARCHAR2(5), "STARTDT" DATE, "SOURCE" VARCHAR2(5), "TITLE" VARCHAR2(6), CONSTRAINT "TEMP_PK" PRIMARY KEY ("TEMPID") ENABLE,  CONSTRAINT "TEMP_FK" FOREIGN KEY ("KEYID") REFERENCES "KEY" ("ID") ENABLE) ;';
run;

data want (keep=TABNm COLNm COLType); set have;
	length w $200 TABNm COLNm COLType $60;
	TABNm = dequote(scan(line,3,' '));

	xpr = prxparse('/"\w+"\s{1}\w+[\d,\(\)]*/');
	start = 1;
	stop = length(trim(line));
	call prxnext(xpr, start, stop, line, pos, len);
	do while (pos > 0);
		w = substr(line, pos, len-1);
		COLNm = dequote(scan(w,1,' '));
		COLType = dequote(scan(w,2,' '));
		if substr(COLType,1,4) not in ('PRIM','FORE') then output;
		call prxnext(xpr, start, stop, line, pos, len);
	end;
run;

The result is:

TEMP       TEMPID       NUMBER(7,0)
TEMP       TESTID       NUMBER(6,0)
TEMP       TYPE         VARCHAR2(5)
TEMP       STARTDT      DATE
TEMP       SOURCE       VARCHAR2(5)
TEMP       TITLE        VARCHAR2(6)
Super Contributor
Posts: 405

Re: Break up string using count function and how to loop through it

Posted in reply to ErikLund_Jensen
Thank you everyone. Erik this helped a lot.
☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 5 replies
  • 89 views
  • 0 likes
  • 3 in conversation