BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
jerry898969
Pyrite | Level 9

Hello,

 

I'm trying to take this sql script string and break it up into column rows to start.  We are scripting out the sql from oracle but we want to create scripts that are formatted a specific way.

 

CREATE TABLE "TEMP" ("TEMPID" NUMBER(7,0), "TESTID" NUMBER(6,0), "TYPE" VARCHAR2(5), "STARTDT" DATE, "SOURCE" VARCHAR2(5), "TITLE" VARCHAR2(6), CONSTRAINT "TEMP_PK" PRIMARY KEY ("TEMPID") ENABLE,  CONSTRAINT "TEMP_FK" FOREIGN KEY ("KEYID") REFERENCES "KEY" ("ID") ENABLE) ;

 

Output:

TABNm COLNm COLType

TEMP TEMPID NUMBER(7,0)

TEMP TESTID NUMBER(6,0)

TEMP TYPE VARCHAR2(5)

TEMP STARTDT DATE

TEMP SOURCE VARCHAR2(5)

TEMP TITLE VARCHAR2(6)

 

This is what I have so far.  I try to isolate just the variables and then take a count of double quotes and divide it by 2 to get the number of variables, but I'm not sure how to move each variable into a separate row.

 

data x ;
	set raw ;
	if f1=1 then do ;
		tabname=scan(line,2,'"') ;
		ParenPos=find(line,'(') ;		
		x=substr(line,ParenPos) ;
		EndPos=find(x,'CONSTRAINT') ;
		if EndPos = 0 then EndPos=find(x,';') ;
		y=substr(x,1,EndPos-1) ;
		VarCnt=count(y,'"')/2 ;
	end ;	
run ;

 

1 ACCEPTED SOLUTION

Accepted Solutions
ErikLund_Jensen
Rhodochrosite | Level 12

Hello Jerry

The following code splits the create statement and extracts Column names and types. The Table name is extracted as the third word using SCAN, and the Column definitions are extracted using a regular expression, that looks for  a word in quotes followed by one space followed by an unquoted word optionally followed by a block containing only digits, parantheses or comma. The expression also finds the keys, and they are omitted because the  words PRIMARY and FOREIGN are not valid Column types.

 


data have;
	line = 'CREATE TABLE "TEMP" ("TEMPID" NUMBER(7,0), "TESTID" NUMBER(6,0), "TYPE" VARCHAR2(5), "STARTDT" DATE, "SOURCE" VARCHAR2(5), "TITLE" VARCHAR2(6), CONSTRAINT "TEMP_PK" PRIMARY KEY ("TEMPID") ENABLE,  CONSTRAINT "TEMP_FK" FOREIGN KEY ("KEYID") REFERENCES "KEY" ("ID") ENABLE) ;';
run;

data want (keep=TABNm COLNm COLType); set have;
	length w $200 TABNm COLNm COLType $60;
	TABNm = dequote(scan(line,3,' '));

	xpr = prxparse('/"\w+"\s{1}\w+[\d,\(\)]*/');
	start = 1;
	stop = length(trim(line));
	call prxnext(xpr, start, stop, line, pos, len);
	do while (pos > 0);
		w = substr(line, pos, len-1);
		COLNm = dequote(scan(w,1,' '));
		COLType = dequote(scan(w,2,' '));
		if substr(COLType,1,4) not in ('PRIM','FORE') then output;
		call prxnext(xpr, start, stop, line, pos, len);
	end;
run;

The result is:

TEMP       TEMPID       NUMBER(7,0)
TEMP       TESTID       NUMBER(6,0)
TEMP       TYPE         VARCHAR2(5)
TEMP       STARTDT      DATE
TEMP       SOURCE       VARCHAR2(5)
TEMP       TITLE        VARCHAR2(6)

View solution in original post

5 REPLIES 5
RW9
Diamond | Level 26 RW9
Diamond | Level 26

Not sure why you would be wanting to parse a string?  Why not just extract the metadata from the database?  That would be more robust.

E.g.

select * from dictionary.columns where ...;
jerry898969
Pyrite | Level 9

I would love to do that, but I have to follow what my supervisor is requesting.  I'm not sure his thinking on this.  He may think that since it is an oracle DB that we should use Oracle export process to include everything.  He may not think SAS can return everything we may need.

 

Thank you

RW9
Diamond | Level 26 RW9
Diamond | Level 26

It doesn't matter if you use the database or SAS, any structured language has metadata about its tables.  In the DB these are held under dictionary tables, in SAS they are under sashelp.vtable and vcolumn (although you can refer to dictionary also in SQL through SAS).  So in either circumstance you should be able to pull metadata without resorting to code parsing.  If you have to use the text, and that is a real pain, then a simpler suggestion is to run it in SAS or DB then extract metadata, that takes all the parsing away.

ErikLund_Jensen
Rhodochrosite | Level 12

Hello Jerry

The following code splits the create statement and extracts Column names and types. The Table name is extracted as the third word using SCAN, and the Column definitions are extracted using a regular expression, that looks for  a word in quotes followed by one space followed by an unquoted word optionally followed by a block containing only digits, parantheses or comma. The expression also finds the keys, and they are omitted because the  words PRIMARY and FOREIGN are not valid Column types.

 


data have;
	line = 'CREATE TABLE "TEMP" ("TEMPID" NUMBER(7,0), "TESTID" NUMBER(6,0), "TYPE" VARCHAR2(5), "STARTDT" DATE, "SOURCE" VARCHAR2(5), "TITLE" VARCHAR2(6), CONSTRAINT "TEMP_PK" PRIMARY KEY ("TEMPID") ENABLE,  CONSTRAINT "TEMP_FK" FOREIGN KEY ("KEYID") REFERENCES "KEY" ("ID") ENABLE) ;';
run;

data want (keep=TABNm COLNm COLType); set have;
	length w $200 TABNm COLNm COLType $60;
	TABNm = dequote(scan(line,3,' '));

	xpr = prxparse('/"\w+"\s{1}\w+[\d,\(\)]*/');
	start = 1;
	stop = length(trim(line));
	call prxnext(xpr, start, stop, line, pos, len);
	do while (pos > 0);
		w = substr(line, pos, len-1);
		COLNm = dequote(scan(w,1,' '));
		COLType = dequote(scan(w,2,' '));
		if substr(COLType,1,4) not in ('PRIM','FORE') then output;
		call prxnext(xpr, start, stop, line, pos, len);
	end;
run;

The result is:

TEMP       TEMPID       NUMBER(7,0)
TEMP       TESTID       NUMBER(6,0)
TEMP       TYPE         VARCHAR2(5)
TEMP       STARTDT      DATE
TEMP       SOURCE       VARCHAR2(5)
TEMP       TITLE        VARCHAR2(6)
jerry898969
Pyrite | Level 9
Thank you everyone. Erik this helped a lot.

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 5 replies
  • 964 views
  • 0 likes
  • 3 in conversation