BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
jerry898969
Pyrite | Level 9

Hello,

 

I have a word document that is given to me that is a mock up of a project I'm going to work on.  What I want to do is take the list of tables at the start of the document and then scan the document and pull out and instance where the table.variable combination is listed so I can then create a process to check that these table.variable combinations exist before we start the project. 

 

The reason is that the department we work with gives us mocks that don't match the meta data of their tables and we want to resolve it before we start.

 

Attached is a test version of how the document will look.

 


data mock ;
	length text $400 ;
	infile '[path]\temp.txt' length = l ;
	input @1 text $varying400. l ;	
run ;

data tables (drop=flag) ;
	set mock ;
	retain flag ;

	if text = 'FILES:' then flag=1 ;
	if text = ' ' then flag=0 ;
	if text ne 'FILES:' and  flag=1 then output ;
run ;

data tables_ (keep=tabname) ;
	set tables ;
	tabname = scan(strip(text),1) ;
run ;

This code will give me a table with a list of tables to search.  I need to have this list in the final table so I can loop through the dictionary.columns table making sure these variables exist in those tables.

 

tab1.id  
tab1.type
tab1.region
tab2.ph1
tab2.Fore1
tab2.Ext1
tab3.ph2
tab3.Fore2
tab3.Ext2
tab4.ph3
tab4.Fore3
tab4.Ext3
tab5.name
tab5.co
tab5.addr
tab5.city
tab5.email

Thank you

 

1 ACCEPTED SOLUTION

Accepted Solutions
ballardw
Super User

I am not quite sure I follow.

 

So is your question how to read that attached files to get the information that is follows the files in such a way that you can have the tab1.id and similar values? And then look those up in dictionary.columns?

To search dictionary.columns we would need a library name.

 

And does your actual source file have the line numbers preceeding?

 

This reads the data, gets the likely tablename.variable strings and parses out the memname(table) and name (variable).

data possiblevars ;
	length text $400 ;
	infile datalines length = l ;
	input @1 text $varying400. l ;
   length str $ 42.;

   do i=1 to (countw(text,' :,'));
      str = scan(text,i,' :,');
      if index(str,'.')>0 then do;
         memname=upcase(scan(str,1,'.'));
         name = upcase(scan(str,2,'.'));
         output;
      end;
   end;
   drop i text;

datalines;
PROJECT TEST 1
FILES:
tab1
tab2
tab3
tab4
tab5

ID:  tab1.id  Type:  tab1.type   REGION:  tab1.region

Cell:  tab2.ph1  tab2.Fore1 tab2.Ext1
Home:  tab3.ph2  tab3.Fore2 tab3.Ext2
Other: tab4.ph3  tab4.Fore3 tab4.Ext3
tab5.name
tab5.co
tab5.addr
tab5.city
tab5.email 
;
run;




 

 

View solution in original post

2 REPLIES 2
ballardw
Super User

I am not quite sure I follow.

 

So is your question how to read that attached files to get the information that is follows the files in such a way that you can have the tab1.id and similar values? And then look those up in dictionary.columns?

To search dictionary.columns we would need a library name.

 

And does your actual source file have the line numbers preceeding?

 

This reads the data, gets the likely tablename.variable strings and parses out the memname(table) and name (variable).

data possiblevars ;
	length text $400 ;
	infile datalines length = l ;
	input @1 text $varying400. l ;
   length str $ 42.;

   do i=1 to (countw(text,' :,'));
      str = scan(text,i,' :,');
      if index(str,'.')>0 then do;
         memname=upcase(scan(str,1,'.'));
         name = upcase(scan(str,2,'.'));
         output;
      end;
   end;
   drop i text;

datalines;
PROJECT TEST 1
FILES:
tab1
tab2
tab3
tab4
tab5

ID:  tab1.id  Type:  tab1.type   REGION:  tab1.region

Cell:  tab2.ph1  tab2.Fore1 tab2.Ext1
Home:  tab3.ph2  tab3.Fore2 tab3.Ext2
Other: tab4.ph3  tab4.Fore3 tab4.Ext3
tab5.name
tab5.co
tab5.addr
tab5.city
tab5.email 
;
run;




 

 

jerry898969
Pyrite | Level 9

Hi Ballard,

 

Thank you so much. Your data steps returns the correct 17 variables.   Sorry for any confusion.

 

I justed wanted to read in that file and get the list of tables which are listed under "FILES" and then using those table names search the reset of the document for each table.variable combination. 

 

So tab1 would be first and I want it to return the n variables that are associated with tab1.

 

In this example it is these 3 like your code returns.

tab1.id  
tab1.type
tab1.region

 

Thanks again this is a big help.

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 2 replies
  • 861 views
  • 1 like
  • 2 in conversation