Good morning.
I'm using the attached code to document all tables in my code (thanks to others as I'd have never figured this out on my own). However, I'd like the result to be a bit cleaner.
For instance, you can see that a number of entries contains the leading words SEQ and MULTI as well as the trailing words.DATA and .VIEW. I'd like to remove those before the Excel file creation. I've never used this type of code so I'd appreciate any help someone could give me.
proc scaproc;
write;
run;
filename sca_rec "&documentation.\&documentation_name..txt";
filename sca_rec '<<SCAPROC TEXT FILE';
data myinfo;
infile sca_rec truncover;
input @1 scaline $256.;
length type myinfo $100;
keep type myinfo;
retain prxifile prxidata;
if _n_ = 1 then do;
prxifile = prxparse("!\bJOBSPLIT: FILE INPUT SEQ (\b.*\b) \*/!");
prxidata = prxparse("!\bJOBSPLIT: DATASET INPUT (\b.*\b) \*/!");
end;
if prxmatch (prxidata,scaline) > 0 then do;
myinfo = prxposn(prxidata,1,scaline);
type = 'Input File';
output myinfo;
end;
run;
proc export data = myinfo
outfile = '<<outfile path>>'
/* putnames = no;*/
dbms = xls replace;
run;
One approach woud be to re-write the regular expression to capture just the text you want, but that can be complex if you are not familiar with REGEX writing. Another approach would be to process the result using SAS functions. So, in your program, the result of the expression:
myinfo = prxposn(prxidata,1,scaline);
Yields:
MULTI VIEWSDT.FILE1.VIEW
To get rid of all the text before that first space, you could add SCAN:
myinfo = scan(prxposn(prxidata,1,scaline),-1,' ');
Which now yields:
VIEWSDT.FILE1.VIEW
If you want to also remove that last "word" after the period, you re-process the result using SUBSTR and FIND like this:
myinfo = SUBSTR(myinfo,1,FIND(myinfo,'.',-9999)-1);
Which now yields:
VIEWSDT.FILE1
Just be aware that, when the value extracted is something like "SEQ LIST.DATA", the end result will be just "LIST". Is that what you were envisioning?
One approach woud be to re-write the regular expression to capture just the text you want, but that can be complex if you are not familiar with REGEX writing. Another approach would be to process the result using SAS functions. So, in your program, the result of the expression:
myinfo = prxposn(prxidata,1,scaline);
Yields:
MULTI VIEWSDT.FILE1.VIEW
To get rid of all the text before that first space, you could add SCAN:
myinfo = scan(prxposn(prxidata,1,scaline),-1,' ');
Which now yields:
VIEWSDT.FILE1.VIEW
If you want to also remove that last "word" after the period, you re-process the result using SUBSTR and FIND like this:
myinfo = SUBSTR(myinfo,1,FIND(myinfo,'.',-9999)-1);
Which now yields:
VIEWSDT.FILE1
Just be aware that, when the value extracted is something like "SEQ LIST.DATA", the end result will be just "LIST". Is that what you were envisioning?
I think so. At least it gets me way close. Thanks very much for your help.
Well there's a newbie error. I accepted my own post as a solution, sigh. Sorry, that should have been awarded to you.
LOL! No problem - that's fixable, for sure. Glad you're back on track 😊
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.