hamza_saspg
Fluorite | Level 6

Hi,

 

I need to extract the variable names from the macro variable &var (my attempt with prxposn is not working well) and then test whether those variables are empty. If they are, I need to drop the separator (a slash / in this case) so that the output column is empty instead of containing the separator alone. The concatenation result should be stored in the variable code.

The approach below is based on PRX expressions because I have no control over the variable names that will appear in the macro variable &var used as input (a variable name may contain digits, and the names I need are the ones before and after the <> block).

 

%let var = [aeterm9] aedecod </@2> aeterm ; /*here aedecod and aeterm are two variables included in the dataset adae, but we could replace them by other ones*/
data want;
set adae ;

patternID1=prxparse("/(\[(\w+)\])?\s*(\w*)/");
/* define a pattern for <Separator1DigitsSeparator2> Variable2 */
patternID2=prxparse("/<([^<>0-9]*)(\d*)([^<>0-9]*)>\s*(\w*)/");
/* define a pattern for matched braces ')' or '(' */
patternID3=prxparse("/<[^a-zA-Z\<\>]*(\(|\))[^a-zA-Z\<\>]*\>/");
do i=1 to 2;
test1= scan("[aeterm9] aedecod </@2> aeterm" , i , '<>' ) ;
test2=prxposn(patternID1,3,test1);
end;

varlist= test2 ;
if test2 ne '' and test2 ne '.' then do;

code=cats('strip(put(',test2,',',vformatx(test2),'))');
label=vlabelx(test2);


end;
else do;
code='';
label='';
end;

run;

 

I hope I have explained the problem clearly enough. Thanks in advance for your help.

1 ACCEPTED SOLUTION

Accepted Solutions
Patrick
Opal | Level 21

Please answer @SASJedi's very valid questions, because if your source string has a "stable" pattern, things become much easier and no RegEx is required.

If things aren't that simple, then which RegEx will work, and how complex it needs to be to identify a SAS variable name without also picking up false positives, depends very much on the patterns you have to cover.

With SAS data step syntax, the following would all be variables:

first.varname
function(varname)
array_name[varname]
etc.

The RegEx for a string that complies with SAS naming conventions for a variable could look like: 

[_[:alpha:]]\w{0,31}

- one to 32 characters

- First character is a letter or underscore

- 2nd to 32nd character is an underscore, letter, or digit
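
As a quick way to check these rules, here is a minimal sketch that anchors the same expression with ^ and $ and runs it against a few candidate strings. The candidate values and the variable names candidate and is_valid are made up for illustration only.

```sas
data _null_;
  length candidate $40;
  /* Same name pattern as above, anchored so the whole string must match */
  _prxid = prxparse('/^[_[:alpha:]]\w{0,31}$/');
  do candidate = 'aeterm9', '_x', '9lives', 'first.varname';
    /* STRIP removes the trailing blanks that pad the character variable */
    is_valid = (prxmatch(_prxid, strip(candidate)) > 0);
    put candidate= is_valid=;
  end;
run;
```

The first two candidates should be accepted. The third starts with a digit and the fourth contains a dot, so neither should match.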

 

Here is a RegEx that returns the substrings you described as desired.

data test;
  source_string="[aeterm9] aedecod </@2> aeterm";
  length found $32;
  _prxid=prxparse('/(^|[ ])([_[:alpha:]]\w{0,31})($|[ ])/i');
  _start = 1;
  _stop = length(source_string);
  call prxnext(_prxid, _start, _stop, trim(source_string), _pos, _len);
  do while (_pos > 0);
    found = prxposn(_prxid, 2, source_string);
    output;
    call prxnext(_prxid, _start, _stop, trim(source_string), _pos, _len);
  end;
run;

proc print data=test;
run;

[Screenshot: PROC PRINT output of the test dataset]

 

 


11 REPLIES 11
yabwon
Onyx | Level 15

Maybe try the BasePlus package and its %getVars() macro.

 

Example 3:

  %put *%getVars(sashelp.class, pattern=i|a)*;

  %put *%getVars(sashelp.class, pattern=^w)*;

  %put *%getVars(sashelp.class, pattern=ght$)*;

or Example 4: 

  %put *%getVars(sashelp.class, sep=+, pattern=^(w|h)|x$, varRange=_numeric_)*;

 

Bart

_______________
Polish SAS Users Group: www.polsug.com and communities.sas.com/polsug

"SAS Packages: the way to share" at SGF2020 Proceedings (the latest version), GitHub Repository, and YouTube Video.
Hands-on-Workshop: "Share your code with SAS Packages"
"My First SAS Package: A How-To" at SGF2021 Proceedings

SAS Ballot Ideas: one: SPF in SAS, two, and three
SAS Documentation



yabwon
Onyx | Level 15

The golden rule says: "if you can solve it with or without regular expressions, solve it without":

%let var = [aeterm9] aedecod </@2> aeterm ;

data _null_;
str=symget('var');
length v $ 32;
do i = 1 to countw(str, " ");
  v = compress(scan(str,i, " "),"_","KAD");
  call symputX(cats("V_",i),v, "G");
end;
run;

%put &=V_1. &=V_2. &=V_3. &=V_4.;

Bart




hamza_saspg
Fluorite | Level 6

Hi @yabwon, I appreciate the way you simplify things! It works well outside a macro structure, but I would prefer an approach based on PRX functions to stay consistent with my existing structure. Thanks! 🙂

yabwon
Onyx | Level 15

Making it a macro is almost trivial:

%let var = [aeterm9] aedecod </@2> aeterm ;

%macro cutIntoParts(varName);
%local i;
%do i = 1 %to %sysfunc(countw(%superq(&varName.), %str( )));
  %global V_&i.;
  %let V_&i. = %sysfunc(compress(%scan(%superq(&varName.), &i., %str( )),_,KAD));
%end;
%mend cutIntoParts;


%cutIntoParts(var)

%put &=V_1. &=V_2. &=V_3. &=V_4.;

 

Bart




SASJedi
SAS Super FREQ
  1. Is the text in the macro variable consistently laid out like this, and you just want to extract varname1 and varname2? Or does the layout vary?
    [some text] varname1 <@n> varname2
  2. You provided a sample input value of "[aeterm9] aedecod </@2> aeterm". What would the values of test1 and test2 be if your code was working the way you wanted it to work?
  3. Please provide a simple sample of the dataset adae. Your use of the functions VFORMATX and VLABELX doesn't make sense without this for context.

 

 

  

Check out my Jedi SAS Tricks for SAS Users
hamza_saspg
Fluorite | Level 6

1- Yes, the goal is to extract the variable names from [some text] varname1 <@n> varname2, under the assumption that there could be multiple variable names with some diversity in the characters composing those names.

2- test1 and test2 are expected to contain the values plus the separator /, for example aeterm1/aedecod1 or /aeterm2/aedecod2/. The goal is to exclude the separator when the variables are empty, so that a lone separator does not appear in the report.

3- The dataset could be adae or another one, and we need to concatenate two or more variables with the separator (any symbol that will be inserted between variable values: </> <#> <$> <-> <> <(> <)> <!> <|> )

 

              


hamza_saspg
Fluorite | Level 6

@Patrick it seems your approach matches exactly what I need. I will be able to build on it and exclude the separator from the concatenation when one or more variables are empty. Thanks!

ballardw
Super User

@hamza_saspg wrote:

Hi,

 

I need to extract the variable names from the given macro variable &var (which is not working well using prxposn), then test if they are empty.


What does "test if they are empty" mean? I don't see any code actually testing values of variables.

 

SAS has MISSING, and can test for that.
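
A minimal sketch of that test, using made-up variables c, n, s and x: the MISSING function accepts both character and numeric arguments and returns 1 when the value is missing, including special missing values such as .A.

```sas
data _null_;
  length c $8;
  c = ' ';   /* blank character value */
  n = .;     /* standard numeric missing */
  s = .A;    /* special numeric missing */
  x = 0;     /* zero is a real value, not missing */
  c_missing = missing(c);
  n_missing = missing(n);
  s_missing = missing(s);
  x_missing = missing(x);
  put c_missing= n_missing= s_missing= x_missing=;
run;
```

The log should show 1 for the first three flags and 0 for x_missing.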

I would say that, instead of bothering with such a macro, you could use tools SAS already provides, such as PROC FREQ with the NLEVELS option.

 

ods select nlevels;
ods output nlevels=myleveldataset; /* if you want a data set*/
proc freq data=yourdataset nlevels;
run;

For example, running that on SASHELP.CLASS produces the following content in the levels dataset. If NNonMissLevels is not equal to zero, then that variable has at least one non-missing value.

An NMissLevels of zero means that the variable has no missing values at all.

TableVar    NLevels    NMissLevels    NNonMissLevels

Name             19              0                19
Sex               2              0                 2
Age               6              0                 6
Height           17              0                17
Weight           15              0                15

NMissLevels also counts the special missing values, so for a numeric variable it can be as high as 28: ., .A through .Z, and ._
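
As a rough sketch of how the levels dataset could be used to list variables that are entirely missing: the dataset names have, levels and empty_vars and the variable allmiss are made up for illustration, and the column names TableVar and NNonMissLevels are assumed to be those of the ODS NLevels output table.

```sas
/* Hypothetical input: SASHELP.CLASS plus one variable that is always missing */
data have;
  set sashelp.class;
  allmiss = .;
run;

ods output nlevels=work.levels;   /* capture the NLEVELS table as a dataset */
proc freq data=have nlevels;
  tables _all_ / noprint;         /* suppress the one-way frequency tables */
run;

/* Variables with zero non-missing levels contain only missing values */
data empty_vars;
  set work.levels;
  where NNonMissLevels = 0;
  keep TableVar;
run;
```

In this made-up example, only allmiss should end up in empty_vars.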

 

hamza_saspg
Fluorite | Level 6

@ballardw I see your point, but what is expected is to concatenate two variables that may be empty and to exclude the separator when one or more of them are empty, so I need an approach that fits an existing structure based on PRX functions and macros.

ballardw
Super User

@hamza_saspg wrote:

@ballardw I see your point, but what is expected is to concatenate two variables that may be empty and to exclude the separator when one or more of them are empty, so I need an approach that fits an existing structure based on PRX functions and macros.


Perhaps provide some examples of what you have and what you expect as output.

If I understand the requirement, the CATX function inserts the separator the way you describe: it skips missing arguments, so no stray separator appears.

data example;
  var1 = '';
  var2 = 'sometext';
  var3 = '';
  var4 = 'something else';
  out1 = catx('|',var1,var2);
  out2 = catx('|',var1,var3,var2);
  out3 = catx('|',var1,var3,var2,var4);
  out4 = catx('|',var1,var1,var3);
  out5 = catx('|',var2,var1,var3,var4);
run;
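
Combining CATX with the name extraction from the accepted solution, a minimal sketch might look like the following. It is only an illustration under stated assumptions: the small adae dataset is a made-up stand-in, the specification string is hard-coded, the separator is fixed to a slash, and numeric missing values are assumed to display as a period.

```sas
/* Hypothetical stand-in for adae, for illustration only */
data adae;
  length aeterm aedecod $40;
  aeterm = 'Headache'; aedecod = 'HEADACHE'; output;
  aeterm = 'Nausea';   aedecod = ' ';        output;
  aeterm = ' ';        aedecod = ' ';        output;
run;

data want;
  set adae;
  length code $200 _name $32 _value $200;
  _spec  = "[aeterm9] aedecod </@2> aeterm";
  /* Pattern from the accepted solution; a constant pattern is compiled once */
  _prxid = prxparse('/(^|[ ])([_[:alpha:]]\w{0,31})($|[ ])/');
  _start = 1;
  _stop  = length(_spec);
  call prxnext(_prxid, _start, _stop, _spec, _pos, _len);
  do while (_pos > 0);
    _name  = prxposn(_prxid, 2, _spec);   /* extracted variable name */
    _value = vvaluex(strip(_name));       /* formatted value of that variable */
    /* CATX skips blank arguments; a numeric missing shows as '.' and is skipped here */
    if strip(_value) ne '.' then code = catx('/', code, _value);
    call prxnext(_prxid, _start, _stop, _spec, _pos, _len);
  end;
  drop _:;                                /* drop the helper variables */
run;
```

The intent is that rows where one or both variables are blank get no lone slash in code. VVALUEX raises an error if the extracted name is not a variable in the dataset, so a real implementation would need to validate the names first.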

 
