Hello, I am just learning SAS and so far have not found a way to dynamically create variables based on a count of a character in a string. I am parsing out a string from Active Directory (for example: CN=N123456,OU=Users,OU=Admins,DC=D01,DC=com). I have successfully created collumns for each item following a comma, but it is not dynamic. I have been running a proc sql query to find the Active Directory string with the most commas, and then creating an array based on that number.
In the file I am currently working with, the largest number of commas is 7, so I need 8 columns. I have been creating varialbes like this:
array parsed_vars(8)$50 VAR1-VAR8;
I want to store the number of collumns I need based on my comma count and pass that into the array, something like this:
array parsed_vars(nmbrvar)$50 VAR1-VAR||nmbrvar;
The log shows this:
ERROR: Missing numeric suffix on a numbered variable list (OU1-OU).
Is there a way to do this? I am working in SAS Enterprise Guide 5.1.
Thanks.
Thanks for the help! I have it working now with the code below.
proc sql;
select max(count(distinguishedname,","))+1 into : commacount from have;
quit;
%let commacount = &commacount;
data something;
set have;
array parsed_vars{&commacount}$50 VAR1-VAR&commacount;
run;
In your proc sql, use the into: clause to store your result into a macro variable.
Like
proc sql noprint;
select max(count) into : maxcount from ....;
quit;
You can then use maxcount in further coding:
data something;
set have;
array vars {&maxcount} var1-var&maxcount;
.....
run;
Thanks for the response! I was missing the into: clause. It seems to be getting a little further, but I am still getting the error about missing the numeric suffix.
Code:
proc sql;
select max(count(distinguishedname,","))+1 into : commacount from have;
quit;
data something;
set have;
array parsed_vars{&commacount}$50 VAR1-VAR&commacount;
run;
28 data something;
29 set have;
30 array parsed_vars{&commacount}$50 VAR1-VAR&commacount;
NOTE: Line generated by the macro variable "COMMACOUNT".
30 VAR 9
_
22
200
ERROR: Missing numeric suffix on a numbered variable list (VAR1-VAR).
ERROR: Too few variables defined for the dimension(s) specified for the array parsed_vars.
ERROR 22-322: Syntax error, expecting one of the following: a name, (, ;, _ALL_, _CHARACTER_, _CHAR_, _NUMERIC_.
ERROR 200-322: The symbol is not recognized and will be ignored.
You're most of the way there.
The problem is that SQL must perform a numeric to character conversion, to feed the numeric value into &COMMACOUNT. This conversion generates leading blanks in the value of &COMMACOUNT. One simple way to remove those leading blanks would be:
%lead commacount = &commacount;
That would go between the SQL and the DATA step, and should take care of the problem.
Some autocompletion seems to have played a trick on you.
%lead commacount = &commacount;
should be
%let commacount = &commacount;
That's just my brain trying to function before the first cup of coffee. Absolutely, use %LET not %LEAD.
Thanks for the help! I have it working now with the code below.
proc sql;
select max(count(distinguishedname,","))+1 into : commacount from have;
quit;
%let commacount = &commacount;
data something;
set have;
array parsed_vars{&commacount}$50 VAR1-VAR&commacount;
run;
Personally I would normalise your data, i.e. put each data item into a separate observation. If necessary you could then transpose them up or work with it as is.
data temp; str="CN=N123456,OU=Users,OU=Admins,DC=D01,DC=com"; do i=1 to countw(str,","); word=scan(str,i,","); output; end; run; proc transpose data=temp out=t_temp prefix=word; var word; id i; idlabel i; run;
HAVE
Up to 40 obs WORK.HAVE total obs=10
Obs CMADLM
1 aaaa,aaaa,aaaa,aaaa,1stwrd
2 aaaa,aaaa,aaaa,aaaa,aaaa,aaaa,aaaa,aaaa,aaaa
3 aaaa,aaaa,aaaa
4 aaaa,aaaa,aaaa,aaaa,aaaa,aaaa,aaaa,aaaa,aaaa
5 aaaa,aaaa
6 aaaa,aaaa,aaaa,aaaa
7 aaaa,aaaa
8 aaaa,aaaa,aaaa,aaaa,aaaa,aaaa
9 aaaa,aaaa,aaaa,aaaa,aaaa,aaaa,aaaa
10 aaaa,aaaa,aaaa,aaaa,aaaa,aaaa,aaaa,aaaa,aaaa
WANT
Up to 40 obs WORK.TEMP total obs=10
Obs V1 V2 V3 V4 V5 V6 V7 V8 V9
1 aaaa aaaa aaaa aaaa 1stwrd
2 aaaa aaaa aaaa aaaa aaaa aaaa aaaa aaaa aaaa
3 aaaa aaaa aaaa
4 aaaa aaaa aaaa aaaa aaaa aaaa aaaa aaaa aaaa
5 aaaa aaaa
6 aaaa aaaa aaaa aaaa
7 aaaa aaaa
8 aaaa aaaa aaaa aaaa aaaa aaaa
9 aaaa aaaa aaaa aaaa aaaa aaaa aaaa
10 aaaa aaaa aaaa aaaa aaaa aaaa aaaa aaaa aaaa
* The maximum size the array can be is 656 because 656*50=32800;
* 50 was mentioned in the instructions - but could set higher 4096;
* without performance penaly?;
data interim;
retain cmacnt 0;
set have(in=have);
array vars[*] $50 v1-v656;
do cma=1 to (countc(cmadlm,',')+1);
vars[cma]=scan(cmadlm,cma,',');
end;
if cmacnt < cma-1 then cmacnt=cma-1;
call symputx('cmacnt',put(cmacnt,4.));
run;quit;
data temp;
set interim(keep=v1-v&cmacnt);
run;quit;
Code to create have
data have;
length cmadlm $500;
retain cmadlm '1stwrd';
do str=1 to 10;
too=10*uniform(57371);
do cma=1 to too;
cmadlm=catx(',','aaaa',cmadlm);
end;
output;
cmadlm=' ';
end;
keep cmadlm;
run;quit;
Correction I don't know what I was thinking but 656 is not large enough, it assumes the string has at least 50 bytes between commas.
Oh well, I thinl 16384 is the max.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.