Hello, I am just learning SAS and so far have not found a way to dynamically create variables based on a count of a character in a string. I am parsing out a string from Active Directory (for example: CN=N123456,OU=Users,OU=Admins,DC=D01,DC=com). I have successfully created collumns for each item following a comma, but it is not dynamic. I have been running a proc sql query to find the Active Directory string with the most commas, and then creating an array based on that number.
In the file I am currently working with, the largest number of commas is 7, so I need 8 columns. I have been creating varialbes like this:
array parsed_vars(8)$50 VAR1-VAR8;
I want to store the number of collumns I need based on my comma count and pass that into the array, something like this:
array parsed_vars(nmbrvar)$50 VAR1-VAR||nmbrvar;
The log shows this:
ERROR: Missing numeric suffix on a numbered variable list (OU1-OU).
Is there a way to do this? I am working in SAS Enterprise Guide 5.1.
Thanks.
Thanks for the help! I have it working now with the code below.
proc sql;
select max(count(distinguishedname,","))+1 into : commacount from have;
quit;
%let commacount = &commacount;
data something;
set have;
array parsed_vars{&commacount}$50 VAR1-VAR&commacount;
run;
In your proc sql, use the into: clause to store your result into a macro variable.
Like
proc sql noprint;
select max(count) into : maxcount from ....;
quit;
You can then use maxcount in further coding:
data something;
set have;
array vars {&maxcount} var1-var&maxcount;
.....
run;
Thanks for the response! I was missing the into: clause. It seems to be getting a little further, but I am still getting the error about missing the numeric suffix.
Code:
proc sql;
select max(count(distinguishedname,","))+1 into : commacount from have;
quit;
data something;
set have;
array parsed_vars{&commacount}$50 VAR1-VAR&commacount;
run;
28 data something;
29 set have;
30 array parsed_vars{&commacount}$50 VAR1-VAR&commacount;
NOTE: Line generated by the macro variable "COMMACOUNT".
30 VAR 9
_
22
200
ERROR: Missing numeric suffix on a numbered variable list (VAR1-VAR).
ERROR: Too few variables defined for the dimension(s) specified for the array parsed_vars.
ERROR 22-322: Syntax error, expecting one of the following: a name, (, ;, _ALL_, _CHARACTER_, _CHAR_, _NUMERIC_.
ERROR 200-322: The symbol is not recognized and will be ignored.
You're most of the way there.
The problem is that SQL must perform a numeric to character conversion, to feed the numeric value into &COMMACOUNT. This conversion generates leading blanks in the value of &COMMACOUNT. One simple way to remove those leading blanks would be:
%lead commacount = &commacount;
That would go between the SQL and the DATA step, and should take care of the problem.
Some autocompletion seems to have played a trick on you.
%lead commacount = &commacount;
should be
%let commacount = &commacount;
That's just my brain trying to function before the first cup of coffee. Absolutely, use %LET not %LEAD.
Thanks for the help! I have it working now with the code below.
proc sql;
select max(count(distinguishedname,","))+1 into : commacount from have;
quit;
%let commacount = &commacount;
data something;
set have;
array parsed_vars{&commacount}$50 VAR1-VAR&commacount;
run;
Personally I would normalise your data, i.e. put each data item into a separate observation. If necessary you could then transpose them up or work with it as is.
data temp; str="CN=N123456,OU=Users,OU=Admins,DC=D01,DC=com"; do i=1 to countw(str,","); word=scan(str,i,","); output; end; run; proc transpose data=temp out=t_temp prefix=word; var word; id i; idlabel i; run;
HAVE
Up to 40 obs WORK.HAVE total obs=10
Obs CMADLM
1 aaaa,aaaa,aaaa,aaaa,1stwrd
2 aaaa,aaaa,aaaa,aaaa,aaaa,aaaa,aaaa,aaaa,aaaa
3 aaaa,aaaa,aaaa
4 aaaa,aaaa,aaaa,aaaa,aaaa,aaaa,aaaa,aaaa,aaaa
5 aaaa,aaaa
6 aaaa,aaaa,aaaa,aaaa
7 aaaa,aaaa
8 aaaa,aaaa,aaaa,aaaa,aaaa,aaaa
9 aaaa,aaaa,aaaa,aaaa,aaaa,aaaa,aaaa
10 aaaa,aaaa,aaaa,aaaa,aaaa,aaaa,aaaa,aaaa,aaaa
WANT
Up to 40 obs WORK.TEMP total obs=10
Obs V1 V2 V3 V4 V5 V6 V7 V8 V9
1 aaaa aaaa aaaa aaaa 1stwrd
2 aaaa aaaa aaaa aaaa aaaa aaaa aaaa aaaa aaaa
3 aaaa aaaa aaaa
4 aaaa aaaa aaaa aaaa aaaa aaaa aaaa aaaa aaaa
5 aaaa aaaa
6 aaaa aaaa aaaa aaaa
7 aaaa aaaa
8 aaaa aaaa aaaa aaaa aaaa aaaa
9 aaaa aaaa aaaa aaaa aaaa aaaa aaaa
10 aaaa aaaa aaaa aaaa aaaa aaaa aaaa aaaa aaaa
* The maximum size the array can be is 656 because 656*50=32800;
* 50 was mentioned in the instructions - but could set higher 4096;
* without performance penaly?;
data interim;
retain cmacnt 0;
set have(in=have);
array vars[*] $50 v1-v656;
do cma=1 to (countc(cmadlm,',')+1);
vars[cma]=scan(cmadlm,cma,',');
end;
if cmacnt < cma-1 then cmacnt=cma-1;
call symputx('cmacnt',put(cmacnt,4.));
run;quit;
data temp;
set interim(keep=v1-v&cmacnt);
run;quit;
Code to create have
data have;
length cmadlm $500;
retain cmadlm '1stwrd';
do str=1 to 10;
too=10*uniform(57371);
do cma=1 to too;
cmadlm=catx(',','aaaa',cmadlm);
end;
output;
cmadlm=' ';
end;
keep cmadlm;
run;quit;
Correction I don't know what I was thinking but 656 is not large enough, it assumes the string has at least 50 bytes between commas.
Oh well, I thinl 16384 is the max.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.