data newdata (keep=word1-word7 name seqno);
retain seqno name;
array myvar (*)$ word1-word7;
set sashelp.class ;
by name notsorted;
do seqno=1 to 7;
do i= 1 to 7;
myvar(i)=substr(Name,i,1);
if last.name then output;
output;
end;
end;
proc print noobs; run;
how to get complete split letters names output like below
seqno | name | word1 | word2 | word3 | word4 | word5 | word6 | word7 |
1 | Alfred | A | l | f | r | e | d | |
2 | Alice | A | l | i | c | e | ||
3 | Barbara | B | a | r | b | a | r | a |
Why do you have two OUTPUT statements? Why do you use BY when there is only one observation per name anyway?
To create a sequential count variable, use RETAIN and a sum statement, you do not need a DO loop for this.
And use proper indentation when writing code, as all the people here are constantly showing you.
Hints: do not use an explicit OUTPUT, the DATA step will do this for you. Use one DO loop to scan through the name.
data newdata (keep=Letter_1-Letter_7 name);
retain name;
array myvar (*)$ Letter_1-Letter_7;
set sashelp.class;
do i=1 to dim(myvar);
myvar(i)=upcase(substr(name, i, 1));
end;
proc print noobs;
run;
Thank you Kurt,
Study the documentation of the RETAIN Statement, and then decide if it is needed here, or if it does not serve any purpose at all.
Also look at the length of your newly created variables, you may want to change that, as you only store one character each.
if I were you I would take a bit different approach:
Data;
data have;
set sashelp.class(keep=name) end=E;
output;
if E then
do;
name='空手道';
output;
name='Żółć';
output;
end;
run;
Code:
proc sort data=have out=want;
by name;
run;
data want;
set want;
by name;
do i=1 to KLENGTH(name);
word = KSUBSTR(name,i,1);
output;
end;
run;
proc print;run;
proc transpose data=want out=want(drop=_name_) prefix=word;
by name;
var word;
run;
proc print;run;
Comments:
- Use klength() and ksubstr() to properly handle multi byte characters (which are quite common in names).
- Use data step to cut letters one by one and then use Proc Transpose, with that approach you won't have to assume maximum length of a name imagine you will have "Broomhilda" or "Małgorzata", or this man, which are much longer than 7 letters.
Bart
Something like below should do.
proc sql noprint;
select length into :len trimmed
from dictionary.columns
where libname='SASHELP' and memname='CLASS' and upcase(name)="NAME"
;
quit;
data newdata (keep=name letter_:);
set sashelp.class;
array letter_ (&len) $1;
do _i=1 to length(name);
letter_[_i]=substr(name,_i,1);
end;
run;
proc print data=newdata noobs;
run;
Thank you Patrick for you solution how can get only maximum length names in the above class dataset
@BrahmanandaRao wrote:
Thank you Patrick for you solution how can get only maximum length names in the above class dataset
Please explain in more detail.
Do you want to display the maximun encountered length of a name at the end of the step, or do you want to only process the names having this maximum length, or even subset the dataset to these names?
See Maxim 42.
Hi kurt
i want get which name has maximum length those names only displays
In a first step, determine the maximum length (e.g. with PROC SQL SELECT INTO), then use this value in a WHERE.
proc sql ;
select max(length(Name)) as maxlen into :len separated by ' '
from newdata
having name=max(name);
quit;
MAX of a character value gets the lexically "highest" (e.g. "Z" is higher than "A"), so I ask what you want to achieve with your HAVING clause?
data newdata (keep=Letter_1-Letter_7 name sex);
retain name;
array myvar (*)$ Letter_1-Letter_7;
set sashelp.class;
do i=1 to dim(myvar);
myvar(i)=upcase(substr(name, i, 1));
end;
proc print noobs;
run;
/*Maximum length string in the newdataset */
PROC SQL;
SELECT Name ,letter_1,letter_2,letter_3,letter_4,letter_5,letter_6,letter_7
FROM NEWDATA
WHERE LENGTH(Name) = (SELECT MAX(LENGTH(Name)) FROM NEWDATA);
QUIT;
Please READ my posts, and read them THOROUGHLY. I said you have to determine the maximum length in a FIRST step, so you can then use it in the WHERE condition of the DATA step that splits the name.
/* maximum length names output using datastep*/
data newdata (keep=Letter_1-Letter_7 name sex );
retain name;
array myvar (*)$ Letter_1-Letter_7;
set sashelp.class;
do i=1 to dim(myvar);
myvar(i)=upcase(substr(name, i, 1));
where max(length(name))=7 ;
end;
proc print noobs;
run;
/*max and min names length using proc sql*/
PROC SQL;
SELECT name,sex,letter_1,letter_2,letter_3,letter_4,letter_5,letter_6,letter_7,MAX(LENGTH(Name)) AS MaxLengthFemale
FROM newdata
WHERE Sex = 'F';
SELECT name,sex,letter_1,letter_2,letter_3,letter_4,letter_5,letter_6,letter_7,MIN(LENGTH(Name)) AS MinLengthMale
FROM newdata
WHERE Sex = 'M';
QUIT;
1.How to get min and max names length output in datastep
Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.