data newdata (keep=word1-word7 name seqno); retain seqno name; array myvar (*)$ word1-word7; set sashelp.class ; by name notsorted; do seqno=1 to 7; do i= 1 to 7; myvar(i)=substr(Name,i,1); if last.name then output; output; end; end; proc print noobs; run;
how to get complete split letters names output like below
Why do you have two OUTPUT statements? Why do you use BY when there is only one observation per name anyway?
To create a sequential count variable, use RETAIN and a sum statement, you do not need a DO loop for this.
And use proper indentation when writing code, as all the people here are constantly showing you.
Hints: do not use an explicit OUTPUT, the DATA step will do this for you. Use one DO loop to scan through the name.
Study the documentation of the RETAIN Statement, and then decide if it is needed here, or if it does not serve any purpose at all.
Also look at the length of your newly created variables, you may want to change that, as you only store one character each.
if I were you I would take a bit different approach:
data have; set sashelp.class(keep=name) end=E; output; if E then do; name='空手道'; output; name='Żółć'; output; end; run;
proc sort data=have out=want; by name; run; data want; set want; by name; do i=1 to KLENGTH(name); word = KSUBSTR(name,i,1); output; end; run; proc print;run; proc transpose data=want out=want(drop=_name_) prefix=word; by name; var word; run; proc print;run;
- Use klength() and ksubstr() to properly handle multi byte characters (which are quite common in names).
- Use data step to cut letters one by one and then use Proc Transpose, with that approach you won't have to assume maximum length of a name imagine you will have "Broomhilda" or "Małgorzata", or this man, which are much longer than 7 letters.
Something like below should do.
proc sql noprint; select length into :len trimmed from dictionary.columns where libname='SASHELP' and memname='CLASS' and upcase(name)="NAME" ; quit; data newdata (keep=name letter_:); set sashelp.class; array letter_ (&len) $1; do _i=1 to length(name); letter_[_i]=substr(name,_i,1); end; run; proc print data=newdata noobs; run;
Thank you Patrick for you solution how can get only maximum length names in the above class dataset
Please explain in more detail.
Do you want to display the maximun encountered length of a name at the end of the step, or do you want to only process the names having this maximum length, or even subset the dataset to these names?
See Maxim 42.
In a first step, determine the maximum length (e.g. with PROC SQL SELECT INTO), then use this value in a WHERE.
MAX of a character value gets the lexically "highest" (e.g. "Z" is higher than "A"), so I ask what you want to achieve with your HAVING clause?
data newdata (keep=Letter_1-Letter_7 name sex); retain name; array myvar (*)$ Letter_1-Letter_7; set sashelp.class; do i=1 to dim(myvar); myvar(i)=upcase(substr(name, i, 1)); end; proc print noobs; run; /*Maximum length string in the newdataset */ PROC SQL; SELECT Name ,letter_1,letter_2,letter_3,letter_4,letter_5,letter_6,letter_7 FROM NEWDATA WHERE LENGTH(Name) = (SELECT MAX(LENGTH(Name)) FROM NEWDATA); QUIT;
Please READ my posts, and read them THOROUGHLY. I said you have to determine the maximum length in a FIRST step, so you can then use it in the WHERE condition of the DATA step that splits the name.
/* maximum length names output using datastep*/ data newdata (keep=Letter_1-Letter_7 name sex ); retain name; array myvar (*)$ Letter_1-Letter_7; set sashelp.class; do i=1 to dim(myvar); myvar(i)=upcase(substr(name, i, 1)); where max(length(name))=7 ; end; proc print noobs; run; /*max and min names length using proc sql*/ PROC SQL; SELECT name,sex,letter_1,letter_2,letter_3,letter_4,letter_5,letter_6,letter_7,MAX(LENGTH(Name)) AS MaxLengthFemale FROM newdata WHERE Sex = 'F'; SELECT name,sex,letter_1,letter_2,letter_3,letter_4,letter_5,letter_6,letter_7,MIN(LENGTH(Name)) AS MinLengthMale FROM newdata WHERE Sex = 'M'; QUIT;
1.How to get min and max names length output in datastep
Registration is open! SAS is returning to Vegas for an AI and analytics experience like no other! Whether you're an executive, manager, end user or SAS partner, SAS Innovate is designed for everyone on your team. Register for just $495 by 12/31/2023.
If you are interested in speaking, there is still time to submit a session idea. More details are posted on the website.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.