Hi everyone,
I was given a dataset with first name and last name variables and asked to concatenate these and then sort the letters of the full name alphabetically. I have managed to concatenate the names, but I am struggling to sort the letters of the full name. Hopefully the example below will clear things up.
Data I had:
fname sname
Jim Smith
Tom Bell
Data currently:
fullname
JimSmith
TomBell
Data I need:
fullname2
hiiJmmSt
BellmoT
Apologies if this has been answered but I have not found anything on my searches!
Thanks in advance
Thanks for the advice, it is much appreciated!
Data currently;
input fullname $20.;
datalines;
JimSmith
TomBell
;
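data want;
  set currently;
  array t(50) $ _temporary_;            /* one letter per element */
  call missing(of t(*));                /* temporary arrays are retained, so reset per row */
  do _n_ = 1 to length(strip(fullname));
    t(_n_) = lowcase(char(fullname, _n_));   /* note: every letter is lowercased */
  end;
  call sortc(of t(*));                  /* unused (blank) elements sort first */
  newname = cats(of t(*));              /* CATS drops the blank elements */
run;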
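One thing to note about the step above: LOWCASE turns every letter into lowercase, so it gives hiijmmst rather than the hiiJmmSt shown in the question. If the original case has to survive, one possible variant (a sketch only, assuming an ASCII session and names of at most 20 characters) is to sort two-character pairs made of the lowercased letter plus the original letter, and then keep the second character of each pair. The names want_case, pair and i are just mine for illustration.

```sas
/* Same sample data as in the post above */
data currently;
  input fullname $20.;
  datalines;
JimSmith
TomBell
;

data want_case;
  set currently;
  length newname $20;
  array pair(20) $2 _temporary_;   /* sort key: lowercase letter + original letter */
  call missing(of pair(*));        /* reset the retained array for each row */
  do i = 1 to length(fullname);
    pair(i) = cats(lowcase(char(fullname, i)), char(fullname, i));
  end;
  call sortc(of pair(*));          /* blank (unused) elements sort first */
  do i = 1 to dim(pair);
    /* rebuild the name from the original-case letter of each pair */
    if not missing(pair(i)) then newname = cats(newname, char(pair(i), 2));
  end;
  drop i;
run;
```

With the two sample rows this should give hiiJmmSt and BellmoT, i.e. the case-insensitive order with the original capitals kept.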
Define an array, put your chars into the array elements and then use the CALL SORTC.
What is the use case for this data manipulation?
data matching with another work data set.
Hmm. "Tim Mott" and "Tom Mitt" will not match in name form, but will match in sorted anagram form. That could be interesting.
@skibur I think your question has been answered. Please close the thread. It's far too simple a question.
Use COMPRESS with the whole alphabet as the first argument, your text as the second argument, and the 'k' modifier, so that only the alphabet letters that occur in your text are kept, in the order they appear in the alphabet.
data test;
input fname $ sname $ ;
datalines;
John Smith
Tom Bell
;
data test;
set test;
fullname = cats(fname, sname);
fullname2 = compress('AaBbCcDdEeFfGgHhIiJjKkLlMmNnOoPpQqRrSsTtUuVvWwXxYyZz', fullname, 'k');
run;
@bodowd : Very nice, because you can trivially change the desired order of characters in the result:
data test;
set test;
fullname = cats(fname, sname);
fullname2 = compress('AaBbCcDdEeFfGgHhIiJjKkLlMmNnOoPpQqRrSsTtUuVvWwXxYyZz', fullname, 'k');
fullname3 = compress('ZzYyXxWwVvUuTtSsRrQqPpOoNnMmLlKkJjIiHhGgFfEeDdCcBbAa', fullname, 'k');
run;
But be careful: in the absence of a LENGTH declaration, the lengths of FULLNAME2 and FULLNAME3 are determined by the first argument of the COMPRESS function, i.e. length 52 in this case.
Also, letters that appear more than once in the original appear only once in the result.
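To make both caveats concrete, here is a small sketch using the name from the original question. The data set name demo and the variables n_in and n_out are only for illustration, and the $20 length is an assumption about how long the names can get.

```sas
data demo;
  /* Without this LENGTH statement, fullname2 would get length 52 */
  length fullname fullname2 $20;
  fullname  = 'JimSmith';
  fullname2 = compress('AaBbCcDdEeFfGgHhIiJjKkLlMmNnOoPpQqRrSsTtUuVvWwXxYyZz', fullname, 'k');
  n_in  = length(fullname);    /* 8 letters in the original */
  n_out = length(fullname2);   /* 6 letters: the second i and m are lost */
  put fullname2= n_in= n_out=; /* log shows fullname2=hiJmSt n_in=8 n_out=6 */
run;
```

So whenever n_out is smaller than n_in, the name contained repeated letters, and the COMPRESS result is not the full sorted anagram asked for in the question.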