Help using Base SAS procedures

REg:Formats no increase format should be from a,b,c........za,zb,zc

Reply
Contributor
Posts: 66

REg:Formats no increase format should be from a,b,c........za,zb,zc

Hi i am having a dataset having two vaiables no and name of city .i am having 30000 incremantal data of number if no is increasing then
no city
234 mumbai
123 chennai
999 Hyderabad
.
.
.
12456 Delhi
32455 Bhopal

As the city is increasing it should create a format as 1=Mumbai 2=chennai like that.... the no of city names are there.
Super Contributor
Super Contributor
Posts: 3,174

Re: REg:Formats no increase format should be from a,b,c........za,zb,zc

The COLLATE function may work for your needs, or just create a static list of substring character values to reference / assign based on the offset, using SUBSTR and your count variable with a DATA step, as shown below:

data _null_;
retain a2z 'abcdefghijklmnopqrstuvwxyz';
do x=1 to length(a2z);
v = substr(a2z,x,1);
put v= x=;
end;
run;

Scott Barry
SBBWorks, Inc.
Regular Contributor
Posts: 151

Re: REg:Formats no increase format should be from a,b,c........za,zb,zc

Hi.

Try this example. But it isn't clear for me that if there would be 30000 unique ids
then the new_format would be equals 'aaab', 'aaac' and so on. I mean that new_format equals two symbols or it may be more then two symbols.
The example is for two letters.

[pre]

data n; do no=1 to 702 ; output;output;output; end; run;

data t; length f $1. new_format $3. fc $1.; retain fc ' ' f ' ' k 1 j 0 new_format ' '; set n; by no;
if first.no then do;
j=j+1;
if j gt 26 then j=1;
if k gt 26 then k=1;
fc=scan('a b c d e f g h i j k l m n o p q r s t u v w x y z',k,' ');
f=scan('a b c d e f g h i j k l m n o p q r s t u v w x y z',j,' ');
if mod(no,26)=0 and no gt 26 then k=k+1;
if no le 26 then new_format=f; else new_format=fc||f;
end;
keep no new_format;
run;

[/pre]

Regards,
Oleg.
Super User
Posts: 9,681

Re: REg:Formats no increase format should be from a,b,c........za,zb,zc

> now i want the format like this in to new vaiable as
> i am having 30000 observations as no increase it
> should give format like this

> if no is 24566 new_format=za
> if no is 24567 new_format=zb
> if no is 24568 new_format=zc

Hmmmmm...I think format dataset would be a better choice for this kind question.
And as your said, there should be 702 obs(i.e. 26*26+26=702).Counld not be 'no is 24568 ';
And also your 'no' is numeric variable.


[pre]
data format;
length label $2.;
retain fmtname 'fmt' type 'N' ;
do label='a','b','c','d','e','f','g','h','i','j','k','l','m','n','o','p','q','r','s','t','u','v','w','x','y','z' ;
start+1;
output;
end;
do x='a','b','c','d','e','f','g','h','i','j','k','l','m','n','o','p','q','r','s','t','u','v','w','x','y','z' ;
do y='a','b','c','d','e','f','g','h','i','j','k','l','m','n','o','p','q','r','s','t','u','v','w','x','y','z' ;
start+1;
label=cats(x,y);
output;
end;
end;
drop x y;
run;
data n; *To build up your original dataset;
do no=1 to 702 ;
output;
end;
run;
proc format cntlin=format;
run;
data temp;
set n;
new_format=put(no,fmt.);
run;
proc print data=temp(obs=40) noobs;run;
[/pre]



Ksharp
Valued Guide
Posts: 2,175

Re: REg:Formats no increase format should be from a,b,c........za,zb,zc

is this counting in "base 26" (like decimals are base=10, hex is base=16 and binary is base=2)?
But, I'm not sure what letter should take the place of a zero, perhaps it should be base27.
If so, it should be possible to derive the new_format "number" from a decimal number, with something (probably better with logs)
new_format = substr( '0abcdefghijklmnopqrstuvwxyz', 1+int( no/ 27), 1) !! substr( '0abcdefghijklmnopqrstuvwxyz', 1+mod( no, 27), 1) ;

Log base 27 can be derived with the code in the knowledgebase at http://support.sas.com/kb/10/332.html

Will original poster sas_ please explain
1
what the new_format should be for no=27 (the first new_format after " z" )
2
what rules apply for numbers greater than new_format= zz

peterC
Super User
Posts: 9,681

Re: REg:Formats no increase format should be from a,b,c........za,zb,zc

Hi.Peter.
Your 'base 26' would be some right.And something you thought would be further profound.
I also am confused that why this author want to use this mapping.
And your HyperLink,i readed,It is pretty useful. Learned it from you. :-) ;-)
Ask a Question
Discussion stats
  • 5 replies
  • 106 views
  • 0 likes
  • 5 in conversation