I need to create something similar to the following for hundreds of records.
Have | Want | |||
OBS | ID | OBS | ID | |
A | A | SL00001 | ||
B | B | SL00002 | ||
C | C | SL00003 | ||
D | D | SL00004 | ||
E | E | SL00005 | ||
F | F | SL00006 | ||
G | G | SL00007 | ||
H | H | SL00008 | ||
I | I | SL00009 | ||
J | J | SL00010 | ||
K | K | SL00011 |
I need to create a unique 7 digit code for each record starting with SL00001, SL00002, etcetera up to the number of records.
I have been trying things like below to no avail.
PROC SQL;
SELECT COUNT(OBS) AS RecCount
From WORK.DataSample;
QUIT;
DATA WithIDs;
SET WORK.DataSample;
OBS = OBS;
j = N(OBS);
DO DCL INT i;
i = 1 TO j BY 1;
ID = CATS('SL',put(i,z5.));
OUTPUT;
END;
RUN;
PROC PRINT DATA=WithIDs;
RUN;
Any help would be appreciated.
data have;
input OBS $;
cards;
A
B
C
D
E
F
G
H
I
J
K
;
proc sql;
create table want as
select *,cats('SL',put(monotonic(),z5.)) as ID
from have;
quit;
DATA WithIDs;
SET WORK.DataSample;
ID = CATS('SL',put( _n_ ,z5.));
RUN;
Maybe just that instead?
@WB1_2018 wrote:
I need to create something similar to the following for hundreds of records.
Have Want OBS ID OBS ID A A SL00001 B B SL00002 C C SL00003 D D SL00004 E E SL00005 F F SL00006 G G SL00007 H H SL00008 I I SL00009 J J SL00010 K K SL00011
I need to create a unique 7 digit code for each record starting with SL00001, SL00002, etcetera up to the number of records.
I have been trying things like below to no avail.
PROC SQL;
SELECT COUNT(OBS) AS RecCount
From WORK.DataSample;
QUIT;
DATA WithIDs;
SET WORK.DataSample;
OBS = OBS;
j = N(OBS);
DO DCL INT i;
i = 1 TO j BY 1;
ID = CATS('SL',put(i,z5.));
OUTPUT;
END;
RUN;
PROC PRINT DATA=WithIDs;
RUN;
Any help would be appreciated.
SQL doesn't have a real strong connection to sequence, especially if you expect the first value of a current data set to get a specific value.
A data step however does, in fact there is an automatic variable _n_ that is iterated each time the data set does. So this should do what you are requesting. Using the put function with the Z5 format outputs a string of 5 characters with leading 0.
data want; set have; length id $ 7; id = cats('SL',put(_n_,z5.)); run;
@WB1_2018 wrote:
I am trying this:
DATA want;
SET have;
LENGTH ID $ 7;
ID = CATS('SL',put( _n_ ,z5.));
RUN;
But get warning errors below. Is it possible that _n_ is wider than 2 digits?
WARNING: Length of character variable ID has already been set.
Use the LENGTH statement as the very first statement in the DATA STEP to declare the length of a character variable.
WARNING: In a call to the CATS function, the buffer allocated for the result was not long enough to contain the concatenation of
all the arguments. The correct result would contain 7 characters, but the actual result might either be truncated to 1
character(s) or be completely blank, depending on the calling environment. The following note indicates the left-most
argument that caused truncation.
That almost certainly means that you already have an ID variable that exists in your data set and it is too short to hold 7 characters. Placing the Length ID $ 7. before the SET statement would fix that, as the log says.
We did not mention that because your "picture" of data does not carry any characteristics or if the variable ID even already exists.
Thank you for that great insight! That was the issue (user!). LOL.
data have;
input OBS $;
cards;
A
B
C
D
E
F
G
H
I
J
K
;
proc sql;
create table want as
select *,cats('SL',put(monotonic(),z5.)) as ID
from have;
quit;
Well, that was fast! This is my first time using SAS Community. This appears to work fine though I need to research and understand what monotonic is. Thank you this is really cool!
And be aware that the MONOTONIC is undocumented and has no promises of consistent behavior or even presence from release to release of SAS.
Sir @ballardw I agree however here at my bank where I work, we work with 9.3,9.4 etc and it works fine.
Sir @ChrisHemedinger any thoughts from sas inc? Are you opposed to it? Paul D uses a lot, I am going with that confidence as I trust him wholeheartedly
I went with monotonic since I have not been able to get _n_ to work. I would use it, if I could.
Hi @WB1_2018 In most cases monotonic() does work though it's not documented like @ballardw mentioned. Yes, so that does require attention. Let me admit, my true experience in the real world is limited unlike many veterans here whose experience equals my age. lol
The reason I posted monotonic() is because -
1. Many whom I work with are all SQL users who would rather prefer SQL spaghetti code as opposed to any classy SAS syntax
2. We use 9.3,9.4 in our various departments i.e credit risk, marketing, operations etc. Monotonic() transcends across like bread and butter. So that gave me the confidence to offer
3. I have seen my Guru Paul D @ hashman offering more than couple of times here, and that gives me even more confidence.
With all the above being mentioned, it's good to be aware just in case!
Here's the thing about monotonic(). It's not documented, but it is well-known in SAS lore as a way to simulate a "record number" function in PROC SQL.
However, it's not an SQL function. This works too:
data a;
set sashelp.class;
recordnum = monotonic();
run;
But you don't need this in a DATA step usually, because you have the _N_ automatic variable. (Although _N_ isn't really "obs number" -- it's the number of times you've been through the implicit loop.)
Because it's a SAS function and not specific to SQL...you can technically use the function in PROC SQL. But the results are nondeterministic (that is...not predictable) in some SQL contexts where PROC SQL might optimize your query. In general, people find it works okay for the simple example that the OP asked about here. But it's not guaranteed, and would be more likely to break down in a more complex query with joins or passthrough to a database. "Observation numbers" are a concept that makes sense in DATA step and an analytical data context where "order matters" - but they don't make so much sense in a relational data model.
If your goal is a unique key for your record, and not an ordinal record number, consider using the UUIDGEN function.
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.