Hi, I have 2 problems with a data set I need to resolve and hoping someone can guide.
1. Variable ID = rci\name.lastname . Need to remove rci\ to get name.lastname
2. Emp_ID number
If EMP_ID starts with 10 then have to convert to sys ID = 1+ digits after the 0 or in other words remove the 0 after the 1 ( always a 7digit sys ID)
E.g. EMP_ID 10980041 = Sys_ID 1980041.
For this i was thinking of just subtracting 900000 from Emp_ID but is there a better way?
If the EMP_ID starts with a 11 or 12 or anything different than 10, add a 1 in front of the EMP_ID
E.g. EMP_ID 12020789 = Sys_ID 112020789 (Always 9 digit Sys_ID).
For this i would add 10,000,000 but is there a better way ?
Actually, for #2, use SUBSTR() for character variable or SUBSTRN() for a numeric variable and you can use CATT() to concatenate the 1 needed.
Those functions, with an IF/ELSE are enough to solve these problems.
@TheNovice wrote:
Hi, I have 2 problems with a data set I need to resolve and hoping someone can guide.
1. Variable ID = rci\name.lastname . Need to remove rci\ to get name.lastname
2. Emp_ID number
If EMP_ID starts with 10 then have to convert to sys ID = 1+ digits after the 0 or in other words remove the 0 after the 1 ( always a 7digit sys ID)
E.g. EMP_ID 10980041 = Sys_ID 1980041.
For this i was thinking of just subtracting 900000 from Emp_ID but is there a better way?
If the EMP_ID starts with a 11 or 12 or anything different than 10, add a 1 in front of the EMP_ID
E.g. EMP_ID 12020789 = Sys_ID 112020789 (Always 9 digit Sys_ID).
For this i would add 10,000,000 but is there a better way ?
So, the problem is that only some ID begin with RCI\. How would i use an If function in tandem to make the change if that is the case?
I did this:
length zip $4;
zip= left('lan i.d'n);
if zip in ('RCI\','rci\','Rci\') then d = substr('lan i.d'n,5,50);
Still stuck on the number problem so I will just add and subtract as I stated in my earlier post
@TheNovice wrote:
So, the problem is that only some ID begin with RCI\. How would i use an If function in tandem to make the change if that is the case?
Please post more representative data and show your full code.
LEFT() doesn't do whatever you think it's doing in your code, it only aligns a variable which has nothing to do with what you're trying to do here.
I would use FIND/INDEX to search for / to see if you want to use SCAN().
No idea what you tried with SUBSTR(), but if you show your work, we can comment on it.
Hi all,
Apologies for the massive delay... the final code was:
Data Need;
set gl;
length zip $4;
zip= left('lan i.d'n);
if zip in ('RCI\','rci\','Rci\') then d = substr('lan i.d'n,5,50);
If 'Employee ID'n < 12000000 then V21_ID = 'Employee ID'n - 9000000;
else V21_ID = 'Employee ID'n + 100000000;
run;
@TheNovice wrote:
Hi all,
Apologies for the massive delay... the final code was:
Data Need;
set gl;
length zip $4;
zip= left('lan i.d'n);
if zip in ('RCI\','rci\','Rci\') then d = substr('lan i.d'n,5,50);
If 'Employee ID'n < 12000000 then V21_ID = 'Employee ID'n - 9000000;
else V21_ID = 'Employee ID'n + 100000000;
run;
You can use the colon modifier to test the beginning of a string.
if left('lan i.d'n) in: ('RCI\','rci\','Rci\') then d = substr('lan i.d'n,5,50);
Or
if upcase(left('lan i.d'n)) =: 'RCI\' then d = substr('lan i.d'n,5,50);
Why are you storing ID variables as numbers instead of strings?
Why do you have such goofy names for your variables? Why not use normal names that do not contain spaces or periods? Then your code will be a lot easier to type and read. You can use the variable LABEL to store more descriptive information about the variable.
The ID variables are numeric because that's how they are stored in other tables that I need to pull data from. Are you suggesting that I convert ID variable to character to then modify?
The goofy names are not my doing. I get a flat file from someone and I have to convert the Employee ID to another ID... What would you suggest? I have never used the LABEL statement but i will look into it
If you need it as number to join you might be stuck with it. SAS only has two types of variables. Fixed length strings or floating point numbers. Storing an ID as a number is confusing since you will never want to take the mean or sum the values. Also floating point numbers can only store a maximum of 16 decimal digits exactly. So you can run into trouble if your ID values get too long.
If you wrote the program to read in the data then just use normal names for your variables in your program.
If you used PROC IMPORT to guess what names to use by reading a text file or a spreadsheet you might get much better names if you set the VALIDVARNAME option to V7 instead of ANY. Then PROC IMPORT will convert report headers like "EMPLOYEE ID" that use invalid names into a name like EMPLOYEE_ID instead.
Thanks you so much explaining. I will test immediately!
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.