BookmarkSubscribeRSS Feed
TheNovice
Quartz | Level 8

Hi, I have 2 problems with a data set I need to resolve and hoping someone can guide.

 

1. Variable ID = rci\name.lastname . Need to remove rci\ to get name.lastname

 

2. Emp_ID number 

 

If EMP_ID starts with 10  then have to convert to sys ID = 1+ digits after the 0  or in other words remove the 0 after the 1 ( always a 7digit sys ID)

E.g. EMP_ID 10980041 = Sys_ID 1980041.

 

For this i was thinking of just subtracting 900000 from Emp_ID but is there a better way?

 

If the EMP_ID starts with a 11 or 12 or anything different than 10, add a 1 in front of the EMP_ID

E.g. EMP_ID 12020789 = Sys_ID 112020789 (Always 9 digit Sys_ID).

 

For this i would add 10,000,000 but is there a better way ?

11 REPLIES 11
Reeza
Super User
For #1 use SCAN() with \ as the delimiter.

For #2 is it a character or numeric variable?
Reeza
Super User

Actually, for #2, use SUBSTR() for character variable or SUBSTRN() for a numeric variable and you can use CATT() to concatenate the 1 needed. 

 

Those functions, with an IF/ELSE are enough to solve these problems. 

 


@TheNovice wrote:

Hi, I have 2 problems with a data set I need to resolve and hoping someone can guide.

 

1. Variable ID = rci\name.lastname . Need to remove rci\ to get name.lastname

 

2. Emp_ID number 

 

If EMP_ID starts with 10  then have to convert to sys ID = 1+ digits after the 0  or in other words remove the 0 after the 1 ( always a 7digit sys ID)

E.g. EMP_ID 10980041 = Sys_ID 1980041.

 

For this i was thinking of just subtracting 900000 from Emp_ID but is there a better way?

 

If the EMP_ID starts with a 11 or 12 or anything different than 10, add a 1 in front of the EMP_ID

E.g. EMP_ID 12020789 = Sys_ID 112020789 (Always 9 digit Sys_ID).

 

For this i would add 10,000,000 but is there a better way ?


 

TheNovice
Quartz | Level 8

So, the problem is that only some ID begin with RCI\. How would i use an If function in tandem to make the change if that is the case?

TheNovice
Quartz | Level 8

I did this:

 

length zip $4;
zip= left('lan i.d'n);
if zip in ('RCI\','rci\','Rci\') then d = substr('lan i.d'n,5,50);


Still stuck on the number problem so I will just add and subtract as I stated in my earlier post

Reeza
Super User

@TheNovice wrote:

So, the problem is that only some ID begin with RCI\. How would i use an If function in tandem to make the change if that is the case?


Please post more representative data and show your full code. 

 

LEFT() doesn't do whatever you think it's doing in your code, it only aligns a variable which has nothing to do with what you're trying to do here. 

 

I would use FIND/INDEX to search for / to see if you want to use SCAN(). 

 

No idea what you tried with SUBSTR(), but if you show your work, we can comment on it. 

TheNovice
Quartz | Level 8

Hi all,

 

Apologies for the massive delay... the final code was:

 

Data Need;
set gl;
length zip $4;
zip= left('lan i.d'n);
if zip in ('RCI\','rci\','Rci\') then d = substr('lan i.d'n,5,50);
If 'Employee ID'n < 12000000 then V21_ID = 'Employee ID'n - 9000000;
else V21_ID = 'Employee ID'n + 100000000;
run;

Tom
Super User Tom
Super User

@TheNovice wrote:

Hi all,

 

Apologies for the massive delay... the final code was:

 

Data Need;
set gl;
length zip $4;
zip= left('lan i.d'n);
if zip in ('RCI\','rci\','Rci\') then d = substr('lan i.d'n,5,50);
If 'Employee ID'n < 12000000 then V21_ID = 'Employee ID'n - 9000000;
else V21_ID = 'Employee ID'n + 100000000;
run;


You can use the colon modifier to test the beginning of a string.

if left('lan i.d'n) in: ('RCI\','rci\','Rci\') then d = substr('lan i.d'n,5,50);

Or

if upcase(left('lan i.d'n)) =: 'RCI\' then d = substr('lan i.d'n,5,50);

 

Why are you storing ID variables as numbers instead of strings?  

Why do you have such goofy names for your variables? Why not use normal names that do not contain spaces or periods?  Then your code will be a lot easier to type and read. You can use the variable LABEL to store more descriptive information about the variable.  

 

TheNovice
Quartz | Level 8

 

The ID variables are numeric because that's how they are stored in other tables that I need to pull data from. Are you suggesting that I convert ID variable to character to then modify?

 

The goofy names are not my doing. I get a flat file from someone and I have to convert the Employee ID to another ID... What would you suggest? I have never used the LABEL statement but i will look into it

Reeza
Super User
When you import your data ensure you have set the following first. Then you'll get your data imported correctly. Given that it's a flat file you have total control over how its imported, and whether a variable is character or numeric. It's actually very bad practice to use PROC IMPORT for a repeatable process because you can't guarantee the imports will be the exact same across changes in SAS or file types. It's a very easy way to introduce errors into your process.

options validvarname=v7;
Tom
Super User Tom
Super User

If you need it as number to join you might be stuck with it.  SAS only has two types of variables. Fixed length strings or floating point numbers.  Storing an ID as a number is confusing since you will never want to take the mean or sum the values.  Also floating point numbers can only store a maximum of 16 decimal digits exactly.  So you can run into trouble if your ID values get too long.

 

If you wrote the program to read in the data then just use normal names for your variables in your program.

 

If you used PROC IMPORT to guess what names to use by reading a text file or a spreadsheet you might get much better names if you set the VALIDVARNAME option to V7 instead of ANY.  Then PROC IMPORT will convert report headers like "EMPLOYEE ID" that use invalid names into a name like EMPLOYEE_ID instead.

TheNovice
Quartz | Level 8

Thanks you so much explaining. I will test immediately!

SAS Innovate 2025: Call for Content

Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!

Submit your idea!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 11 replies
  • 1257 views
  • 1 like
  • 3 in conversation