BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
dera
Obsidian | Level 7

Hi,

 

I have a SAS data set with that has two variables

1) a variable that contains the file number

2) a variable contains a text with a clinical diagnosis.

 

The file number is found in its own variable but can also be found at the beginning of the diagnosis text.

I would like to extract only what is after the file number in the diagnosis variable. Sometimes, there is a space bar between the file number and the diagnosis description while sometimes, there no space.

 

Is there a way to tell SAS that when the content of file_number is found in the diagnosis variable, only keep what is after that number?

 

Here is an exemple of what I have :

 

file_numberdiagnosis
123456789file:123456789The patient has…
987654321file:987654321 The patient has…

 

Here is what I want :

 

file_numberdiagnosis
123456789The patient has…
987654321The patient has…

 

Can anyone help me with that?

 

Thank you.

1 ACCEPTED SOLUTION

Accepted Solutions
Astounding
PROC Star

OK, this should do it then:

 

diagnosis = substr(diagnosis, length(diagnosis) + index(diagnosis, strip(file_number)));

 

Give it a try just to verify.

 

Better yet:

 

diagnosis = substr(diagnosis, length(file_number) + index(diagnosis, strip(file_number)));

View solution in original post

7 REPLIES 7
Astounding
PROC Star

Lots of people can help with that.  But the programming would change depending on whether FILE_NUMBER is numeric or character.  So let us know that.  Also, is FILE_NUMBER always present or could it contain a missing value?

dera
Obsidian | Level 7

File number is a character and is always present for each observation. Thank you

Astounding
PROC Star

OK, this should do it then:

 

diagnosis = substr(diagnosis, length(diagnosis) + index(diagnosis, strip(file_number)));

 

Give it a try just to verify.

 

Better yet:

 

diagnosis = substr(diagnosis, length(file_number) + index(diagnosis, strip(file_number)));

dera
Obsidian | Level 7

Thank you Astounding!
I used your idea with a small modification and it worked perfectly. Here is what I did :

diagnosis = substr(diagnosis, length(file_number) + index(diagnosis, strip(file_number)));



Cheers

Reeza
Super User

Take a look at the TRANWRD function which will replace a word with specified value, in this case, a blank string. 

 

https://documentation.sas.com/?docsetId=lefunctionsref&docsetTarget=p0pgemqcslm9uen1tvr5gcrusgrw.htm...

 

There are several examples at the bottom that illustrate how they can be used.

pink_poodle
Barite | Level 11

Yes, here is one way of approaching this question:

 

*** starting position of filenumber inside the diagnosis;

startFinD = FIND(diagnosis, filenumber); 

*** position of the first letter (or blank) of the new diagnosis;

startNewD = startFinD + 9;

*** new diagnosis variable;

diagnosis = STRIP[ SUBSTR(diagnosis, startNewD) ];

 

novinosrin
Tourmaline | Level 20
data have;
input file_number :$10.	diagnosis  & $100.;
cards;
123456789	file:123456789The patient has…
987654321	file:987654321 The patient has…
;

data want;
set have;
substr(diagnosis,1,14)=' ';
diagnosis=strip(diagnosis);
run;

 This assumes the pattern in diagnosis variable file:10 digit numbers is consistent

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 7 replies
  • 2153 views
  • 1 like
  • 5 in conversation