BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
mehul4frnds
Obsidian | Level 7

Hello,

 

I want to extract only string or characters from the variable. 

 

my data looks like this:

 

data have1;
input id name$ gen$ ID DATE A B C D$;
datalines;
1 ram male 1 1.1 1 . 1.23456 1/n
2 mohan male 1 1.1 99 1 1.23456 1.1/n
3 shyam male 2 2.1 1 99 1.23456 16/dr
4 radha female 2 2.1 90 1 99 2.00/mn
5 gita female 2 2.1 99 95 1.23456 ?/prp
6 sita female 2 3.1 . 1 1.23456 10mg
7 ranu female 3 2.1 1 . 1.23456 1000mg
8 tina female 3 2.1 . 1 1.23456 n/q
9 shan male 4 8.1 1 1 1.6789 2:3
;
run;

 

I want data look like this:

 

data have1;
id    name    gen      ID     DATE   A   B   C               D      E ;
1     ram       male    1      1.1       1    .    1.23456     1        n
2     mohan  male    1      1.1      99   1   1.23456     1        n
3     shyam   male    2      2.1       1  99   1.23456  16        dr
4     radha    female 2      2.1      90    1 99               2.00   mn
5     gita       female  2     2.1       99 95    1.23456   ?        prp
6     sita       female  2     3.1          .   1    1.23456  10       mg
7     ranu      female  3    2.1          1   .   1.23456 1000     mg
8     tina       female  3     2.1          .   1   1.23456             nq
9     shan     male     4     8.1          1  1   1.6789      2

I have tried following code

data want; set have1;

_numb= scan (D,1,'/','A'); 

run;

This removes all the string characters. But I need a separate variable with remaining characters (Also want to remove '?' from the dataset )

 

Kindly help.

 

Thanks in advance

 

1 ACCEPTED SOLUTION

Accepted Solutions
FriedEgg
SAS Employee
data have1;
input id name $ gen $ ID DATE A B C _D $;
D=scan(_D,1,'/:','a');
E=compress(_D,'/:' || D,'d');
drop _D;
cards;
1 ram male 1 1.1 1 . 1.23456 1/n
2 mohan male 1 1.1 99 1 1.23456 1.1/n
3 shyam male 2 2.1 1 99 1.23456 16/dr
4 radha female 2 2.1 90 1 99 2.00/mn
5 gita female 2 2.1 99 95 1.23456 ?/prp
6 sita female 2 3.1 . 1 1.23456 10mg
7 ranu female 3 2.1 1 . 1.23456 1000mg
8 tina female 3 2.1 . 1 1.23456 n/q
9 shan male 4 8.1 1 1 1.6789 2:3
;
run;
Obs id name gen DATE A B C D E
1 1 ram male 1.1 1 . 1.2346 1 n
2 1 mohan male 1.1 99 1 1.2346 1.1 n
3 2 shyam male 2.1 1 99 1.2346 16 dr
4 2 radha female 2.1 90 1 99.0000 2.00 mn
5 2 gita female 2.1 99 95 1.2346 ? prp
6 2 sita female 3.1 . 1 1.2346 10 mg
7 3 ranu female 2.1 1 . 1.2346 1000 mg
8 3 tina female 2.1 . 1 1.2346   nq
9 4 shan male 8.1 1 1 1.6789 2  

View solution in original post

10 REPLIES 10
HB
Barite | Level 11 HB
Barite | Level 11

Should the last record be 

 

9     shan     male     4     8.1          1  1   1.6789      2     3

 

or does the : instead of the / mean disregard what is after?

 

 

If you are looking to get rid of ?'s what should this records look like?

 

5     gita       female  2     2.1       99 95    1.23456   ?        prp

 

Should it be 

5     gita       female  2     2.1       99 95    1.23456             prp

 

mehul4frnds
Obsidian | Level 7

I want to separate numbers and characters as sometimes string characters are present before / also

RW9
Diamond | Level 26 RW9
Diamond | Level 26

Use scan():

data want;
  set have1;
  e=scan(d,2,"/");
  d=scan(d,1,"/");
run;
HB
Barite | Level 11 HB
Barite | Level 11

I don't think that will split 1000mg into 1000 and mg.

RW9
Diamond | Level 26 RW9
Diamond | Level 26

In which case you need to two logical operations, one for delimited, one for not:

data want;
  set have1;
if index(d,"/") then do;
e=scan(d,2,"/"); d=scan(d,1,"/");
end;
else do;
e=compress(d," ","d");
d=tranwrd(d,e,"");
end; run;

So first, if a / appears use that, else compress out all numbers and set that as e, then remove e from d. 

mehul4frnds
Obsidian | Level 7

Thanks for the reply. I have tried scan() function, but it does not work for those instances where '/ ' is not there such as row number 6,7,and 9.  As It removes all the characters instead.

novinosrin
Tourmaline | Level 20
data have1;
input id name$ gen$ ID DATE A B C D$;
datalines;
1 ram male 1 1.1 1 . 1.23456 1/n
2 mohan male 1 1.1 99 1 1.23456 1.1/n
3 shyam male 2 2.1 1 99 1.23456 16/dr
4 radha female 2 2.1 90 1 99 2.00/mn
5 gita female 2 2.1 99 95 1.23456 ?/prp
6 sita female 2 3.1 . 1 1.23456 10mg
7 ranu female 3 2.1 1 . 1.23456 1000mg
8 tina female 3 2.1 . 1 1.23456 n/q
9 shan male 4 8.1 1 1 1.6789 2:3
;
run;

data want;
set have1;
d1=compress(d,' ','kd');
e=compress(d,'?/','d');
run;
FriedEgg
SAS Employee
data have1;
input id name $ gen $ ID DATE A B C _D $;
D=scan(_D,1,'/:','a');
E=compress(_D,'/:' || D,'d');
drop _D;
cards;
1 ram male 1 1.1 1 . 1.23456 1/n
2 mohan male 1 1.1 99 1 1.23456 1.1/n
3 shyam male 2 2.1 1 99 1.23456 16/dr
4 radha female 2 2.1 90 1 99 2.00/mn
5 gita female 2 2.1 99 95 1.23456 ?/prp
6 sita female 2 3.1 . 1 1.23456 10mg
7 ranu female 3 2.1 1 . 1.23456 1000mg
8 tina female 3 2.1 . 1 1.23456 n/q
9 shan male 4 8.1 1 1 1.6789 2:3
;
run;
Obs id name gen DATE A B C D E
1 1 ram male 1.1 1 . 1.2346 1 n
2 1 mohan male 1.1 99 1 1.2346 1.1 n
3 2 shyam male 2.1 1 99 1.2346 16 dr
4 2 radha female 2.1 90 1 99.0000 2.00 mn
5 2 gita female 2.1 99 95 1.2346 ? prp
6 2 sita female 3.1 . 1 1.2346 10 mg
7 3 ranu female 2.1 1 . 1.2346 1000 mg
8 3 tina female 2.1 . 1 1.2346   nq
9 4 shan male 8.1 1 1 1.6789 2  
HB
Barite | Level 11 HB
Barite | Level 11

I think making that  

'/:?'

 

gets OP to the desired result (dropping questions marks).

 

Although I do think 16/nq becoming 16 and nq while n/q becomes " " and nq  is a little strange.   

FriedEgg
SAS Employee

I agree, but it is what the OP showed as the desired output.

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 10 replies
  • 1712 views
  • 5 likes
  • 5 in conversation