Brad5000
Calcite | Level 5

Hi,

I'm trying to read values from a text file, but unfortunately they're held in a pretty weird format. For example, the names 'Pete Smith' and 'John Brown' could be held like this:

 

begin name
pe
te
smi
th
end name
begin name
joh
n
br
o
wn
end name

Is it possible to read these into a dataset using a single input statement or would it need a combination of several inputs? Any advice would be appreciated.

Astounding
PROC Star

A DATA step could handle this except for one feature.  How do you know when the first name ends and the last name begins?

Brad5000
Calcite | Level 5

Hi, yes, that's a problem. For now I would just like to get them into a single variable.

Astounding
PROC Star

OK, here's one way.

 

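data want;
   length name $ 30;
   retain name;           /* keep NAME across iterations while segments accumulate */
   length segment $ 20;
   /* an INFILE (or DATALINES) statement is needed here to supply the input lines */
   input segment &;       /* & lets list input keep a single embedded blank, e.g. 'begin name' */
   if segment='begin name' then name=' ';            /* start of a new name: reset */
   else if segment='end name' then output;           /* name complete: write it out */
   else name = cats(name, segment);                  /* otherwise append the fragment */
run;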

 

But the problem of identifying the separation point between first and last name still exists.
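
If the segments live in an external text file, the same step could probably be pointed at it with an INFILE statement. This is only a sketch: the file name names.txt is a placeholder (not from the original post), and it assumes no segment is longer than 20 characters.

data want;
   /* 'names.txt' is a hypothetical path; substitute the real file */
   infile 'names.txt' truncover;
   length name $ 30 segment $ 20;
   retain name;
   drop segment;
   input segment &;       /* & keeps the single blank in 'begin name' / 'end name' */
   if segment='begin name' then name=' ';
   else if segment='end name' then output;
   else name = cats(name, segment);
run;

TRUNCOVER is there so that a blank or short line simply yields a missing segment instead of making INPUT jump to the next line.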

Brad5000
Calcite | Level 5

Thanks for that. It works well with a RETAIN, so I think it's the best solution for now.

 

data want;
   length name $ 30;
   length segment $ 20;
   retain name;
   drop segment;          /* DROP is a compile-time statement, so it sits outside any DO block */
   input segment &;       /* & keeps the single blank in 'begin name' / 'end name' */
   if segment='begin name' then name=' ';
   else if segment='end name' then output;
   else name = cats(name, segment);
datalines;
begin name
pe
te
smi
th
end name
begin name
joh
n
br
o
wn
end name
;
run;
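
A quick way to check the result is to print the dataset. With the sample lines above, WANT should hold two observations, with NAME equal to petesmith and johnbrown (no space, since nothing in the data marks where the first name ends).

proc print data=want noobs;
   var name;
run;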
 
Ksharp
Super User

Attach a TEXT file to let us test it.

Kurt_Bremser
Super User

With data like that: impossible.

Simply because you cannot state a clear rule for where the words are separated; one could only read in the names as strings without delimiters.

You first need a clear rule where the given name ends and the surname begins, then you can think about code.

The code would then actually be very simple.
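
To illustrate how simple it could get once such a rule exists, here is a sketch under a purely hypothetical assumption: that the file carries a line containing only | between the given name and the surname. Nothing in the original data has such a marker; the marker, the variable names first and last, and the flag in_last are all made up for the example.

data want;
   length first last $ 30 segment $ 20;
   retain first last in_last;
   drop segment in_last;
   input segment &;
   if segment='begin name' then do;
      first=' ';
      last=' ';
      in_last=0;                                /* start by filling FIRST */
   end;
   else if segment='|' then in_last=1;          /* hypothetical separator: switch to LAST */
   else if segment='end name' then output;
   else if in_last then last = cats(last, segment);
   else first = cats(first, segment);
datalines;
begin name
pe
te
|
smi
th
end name
;
run;

With a different rule (a fixed number of segments, a capital letter, and so on) only the branch that flips in_last would need to change.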

Brad5000
Calcite | Level 5
Yes that is an issue, for now getting everything into a single variable will have to do.
