Brad5000
Calcite | Level 5

Hi,

I'm trying to read values from a text file, but unfortunately they're held in a pretty weird format. For example, the names 'Pete Smith' and 'John Brown' could be held like this:

 

begin name
pe
te
smi
th
end name
begin name
joh
n
br
o
wn
end name

Is it possible to read these into a dataset using a single input statement or would it need a combination of several inputs? Any advice would be appreciated.

Astounding
PROC Star

A DATA step could handle this except for one feature.  How do you know when the first name ends and the last name begins?

Brad5000
Calcite | Level 5

Hi, yes, that's a problem. For now I would just like to get them into a single variable.

Astounding
PROC Star

OK, here's one way.

 

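data want;
   length name $ 30 segment $ 20;
   retain name;                             /* carry the partial name across input lines */
   infile 'names.txt' truncover;            /* placeholder path: point this at your own text file */
   input segment &;                         /* & lets a value such as 'begin name' contain a single blank */
   if segment='begin name' then name=' ';   /* start a new name */
   else if segment='end name' then output;  /* name is complete, so write it out */
   else name = cats(name, segment);         /* append this fragment to the name */
run;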

 

But the problem of identifying the separation point between first and last name still exists.
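If it helps to postpone that decision, one option is to keep the fragment boundaries instead of discarding them, so the candidate split points are still visible in the result. This is only a sketch under the same 'begin name' / 'end name' markers: swapping CATS for CATX with a blank delimiter and widening name to $ 60 are my assumptions, not part of the code above.

```sas
data want;
   length name $ 60 segment $ 20;   /* wider name: the added blanks take up room */
   retain name;
   drop segment;
   input segment &;
   if segment = 'begin name' then name = ' ';
   else if segment = 'end name' then output;
   else name = catx(' ', name, segment);   /* one blank between fragments */
datalines;
begin name
pe
te
smi
th
end name
begin name
joh
n
br
o
wn
end name
;
run;
```

With the sample lines this should leave name holding values like 'pe te smi th', which can be squeezed later with COMPRESS once a splitting rule is known.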

Brad5000
Calcite | Level 5

Thanks for that. It works well with a RETAIN, so I think it's the best solution for now.

 

data want;
   length name $ 30 segment $ 20;
   retain name;        /* keep the partial name across input lines */
   drop segment;       /* only the assembled name is kept */
   input segment &;    /* & allows the single blank in 'begin name' / 'end name' */
   if segment='begin name' then name=' ';
   else if segment='end name' then output;
   else name = cats(name, segment);
datalines;
begin name
pe
te
smi
th
end name
begin name
joh
n
br
o
wn
end name
;
run;
 
Ksharp
Super User

Attach a TEXT file to let us test it.

Kurt_Bremser
Super User

With data like that: impossible.

Simply because you cannot state a clear rule for where the words are separated; one could only read in the names as strings without delimiters.

You first need a clear rule where the given name ends and the surname begins, then you can think about code.

The code would then actually be very simple.
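To illustrate how simple it could be once such a rule exists, here is a sketch that assumes a purely hypothetical rule: a line reading 'name break' sits between the given name and the surname. Nothing in the original file has such a line, and the variables given, surname and in_surname are names made up for this example.

```sas
/* Hypothetical input: a 'name break' line (not present in the original data)
   marks where the given name ends and the surname begins. */
data want;
   length given surname $ 30 segment $ 20;
   retain given surname in_surname;
   drop segment in_surname;
   input segment &;
   if segment = 'begin name' then do;
      given = ' ';
      surname = ' ';
      in_surname = 0;               /* fragments go to the given name first */
   end;
   else if segment = 'name break' then in_surname = 1;
   else if segment = 'end name' then output;
   else if in_surname then surname = cats(surname, segment);
   else given = cats(given, segment);
datalines;
begin name
pe
te
name break
smi
th
end name
begin name
joh
n
name break
br
o
wn
end name
;
run;
```

Any other rule (a fixed number of fragments, a capital letter, a lookup list of given names) would replace only the 'name break' test; the RETAIN and CATS logic stays the same.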

Brad5000
Calcite | Level 5
Yes, that is an issue. For now, getting everything into a single variable will have to do.
