Reading records spread over a varying number of lines

Occasional Contributor
Posts: 6

Hi,

I'm trying to read values from a text file, but unfortunately they're held in a pretty weird format. For example, the names 'Pete Smith' and 'John Brown' could be held like this:

begin name
pe
te
smi
th
end name
begin name
joh
n
br
o
wn
end name

Is it possible to read these into a dataset using a single INPUT statement, or would it need a combination of several? Any advice would be appreciated.



All Replies
Super User
Posts: 5,082

Re: Reading records spread over a varying number of lines

A DATA step could handle this except for one feature.  How do you know when the first name ends and the last name begins?

Occasional Contributor
Posts: 6

Re: Reading records spread over a varying number of lines

Hi, yes, that's a problem. For now I would just like to get them into a single variable.

Solution
‎06-13-2016 03:35 AM
Super User
Posts: 5,082

Re: Reading records spread over a varying number of lines

[ Edited ]

OK, here's one way.

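data want;
   infile 'names.txt';   /* hypothetical path: point this at your text file */
   length name $ 30;
   retain name;          /* keep the partial name across input lines */
   length segment $ 20;
   input segment &;      /* & lets a value contain single blanks, e.g. "begin name" */
   if segment='begin name' then name=' ';    /* start a new name */
   else if segment='end name' then output;   /* name complete: write one observation */
   else name = cats(name, segment);          /* append this fragment */
run;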

But the problem of identifying the separation point between first and last name still exists.

Occasional Contributor
Posts: 6

Re: Reading records spread over a varying number of lines

Thanks for that. It works well with RETAIN, so it's the best solution for now, I think.

 

data want;
   length name $ 30;
   length segment $ 20;
   retain name;      /* carry the partial name from one line to the next */
   drop segment;     /* compile-time statement, so it applies to the whole step */
   input segment &;
   if segment='begin name' then name=' ';
   else if segment='end name' then output;
   else name = cats(name, segment);
datalines;
begin name
pe
te
smi
th
end name
begin name
joh
n
br
o
wn
end name
;
run;
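
A quick way to confirm the result, as a minimal sketch that assumes the step above has already run and created WORK.WANT:

```sas
/* List the observations in WANT; NAME should hold one concatenated name per row. */
proc print data=want;
run;
```

With the sample data this should list two observations, petesmith and johnbrown.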
 
Super User
Posts: 9,681

Re: Reading records spread over a varying number of lines

Attach a TEXT file to let us test it.

Super User
Posts: 6,938

Re: Reading records spread over a varying number of lines

With data like that: impossible.

Simply because there is no clear rule for where the words are separated; one could only read in the names as strings without delimiters.

You first need a clear rule where the given name ends and the surname begins, then you can think about code.

The code would then actually be very simple.
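
To illustrate how simple it becomes once such a rule exists, here is a rough sketch. It assumes, purely for illustration, that the file marks the word boundary with a line reading "space". That marker and the new_word flag are hypothetical; nothing like them appears in the posted data.

```sas
/* Sketch only: "space" is an assumed marker line, not part of the original file. */
data want;
   length name $ 30 segment $ 20;
   retain name new_word;     /* keep the partial name and the boundary flag across lines */
   drop segment new_word;
   input segment &;          /* & allows single embedded blanks such as "begin name" */
   if segment = 'begin name' then do;
      name = ' ';
      new_word = 0;
   end;
   else if segment = 'end name' then output;
   else if segment = 'space' then new_word = 1;   /* next fragment starts a new word */
   else if new_word then do;
      name = catx(' ', name, segment);            /* join with one blank */
      new_word = 0;
   end;
   else name = cats(name, segment);               /* join with no blank */
datalines;
begin name
pe
te
space
smi
th
end name
;
run;
```

A flag is used rather than appending a blank to name, because CATS strips trailing blanks from its arguments and the blank would be lost on the next fragment.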

Occasional Contributor
Posts: 6

Re: Reading records spread over a varying number of lines

Yes, that is an issue. For now, getting everything into a single variable will have to do.
☑ This topic is SOLVED.