Hi,
my data is just a list of words (txt file) and I managed to import data into a table.
Since the lines have different number of words, the imported data have different number of columns.
The row with the most columns has 42 columns.
What I want to do is if a row has more than 10 columns (that is, if a row has more than 10 words) move the following columns to the next row
For example (5 columns in this case)
if I have this as my data,
a b c d e f g h i
a b c d
a b c d e f g
a b
i want to make
a b c d e
f g h i
a b c d
a b c d e
f g
a b
Can I do this in SAS?
It is not hard to do when just reading from a text. First let's make a sample text file.
options parmcards=txt;
filename txt temp;
parmcards4;
a b c d e f g h i
a b c d
a b c d e f g
a b
;;;;
Now just read 5 words at a time and hold the line. Stop when you have read past the end of the line.
data want ;
infile txt truncover length=ll column=cc ;
row+1;
do until(cc>ll);
input (v1-v5) (:$10.) @;
output;
end;
run;
Results:
Obs row v1 v2 v3 v4 v5 1 1 a b c d e 2 1 f g h i 3 2 a b c d 4 3 a b c d e 5 3 f g 6 4 a b
Hard to see a benefit of this somewhat arbitrary re-arranging of the data.
If you have a b c d e f g h i then you know f is in the sixth position.
If you modify this to a b c d e in row 1 and f g h i in row 2, then a in row 1 and f in row 2 are in the same columns and have the same variable name, and you lose the relationship between the character strings in the original data.
Why do you want to do this?
I understand that. I still don't see any benefit from doing this conversion of the data to have just 5 in a row. What can you do with this conversion that you wouldn't be able to do if you left it unchanged?
Well, I'm sure there is a way, but I don't have any brilliant ideas right now. Maybe some of the hash experts in the SAS Communities can do this quickly (but I am not one of them).
One variable would be better than 5 or 10.
If you have a unique row identifier just use PROC TRANSPOSE.
proc transpose data=have out=want;
by rowid;
var var1-var10;
run;
Check next tested code:
data have;
retain nwords;
infile cards truncover;
input text $char80.;
nwords = countw(text);
cards;
a b c d e f g h i
a b c d
a b c d e f g
a b
; run;
%let maxo=4;
data want(keep=wout1-wout&maxo);
length wout1-wout&maxo $5;
set have;
array wordot {&maxo} $ wout1-wout&maxo;
j=0; i=0;
do until (i=nwords);
do until (j=&maxo or i=nwords);
j = j+1; i=i+1; put i= j=;
wordot(j) = scan(text,i);
end;
if wout1 ne ' ' then output;
do j=1 to &maxo; wordot(j) = ' '; end;
j=0;
end;
run;
Why are you splitting the text into columns of one word each ?
Why not just splitting the long text into limited length sub-strings ?
It is not hard to do when just reading from a text. First let's make a sample text file.
options parmcards=txt;
filename txt temp;
parmcards4;
a b c d e f g h i
a b c d
a b c d e f g
a b
;;;;
Now just read 5 words at a time and hold the line. Stop when you have read past the end of the line.
data want ;
infile txt truncover length=ll column=cc ;
row+1;
do until(cc>ll);
input (v1-v5) (:$10.) @;
output;
end;
run;
Results:
Obs row v1 v2 v3 v4 v5 1 1 a b c d e 2 1 f g h i 3 2 a b c d 4 3 a b c d e 5 3 f g 6 4 a b
data want ; infile cards truncover ; row+1; input (v1-v5) (:$10.) @; do while( cmiss(of v1-v5) <5); output; input (v1-v5) (:$10.) @; end; cards; a b c d e f g h i a b c d a b c d e f g a b ;
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.
Find more tutorials on the SAS Users YouTube channel.