- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
I'm trying to combine the data from several rows into one and having issues. It really isn't proper SAS data as it spans multiple lines, and has no common attribute between the lines.
Its just a txt file that was output from a system we have, that I need to reformat into something more useable.
Essentially I'm looking to have the values for all columns reflected on all rows. Currently the values for Col4 have the values for Col1-3 reflected on a previous row. I need those values to all be on the same row.
My input file looks like this -
Col1 Col2 Col3 Col4
AAA FRED 123
xxxx,15
yyyy,15
zzzz,15
BBB FRED 234
xxxx,15
zzzz,23
CCC FRED 123
yyyy,11
zzzz,15
AAA BARNEY 123
yyyy,07
zzzz,42
BBB BARNEY 123
wwww,03
zzzz,33
I looking for an output file looking like this -
Col1 Col2 Col3 Col4
AAA FRED 123 xxxx,15
AAA FRED 123 yyyy,15
AAA FRED 123 zzzz,15
BBB FRED 234 xxxx,15
BBB FRED 234 zzzz,23
CCC FRED 123 yyyy,11
CCC FRED 123 zzzz,15
AAA BARNEY 123 yyyy,07
AAA BARNEY 123 zzzz,42
BBB BARNEY 123 wwww,03
BBB BARNEY 123 zzzz,33
I can't seem to wrap my head around what is needed to make that happen. Any assistance would be appreciated.
Ultimately I will then take that output and save it as a csv file such that it can be sent off to users who can perform their own filtering on the data within excel.
Accepted Solutions
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Using the UPDATE trick.
data have;
infile cards missover;
input (col1-col4)(:$16.);
retain dummyby 1;
cards;
AAA FRED 123
. . . xxxx,15
. . . yyyy,15
. . . zzzz,15
BBB FRED 234
. . . xxxx,15
. . . zzzz,23
CCC FRED 123
. . . yyyy,11
. . . zzzz,15
AAA BARNEY 123
. . . yyyy,07
. . . zzzz,42
BBB BARNEY 123
. . . wwww,03
. . . zzzz,33
;;;;
run;
proc print;
run;
data want;
update have(keep=dummyby obs=0) have;
by dummyby;
if not missing(col4) then output;
call missing(col4);
drop dummyby;
run;
proc print;
run;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Please repost the text data into a {i} window, so we can be sure about the layout.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Using the UPDATE trick.
data have;
infile cards missover;
input (col1-col4)(:$16.);
retain dummyby 1;
cards;
AAA FRED 123
. . . xxxx,15
. . . yyyy,15
. . . zzzz,15
BBB FRED 234
. . . xxxx,15
. . . zzzz,23
CCC FRED 123
. . . yyyy,11
. . . zzzz,15
AAA BARNEY 123
. . . yyyy,07
. . . zzzz,42
BBB BARNEY 123
. . . wwww,03
. . . zzzz,33
;;;;
run;
proc print;
run;
data want;
update have(keep=dummyby obs=0) have;
by dummyby;
if not missing(col4) then output;
call missing(col4);
drop dummyby;
run;
proc print;
run;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Thanks. Though it looks like I have another issue with my data. I don't have the periods representing the empty values as below. Without those the col4 data gets shifted over to col1, so nothing gets output.
Here I have removed the periods from a couple of the entries -
data have; infile cards missover; input (col1-col4)(:$16.); retain dummyby 1; cards; AAA FRED 123 . . . xxxx,15 . . . yyyy,15 . . . zzzz,15 BBB FRED 234 xxxx,15 zzzz,23 CCC FRED 123 . . . yyyy,11 . . . zzzz,15 AAA BARNEY 123 yyyy,07 zzzz,42 BBB BARNEY 123 . . . wwww,03 . . . zzzz,33 ;;;; run; proc print; run; data want; update have(keep=dummyby obs=0) have; by dummyby; if not missing(col4) then output; call missing(col4); drop dummyby; run; proc print; run;
This is what it now looks like -
Obs col1 col2 col3 col4 dummyby 1 AAA FRED 123 1 2 xxxx,15 1 3 yyyy,15 1 4 zzzz,15 1 5 BBB FRED 234 1 6 xxxx,15 1 7 zzzz,23 1 8 CCC FRED 123 1 9 yyyy,11 1 10 zzzz,15 1 11 AAA BARNEY 123 1 12 yyyy,07 1 13 zzzz,42 1 14 BBB BARNEY 123 1 15 wwww,03 1 16 zzzz,33 1
which then doesn't return those entries(BBB FRED, AAA BARNEY) -
Obs col1 col2 col3 col4 1 AAA FRED 123 xxxx,15 2 AAA FRED 123 yyyy,15 3 AAA FRED 123 zzzz,15 4 CCC FRED 123 yyyy,11 5 CCC FRED 123 zzzz,15 6 BBB BARNEY 123 wwww,03 7 BBB BARNEY 123 zzzz,33
Thanks
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
That just means your INPUT statement will be different.
Looks like your data is in fixed columns so your input statement will look more like:
input col1 $ 1-12 col2 $ 13-24 col3 $25-40 col4 41-50 ;
You will need to look at your data file and figure out the proper column/character numbers for each variable.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Perfect. thanks for the direction here.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
I missed the bit about your input being a FILE. I assumed it was already a SAS data set. This is an example of reading the file directly to obtain the desired output. Notice the use of RETAIN; to retain all variables created in the data step.
data want;
retain;
infile cards missover;
input nibble $3. @1 @;
if not missing(nibble) then do;
input (col1-col3)(:$8.);
delete;
end;
input col4 :$8.;
drop nibble;
cards;
AAA FRED 123
xxxx,15
yyyy,15
zzzz,15
BBB FRED 234
xxxx,15
zzzz,23
CCC FRED 123
yyyy,11
zzzz,15
AAA BARNEY 123
yyyy,07
zzzz,42
BBB BARNEY 123
wwww,03
zzzz,33
;;;;
run;
proc print;
run;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Great. Thanks for this.