BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
serge68
Calcite | Level 5

I'm trying to combine the data from several rows into one and having issues. It really isn't proper SAS data as it spans multiple lines, and has no common attribute between the lines.

 

Its just a txt file that was output from a system we have, that I need to reformat into something more useable.

 

Essentially I'm looking to have the values for all columns reflected on all rows. Currently the values for Col4 have the values for Col1-3 reflected on a previous row. I need those values to all be on the same row.  

 

My input file looks like this - 

 

Col1                Col2             Col3    Col4

AAA            FRED                123

                                                            xxxx,15

                                                             yyyy,15

                                                            zzzz,15

BBB            FRED                234

                                                            xxxx,15

                                                             zzzz,23

CCC            FRED                123

                                                            yyyy,11

                                                            zzzz,15

AAA            BARNEY          123

                                                            yyyy,07

                                                            zzzz,42

BBB            BARNEY          123

                                                            wwww,03

                                                            zzzz,33

 

 

I looking for an output file looking like this - 

 

Col1                Col2             Col3    Col4

AAA            FRED                123     xxxx,15

AAA            FRED                123     yyyy,15

AAA            FRED                123     zzzz,15

BBB            FRED                234     xxxx,15

BBB            FRED                234     zzzz,23

CCC            FRED                123     yyyy,11

CCC            FRED                123     zzzz,15

AAA           BARNEY           123     yyyy,07

AAA           BARNEY           123     zzzz,42

BBB           BARNEY           123     wwww,03

BBB           BARNEY           123     zzzz,33

 

 

I can't seem to wrap my head around what is needed to make that happen. Any assistance would be appreciated.

 

Ultimately I will then take that output and save it as a csv file such that it can be sent off to users who can perform their own filtering on the data within excel.

1 ACCEPTED SOLUTION

Accepted Solutions
data_null__
Jade | Level 19

Using the UPDATE trick.

 

data have;
   infile cards missover;
   input (col1-col4)(:$16.);
   retain dummyby 1;
   cards;
AAA            FRED                123
  .               .                  .                      xxxx,15
  .               .                  .                       yyyy,15
  .               .                  .                      zzzz,15
BBB            FRED                234
  .               .                  .                      xxxx,15
  .               .                  .                       zzzz,23
CCC            FRED                123
  .               .                  .                      yyyy,11
  .               .                  .                      zzzz,15
AAA            BARNEY          123
  .                .             .                          yyyy,07
  .                .             .                          zzzz,42
BBB            BARNEY          123
  .                .             .                          wwww,03
  .                .             .                          zzzz,33
;;;;
   run;
   
proc print;
   run;     
data want;
   update have(keep=dummyby obs=0) have;
   by dummyby;
   if not missing(col4) then output;
   call missing(col4);
   drop dummyby;
   run;
proc print;
   run; 

Capture.PNGCapture2.PNG

View solution in original post

7 REPLIES 7
data_null__
Jade | Level 19

Using the UPDATE trick.

 

data have;
   infile cards missover;
   input (col1-col4)(:$16.);
   retain dummyby 1;
   cards;
AAA            FRED                123
  .               .                  .                      xxxx,15
  .               .                  .                       yyyy,15
  .               .                  .                      zzzz,15
BBB            FRED                234
  .               .                  .                      xxxx,15
  .               .                  .                       zzzz,23
CCC            FRED                123
  .               .                  .                      yyyy,11
  .               .                  .                      zzzz,15
AAA            BARNEY          123
  .                .             .                          yyyy,07
  .                .             .                          zzzz,42
BBB            BARNEY          123
  .                .             .                          wwww,03
  .                .             .                          zzzz,33
;;;;
   run;
   
proc print;
   run;     
data want;
   update have(keep=dummyby obs=0) have;
   by dummyby;
   if not missing(col4) then output;
   call missing(col4);
   drop dummyby;
   run;
proc print;
   run; 

Capture.PNGCapture2.PNG

serge68
Calcite | Level 5

Thanks. Though it looks like I have another issue with my data. I don't have the periods representing the empty values as below. Without those the col4 data gets shifted over to col1, so nothing gets output.

 

Here I have removed the periods from a couple of the entries - 

 

data have;                                                           
   infile cards missover;                                            
   input (col1-col4)(:$16.);                                         
   retain dummyby 1;                                                 
   cards;                                                            
AAA            FRED                123                               
  .               .                  .                      xxxx,15  
  .               .                  .                      yyyy,15 
  .               .                  .                      zzzz,15  
BBB            FRED                234                               
                                                            xxxx,15  
                                                            zzzz,23 
CCC            FRED                123                               
  .               .                  .                      yyyy,11  
  .               .                  .                      zzzz,15  
AAA            BARNEY          123                                   
                                                            yyyy,07  
                                                            zzzz,42  
BBB            BARNEY          123                                   
  .                .             .                          wwww,03  
  .                .             .                          zzzz,33  
;;;;                                                                 
   run;                                                              
                                                                     
proc print;                                                          
   run;                                                              
data want;                                                           
   update have(keep=dummyby obs=0) have;                             
   by dummyby;                                                       
   if not missing(col4) then output;                                 
   call missing(col4);                                               
   drop dummyby;                                                     
   run;                                                              
proc print;                                                          
   run;                                                              

This is what it now looks like - 

 

Obs    col1        col2     col3     col4      dummyby 
                                                       
  1    AAA        FRED      123                   1    
  2                                 xxxx,15       1    
  3                                 yyyy,15       1    
  4                                 zzzz,15       1    
  5    BBB        FRED      234                   1    
  6    xxxx,15                                    1    
  7    zzzz,23                                    1    
  8    CCC        FRED      123                   1    
  9                                 yyyy,11       1    
 10                                 zzzz,15       1    
 11    AAA        BARNEY    123                   1    
 12    yyyy,07                                    1    
 13    zzzz,42                                    1    
 14    BBB        BARNEY    123                   1    
 15                                 wwww,03       1    
 16                                 zzzz,33       1    

which then doesn't return those entries(BBB FRED, AAA BARNEY) - 

 

Obs    col1     col2     col3     col4       
                                             
 1     AAA     FRED      123     xxxx,15     
 2     AAA     FRED      123     yyyy,15     
 3     AAA     FRED      123     zzzz,15     
 4     CCC     FRED      123     yyyy,11     
 5     CCC     FRED      123     zzzz,15     
 6     BBB     BARNEY    123     wwww,03     
 7     BBB     BARNEY    123     zzzz,33     

 

Thanks

Tom
Super User Tom
Super User

That just means your INPUT statement will be different.

Looks like your data is in fixed columns so your input statement will look more like:

 

input col1 $ 1-12 col2 $ 13-24 col3 $25-40 col4 41-50 ;

You will need to look at your data file and figure out the proper column/character numbers for each variable.

serge68
Calcite | Level 5

Perfect. thanks for the direction here.

 

data_null__
Jade | Level 19

I missed the bit about your input being a FILE.  I assumed it was already a SAS data set.  This is an example of reading the file directly to obtain the desired output.  Notice the use of RETAIN;  to retain all variables created in the data step.

 

data want;
   retain;
   infile cards missover;
   input nibble $3. @1 @;
   if not missing(nibble) then do;
      input (col1-col3)(:$8.);
      delete;
      end;
   input col4 :$8.;
   drop nibble;
   cards;
AAA            FRED                123
                                                            xxxx,15
                                                             yyyy,15
                                                            zzzz,15
BBB            FRED                234
                                                            xxxx,15
                                                             zzzz,23
CCC            FRED                123
                                                            yyyy,11
                                                            zzzz,15
AAA            BARNEY          123
                                                            yyyy,07
                                                            zzzz,42
BBB            BARNEY          123
                                                            wwww,03
                                                            zzzz,33
;;;;
   run;
proc print;
   run;

Capture.PNG

serge68
Calcite | Level 5

Great. Thanks for this.

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

Mastering the WHERE Clause in PROC SQL

SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 7 replies
  • 845 views
  • 2 likes
  • 4 in conversation