BookmarkSubscribeRSS Feed
user_
Calcite | Level 5

I would like to import all the rows of a csv file, except the rows satisfyng some conditions together with the previous and/or following rows. For example, in the following example I would like to import all rows, except those 

- in which var1 = paris, and the previous row.

- in which var1 = madrid, and both the previous and the following row.

 

 data want;
 
 infile datalines;
 input 
   var1 $
   var2
   var3;
 datalines;
 abc 73 985
 abc 132  163
 paris 220  193 
gft 685 741 cde 195 163 paris 220 198 cde 173 155 abe 135 116 madrid 735 161 " 173 155; run;

 

It follows that the dataset want only contains the observations "abc 73 985" and "gft 685 741".

 

Any help is appreciated.

 

Thanks!!

5 REPLIES 5
collinelliot
Barite | Level 11

Are you opposed to importing first and then doing the subsetting?

user_
Calcite | Level 5

Ideally, I would like to do the subsetting while importing.

The reason is that the rows that I would like to leave out contain invalid data, and SAS throws errors in the log when I import them.

 

Thanks.

ballardw
Super User

As shown the "gft 685 741" can't match your requirement as it is on the same row as 'paris'.

Please correct you datalines section.

 

Also if the " is supposed to represent a blank/missing value that is incorrect. Use a single period instead. The double quote becomes the value of a variable as does two single quotes.

user_
Calcite | Level 5

Thank you for spotting that. For some reason I cannot write it in the correct way as a SAS code. But yes, it should be:

 

data want;

infile datalines;

input var1 $ var2 var3;

datalines;

abc 73 985

abc 132 163

paris 220 193
gft 685 741

cde 195 163

paris 220 198

cde 173 155

abe 135 116

madrid 735 161

" 173 155;

run;

 

The " could be the value of a variable, but if creates confusion I could replace it with anything else.

On the other hand, there could also be blank\missing values.

art297
Opal | Level 21

The following may be overkill, but it does let you read the data twice, get rid of the undesired records, yet result it no errors in the log:

 

data recs (keep=_recnum);
  infile '/folders/myfolders/datahave.txt' end=eof;
  do until (eof);
    input;
    rec_count+1;
    if scan(_infile_,1) eq 'paris' then do;
      _recnum=rec_count;
      output;
      _recnum=rec_count-1;
      output;
    end;
    else if scan(_infile_,1) eq 'madrid' then do;
      _recnum=rec_count;
      output;
      _recnum=rec_count-1;
      output;
      _recnum=rec_count+1;
      output;
    end;
  end;
run;
data want (drop=_:);
  array _drops (999999) _temporary_;
  do until (eof1);
    set recs end=eof1;
    _rec_num+1;
    _drops(_rec_num)=_recnum;
  end;
  _rec_num=0;
  infile '/folders/myfolders/datahave.txt' end=eof2;
  do until (eof2);
    _rec_num+1;
    input @;
    if _rec_num not in _drops then do;
      input var1$ var2 var3;
      output;
    end;
    else input;
  end;
run;

Art, CEO, AnalystFinder.com

 

SAS Innovate 2025: Register Now

Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 5 replies
  • 1005 views
  • 0 likes
  • 4 in conversation