I would like to import all the rows of a csv file, except the rows satisfyng some conditions together with the previous and/or following rows. For example, in the following example I would like to import all rows, except those
- in which var1 = paris, and the previous row.
- in which var1 = madrid, and both the previous and the following row.
data want;
infile datalines;
input
var1 $
var2
var3;
datalines;
abc 73 985
abc 132 163
paris 220 193
gft 685 741
cde 195 163
paris 220 198
cde 173 155
abe 135 116
madrid 735 161
" 173 155;
run;
It follows that the dataset want only contains the observations "abc 73 985" and "gft 685 741".
Any help is appreciated.
Thanks!!
Are you opposed to importing first and then doing the subsetting?
Ideally, I would like to do the subsetting while importing.
The reason is that the rows that I would like to leave out contain invalid data, and SAS throws errors in the log when I import them.
Thanks.
As shown the "gft 685 741" can't match your requirement as it is on the same row as 'paris'.
Please correct you datalines section.
Also if the " is supposed to represent a blank/missing value that is incorrect. Use a single period instead. The double quote becomes the value of a variable as does two single quotes.
Thank you for spotting that. For some reason I cannot write it in the correct way as a SAS code. But yes, it should be:
data want;
infile datalines;
input var1 $ var2 var3;
datalines;
abc 73 985
abc 132 163
paris 220 193
gft 685 741
cde 195 163
paris 220 198
cde 173 155
abe 135 116
madrid 735 161
" 173 155;
run;
The " could be the value of a variable, but if creates confusion I could replace it with anything else.
On the other hand, there could also be blank\missing values.
The following may be overkill, but it does let you read the data twice, get rid of the undesired records, yet result it no errors in the log:
data recs (keep=_recnum); infile '/folders/myfolders/datahave.txt' end=eof; do until (eof); input; rec_count+1; if scan(_infile_,1) eq 'paris' then do; _recnum=rec_count; output; _recnum=rec_count-1; output; end; else if scan(_infile_,1) eq 'madrid' then do; _recnum=rec_count; output; _recnum=rec_count-1; output; _recnum=rec_count+1; output; end; end; run; data want (drop=_:); array _drops (999999) _temporary_; do until (eof1); set recs end=eof1; _rec_num+1; _drops(_rec_num)=_recnum; end; _rec_num=0; infile '/folders/myfolders/datahave.txt' end=eof2; do until (eof2); _rec_num+1; input @; if _rec_num not in _drops then do; input var1$ var2 var3; output; end; else input; end; run;
Art, CEO, AnalystFinder.com
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.