DATA Step, Macro, Functions and more

Omitting an entire row if my first variable is not a number

Accepted Solution Solved
Reply
New Contributor
Posts: 3
Accepted Solution

Omitting an entire row if my first variable is not a number

If I were to prevent any rows of data from showing up in my output if my dattim2 variable is a character instead of a number, how would I do it?

 

input @4 numdate $1. @;
drop numdate;

if numdate in ('0','1','2','3','4','5','6','7','8','9') then
do;
input junk dattim2 $20. junk2 PM25 NO2 Precip PM10 WDir Humidity Temp Atm_pres Wspd_sclr Wspd_vect SO2 NOx O3 Wdir_sclr NO CO;
day = input(substr(dattim2,1,index(dattim2," ")-1),anydtdte10.);
time = input(substr(dattim2,index(dattim2," ")+1),anydtdtm10.);
dattim = dhms(day,0,0,time);
drop junk junk2 day time dattim2;
end;

 

I tried this, but for some reason my first 2 variables stopped showing up (dattim2 and PM25).

 

 

options DATESTYLE=MDY;

data burnaby;
format dattim2 datetime22.;
infile "C:\Users\Takuma\Documents\Burnaby_south_Feb2016.txt" firstobs=3 dsd delimiter="|";
input junk dattim $20. junk2 PM25 NO2 Precip PM10 Wdir_vct Humidity Temp Atm_pres Wspd_sclr Wspd_vect SO2 NOx O3 Wdir_sclr NO CO;
day = input(substr(dattim,1,index(dattim," ")-1),anydtdte10.);
time = input(substr(dattim,index(dattim," ")+1),anydtdtm10.);
dattim2 = dhms(day,0,0,time);
drop junk junk2 dattim day time;
run;


Accepted Solutions
Solution
‎06-25-2017 10:18 AM
Super User
Posts: 5,511

Re: Omitting an entire row if my first variable is not a number

Posted in reply to takumamih

While there may be plenty of additional issues with your program, here is one that may be the critical one.

 

The bottom program starts looking for the variable JUNK at the beginning of the line (column 1).

 

The top program starts looking for JUNK at column 5.  Since the early INPUT statement reads column 4 and then holds the line of data with a trailing @, the later INPUT statement resumes at column 5.  If that's the issue, it's easy enough to fix the INPUT statement to make it read:

 

input @1 junk ......

View solution in original post


All Replies
Super User
Posts: 19,820

Re: Omitting an entire row if my first variable is not a number

Posted in reply to takumamih

It's hard to follow that question. Please use the { i } icon (6&7th icon in the Rich Text editor) to include your code. 

 

I would suggest using two input statements. 

 

input junk dattim2 $20. @;

if <some condition> then do;
*what do you want to happen;
end;
else do;
*what else you want to happen;
end;

One of the sets of code inside the DO/END should likely be your full INPUT statement. 

 

You should also look at the ?? options in INPUT to suppress errors if that's why you're trying to do this.

PROC Star
Posts: 1,760

Re: Omitting an entire row if my first variable is not a number

Posted in reply to takumamih

A drop statement will always executed. It is a compile-time statement. So the fact that it appears in a IF THEN block makes no difference.

Not too sure what you mean by "prevent any rows of data from showing up in my output"  but maybe this is what you want:

data;
  input JUNK DATTIM2 $20. ....;
  DAY    = input(scan(DATTIM2,1,' '),?? anydtdte10.);
  TIME   = input(scan(DATTIM2,2,' '),?? anydtdtm10.);
  DATTIM = dhms(DAY,0,0,TIME);
  if missing(DATTIM) then delete;
Super User
Super User
Posts: 7,970

Re: Omitting an entire row if my first variable is not a number

Posted in reply to takumamih

Sorry, thats very hard to decipher.  Check out the guidance just to the left of Post button when posting a question.  Start by posting example test data in the form of a datastep.  Post example output required.  Explain the logic between the two.  For instance, I cannot understand from your text why you are not reading in the datetime variable from the file directly as a datetime variable?  This is creating all the work you have here.  This will exclude any value which has invalid data of course, however the data import you have with (you do have one don't you) should indicate what the data contains, and thus if the data does not conform to the import agreement (hence why this is so very important) then you rejected it and send it back to source, garbage data in will result in both a garbage process and garbage out.

Solution
‎06-25-2017 10:18 AM
Super User
Posts: 5,511

Re: Omitting an entire row if my first variable is not a number

Posted in reply to takumamih

While there may be plenty of additional issues with your program, here is one that may be the critical one.

 

The bottom program starts looking for the variable JUNK at the beginning of the line (column 1).

 

The top program starts looking for JUNK at column 5.  Since the early INPUT statement reads column 4 and then holds the line of data with a trailing @, the later INPUT statement resumes at column 5.  If that's the issue, it's easy enough to fix the INPUT statement to make it read:

 

input @1 junk ......

Respected Advisor
Posts: 3,156

Re: Omitting an entire row if my first variable is not a number

Posted in reply to takumamih

Another thought is that since you are doing list input and readig in  'dattim' as char anyway, why not read-in everything first, then check the first char of 'dattim', if not digit (may use ANYDIGIT/NOTDIGIT function), then DELETE. Doing so you don't have to count positions.

 

☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 5 replies
  • 159 views
  • 2 likes
  • 6 in conversation