Help using Base SAS procedures

deleting row with missing observation

Reply
Contributor
Posts: 21

deleting row with missing observation

data ResidentialSalesAuckland;
infile 'H:\stats301\A2\ResidentialSalesAuckland.dat';

input Roadname $27. Suburb $28-55 Townorcity $56-65 TLA $66-80 District $81-90
ListPrice 91-99 SalePrice 100-107 @108 ListDate ArrementDate UnconditionalDate @139 DaysOnMarket 7.
Bedrooms 147 LandArea 6. New $2. Valuation 157-165 ValuationYear 4.;

Informat Listdate mmddyy10. ArrementDate mmddyy10. UnconditionalDate mmddyy10.;
run;

proc print data=ResidentialSalesAuckland (obs=20);

title 'residential';
format ListDate date9. ArrementDate date9. UnconditionalDate date9.;
run;

Hi, I'm trying to delete the row with missing observation but i don't know why it's not working.

 

I have two questions to ask...

 

 

1. The first code before i put function 'if ListPrice =. then delete;'  was looking like above. 

There are some errors showed up too like this but what's the problem with 'UnconditionalDate'??

 

this is the error from the first code:

 

0 data ResidentialSalesAuckland;
451 infile 'H:\stats301\A2\ResidentialSalesAuckland.dat';
452
453 input Roadname $27. Suburb $28-55 Townorcity $56-65 TLA $66-80 District $81-90
454 ListPrice 91-99 SalePrice 100-107 @108 ListDate ArrementDate UnconditionalDate @139
454! DaysOnMarket 7.
455 Bedrooms 147 LandArea 6. New $2. Valuation 157-165 ValuationYear 4.;
456
457 Informat Listdate mmddyy10. ArrementDate mmddyy10. UnconditionalDate mmddyy10.;
458 run;

NOTE: The infile 'H:\stats301\A2\ResidentialSalesAuckland.dat' is:
Filename=H:\stats301\A2\ResidentialSalesAuckland.dat,
RECFM=V,LRECL=32767,File Size (bytes)=120390,
Last Modified=17 August 2017 10:29:46,
Create Time=17 August 2017 10:29:45

NOTE: Invalid data for UnconditionalDate in line 13 147-147.
RULE: ----+----1----+----2----+----3----+----4----+----5----+----6----+----7----+----8----+---
13 Newell St Point Chevalier Auckland City Auckland
89 624000 4/29/2005 5/16/2005 3 N 169
Roadname=Newell St Suburb=Point Chevalier Townorcity= TLA=Auckland City District=Auckland
ListPrice=. SalePrice=624000 ListDate=16555 ArrementDate=16572 UnconditionalDate=. DaysOnMarket=.
Bedrooms=3 LandArea=. New=N Valuation=. ValuationYear=. _ERROR_=1 _N_=13
NOTE: Invalid data for UnconditionalDate in line 22 147-147.
22 Proteus Pl Greenlane Auckland City Auckland
89 630000 4/5/2005 4/30/2005 3 N 169
Roadname=Proteus Pl Suburb=Greenlane Townorcity= TLA=Auckland City District=Auckland ListPrice=.
SalePrice=630000 ListDate=16531 ArrementDate=16556 UnconditionalDate=. DaysOnMarket=. Bedrooms=3
LandArea=. New=N Valuation=. ValuationYear=. _ERROR_=1 _N_=22
NOTE: Invalid data for UnconditionalDate in line 37 155-155.
37 Grangewood Cl Mangere Manukau City Auckland
89 272000 272000 4/7/2005 4/30/2005 N 169
Roadname=Grangewood Cl Suburb=Mangere Townorcity= TLA=Manukau City District=Auckland
ListPrice=272000 SalePrice=272000 ListDate=16533 ArrementDate=16556 UnconditionalDate=.
DaysOnMarket=. Bedrooms=. LandArea=. New=N Valuation=. ValuationYear=. _ERROR_=1 _N_=37
NOTE: Invalid data for UnconditionalDate in line 69 152-152.
69 Edwards Rd Grey Lynn Auckland Auckland City Auckland
89 580000 580000 4/5/2005 5/31/2005 0 N 169
Roadname=Edwards Rd Suburb=Grey Lynn Townorcity=Auckland TLA=Auckland City District=Auckland
ListPrice=580000 SalePrice=580000 ListDate=16531 ArrementDate=16587 UnconditionalDate=.
DaysOnMarket=. Bedrooms=. LandArea=0 New=N Valuation=. ValuationYear=. _ERROR_=1 _N_=69
NOTE: Invalid data for UnconditionalDate in line 70 147-147.
70 Seascape Rd Remuera Auckland City Auckland
89 866000 4/13/2005 4/30/2005 3 839 N 169
Roadname=Seascape Rd Suburb=Remuera Townorcity= TLA=Auckland City District=Auckland ListPrice=.
SalePrice=866000 ListDate=16539 ArrementDate=16556 UnconditionalDate=. DaysOnMarket=. Bedrooms=3
LandArea=839 New=N Valuation=. ValuationYear=. _ERROR_=1 _N_=70
NOTE: Invalid data for UnconditionalDate in line 106 155-155.
106 Skinner Rd Mt Wellington Auckland City Auckland
89 282500 282500 4/19/2005 4/30/2005 N 169
Roadname=Skinner Rd Suburb=Mt Wellington Townorcity= TLA=Auckland City District=Auckland
ListPrice=282500 SalePrice=282500 ListDate=16545 ArrementDate=16556 UnconditionalDate=.
DaysOnMarket=. Bedrooms=. LandArea=. New=N Valuation=. ValuationYear=. _ERROR_=1 _N_=106
NOTE: Invalid data for UnconditionalDate in line 107 155-155.
107 Carter Pl Manurewa Manukau City Auckland
89 210000 210000 4/1/2005 4/30/2005 N 169
Roadname=Carter Pl Suburb=Manurewa Townorcity= TLA=Manukau City District=Auckland
ListPrice=210000 SalePrice=210000 ListDate=16527 ArrementDate=16556 UnconditionalDate=.
DaysOnMarket=. Bedrooms=. LandArea=. New=N Valuation=. ValuationYear=. _ERROR_=1 _N_=107
NOTE: Invalid data for UnconditionalDate in line 115 147-147.
115 Richbourne St Kingsland Auckland City Auckland
89 510000 4/20/2005 4/30/2005 3 169
Roadname=Richbourne St Suburb=Kingsland Townorcity= TLA=Auckland City District=Auckland
ListPrice=. SalePrice=510000 ListDate=16546 ArrementDate=16556 UnconditionalDate=. DaysOnMarket=.
Bedrooms=3 LandArea=. New= Valuation=. ValuationYear=. _ERROR_=1 _N_=115
NOTE: Invalid data for UnconditionalDate in line 116 150-152.
116 Giani Court Manurewa Manukau City Auckland
89 349000 4/20/2005 4/29/2005 400 Y 169
Roadname=Giani Court Suburb=Manurewa Townorcity= TLA=Manukau City District=Auckland ListPrice=.
SalePrice=349000 ListDate=16546 ArrementDate=16555 UnconditionalDate=. DaysOnMarket=. Bedrooms=.
LandArea=400 New=Y Valuation=. ValuationYear=. _ERROR_=1 _N_=116
NOTE: Invalid data for UnconditionalDate in line 117 150-152.
117 Giani Court Manurewa Manukau City Auckland
89 349000 4/20/2005 4/29/2005 400 N 169
Roadname=Giani Court Suburb=Manurewa Townorcity= TLA=Manukau City District=Auckland ListPrice=.
SalePrice=349000 ListDate=16546 ArrementDate=16555 UnconditionalDate=. DaysOnMarket=. Bedrooms=.
LandArea=400 New=N Valuation=. ValuationYear=. _ERROR_=1 _N_=117
NOTE: Invalid data for UnconditionalDate in line 142 147-147.
142 Dalry Pl Mangere Bridge Manukau City Auckland
89 390000 4/8/2005 4/30/2005 3 663 N 169
Roadname=Dalry Pl Suburb=Mangere Bridge Townorcity= TLA=Manukau City District=Auckland
ListPrice=. SalePrice=390000 ListDate=16534 ArrementDate=16556 UnconditionalDate=. DaysOnMarket=.
Bedrooms=3 LandArea=663 New=N Valuation=. ValuationYear=. _ERROR_=1 _N_=142
NOTE: Invalid data for UnconditionalDate in line 162 147-147.
162 Mt Pleasant Rd Mt Eden Auckland Auckland City Auckland
89 760000 4/8/2005 4/30/2005 3 506 N 655000 2002 169
Roadname=Mt Pleasant Rd Suburb=Mt Eden Townorcity=Auckland TLA=Auckland City District=Auckland
ListPrice=. SalePrice=760000 ListDate=16534 ArrementDate=16556 UnconditionalDate=. DaysOnMarket=.
Bedrooms=3 LandArea=506 New=N Valuation=655000 ValuationYear=2002 _ERROR_=1 _N_=162
NOTE: Invalid data for UnconditionalDate in line 163 155-155.
163 Walters Rd Mt Wellington Auckland City Auckland
89 295000 295000 4/23/2005 4/30/2005 N 169
Roadname=Walters Rd Suburb=Mt Wellington Townorcity= TLA=Auckland City District=Auckland
ListPrice=295000 SalePrice=295000 ListDate=16549 ArrementDate=16556 UnconditionalDate=.
DaysOnMarket=. Bedrooms=. LandArea=. New=N Valuation=. ValuationYear=. _ERROR_=1 _N_=163
NOTE: Invalid data for UnconditionalDate in line 194 147-147.
194 Taupata St Mt Eden Auckland Auckland City Auckland
89 540500 4/11/2005 4/30/2005 4 599 N 169
Roadname=Taupata St Suburb=Mt Eden Townorcity=Auckland TLA=Auckland City District=Auckland
ListPrice=. SalePrice=540500 ListDate=16537 ArrementDate=16556 UnconditionalDate=. DaysOnMarket=.
Bedrooms=4 LandArea=599 New=N Valuation=. ValuationYear=. _ERROR_=1 _N_=194
NOTE: Invalid data for UnconditionalDate in line 222 155-155.
222 Putini Rd Mangere Bridge Manukau City Auckland
89 330000 330000 4/30/2005 5/31/2005 N 169
Roadname=Putini Rd Suburb=Mangere Bridge Townorcity= TLA=Manukau City District=Auckland
ListPrice=330000 SalePrice=330000 ListDate=16556 ArrementDate=16587 UnconditionalDate=.
DaysOnMarket=. Bedrooms=. LandArea=. New=N Valuation=. ValuationYear=. _ERROR_=1 _N_=222
NOTE: Invalid data for UnconditionalDate in line 241 152-152.
241 Dempsey St Remuera Auckland Auckland City Auckland
89 422500 4/12/2005 4/30/2005 0 N 169
Roadname=Dempsey St Suburb=Remuera Townorcity=Auckland TLA=Auckland City District=Auckland
ListPrice=. SalePrice=422500 ListDate=16538 ArrementDate=16556 UnconditionalDate=. DaysOnMarket=.
Bedrooms=. LandArea=0 New=N Valuation=. ValuationYear=. _ERROR_=1 _N_=241
NOTE: Invalid data for UnconditionalDate in line 247 147-147.
247 Whitford Rd Botany Downs Manukau City Auckland
89 465000 452500 4/30/2005 5/9/2005 5 911 N 260000 169
Roadname=Whitford Rd Suburb=Botany Downs Townorcity= TLA=Manukau City District=Auckland
ListPrice=465000 SalePrice=452500 ListDate=16556 ArrementDate=16565 UnconditionalDate=.
DaysOnMarket=. Bedrooms=5 LandArea=911 New=N Valuation=260000 ValuationYear=. _ERROR_=1 _N_=247
NOTE: Invalid data for UnconditionalDate in line 257 147-147.
257 Epsom Ave Epsom Auckland Auckland City Auckland
89 1277000 4/1/2005 4/30/2005 4 1025 N 169
Roadname=Epsom Ave Suburb=Epsom Townorcity=Auckland TLA=Auckland City District=Auckland
ListPrice=. SalePrice=1277000 ListDate=16527 ArrementDate=16556 UnconditionalDate=. DaysOnMarket=.
Bedrooms=4 LandArea=1025 New=N Valuation=. ValuationYear=. _ERROR_=1 _N_=257
NOTE: Invalid data for UnconditionalDate in line 280 152-152.
280 Arney Cres Remuera Auckland Auckland City Auckland
89 2300000 4/30/2005 4/30/2005 0 N 169
Roadname=Arney Cres Suburb=Remuera Townorcity=Auckland TLA=Auckland City District=Auckland
ListPrice=. SalePrice=2300000 ListDate=16556 ArrementDate=16556 UnconditionalDate=. DaysOnMarket=.
Bedrooms=. LandArea=0 New=N Valuation=. ValuationYear=. _ERROR_=1 _N_=280
NOTE: Invalid data for UnconditionalDate in line 281 155-155.
WARNING: Limit set by ERRORS= option reached. Further errors of this type will not be printed.
281 Merwan St Mangere Manukau City Auckland
89 270000 270000 4/22/2005 4/30/2005 N 169
Roadname=Merwan St Suburb=Mangere Townorcity= TLA=Manukau City District=Auckland ListPrice=270000
SalePrice=270000 ListDate=16548 ArrementDate=16556 UnconditionalDate=. DaysOnMarket=. Bedrooms=.
LandArea=. New=N Valuation=. ValuationYear=. _ERROR_=1 _N_=281
NOTE: 704 records were read from the infile 'H:\stats301\A2\ResidentialSalesAuckland.dat'.
The minimum record length was 169.
The maximum record length was 175.
NOTE: The data set WORK.RESIDENTIALSALESAUCKLAND has 704 observations and 16 variables.
NOTE: DATA statement used (Total process time):
real time 0.04 seconds
cpu time 0.04 seconds


459
460 proc print data=ResidentialSalesAuckland (obs=20);
461
462 title 'residential';
463 format ListDate date9. ArrementDate date9. UnconditionalDate date9.;
464 run;

NOTE: There were 20 observations read from the data set WORK.RESIDENTIALSALESAUCKLAND.
NOTE: PROCEDURE PRINT used (Total process time):
real time 0.17 seconds
cpu time 0.01 seconds

 

 

2. And then when i put  'if ListPrice =. then delete;' function to get rid of rows with missing value. And now it gave me 0 observartion :/

 

Here is code:

data ResidentialSalesAuckland;
infile 'H:\stats301\A2\ResidentialSalesAuckland.dat';

if ListPrice =. then delete;

input Roadname $27. Suburb $28-55 Townorcity $56-65 TLA $66-80 District $81-90
ListPrice 91-99 SalePrice 100-107 @108 ListDate ArrementDate UnconditionalDate @139 DaysOnMarket 7.
Bedrooms 147 LandArea 6. New $2. Valuation 157-165 ValuationYear 4.;

Informat Listdate mmddyy10. ArrementDate mmddyy10. UnconditionalDate mmddyy10.;
run;

proc print data=ResidentialSalesAuckland (obs=20);

title 'residential';
format ListDate date9. ArrementDate date9. UnconditionalDate date9.;
run;

And the error:

 

*Question4;
467 options yearcutoff=1900 nodate;
468
469 data ResidentialSalesAuckland;
470 infile 'H:\stats301\A2\ResidentialSalesAuckland.dat';
471
472 if ListPrice =. then delete;
473
474 input Roadname $27. Suburb $28-55 Townorcity $56-65 TLA $66-80 District $81-90
475 ListPrice 91-99 SalePrice 100-107 @108 ListDate ArrementDate UnconditionalDate @139
475! DaysOnMarket 7.
476 Bedrooms 147 LandArea 6. New $2. Valuation 157-165 ValuationYear 4.;
477
478 Informat Listdate mmddyy10. ArrementDate mmddyy10. UnconditionalDate mmddyy10.;
479 run;

NOTE: The infile 'H:\stats301\A2\ResidentialSalesAuckland.dat' is:
Filename=H:\stats301\A2\ResidentialSalesAuckland.dat,
RECFM=V,LRECL=32767,File Size (bytes)=120390,
Last Modified=17 August 2017 10:29:46,
Create Time=17 August 2017 10:29:45

NOTE: DATA STEP stopped due to looping.
NOTE: 0 records were read from the infile 'H:\stats301\A2\ResidentialSalesAuckland.dat'.
NOTE: The data set WORK.RESIDENTIALSALESAUCKLAND has 0 observations and 16 variables.
NOTE: DATA statement used (Total process time):
real time 0.02 seconds
cpu time 0.01 seconds


480
481 proc print data=ResidentialSalesAuckland (obs=20);
482
483 title 'residential';
484 format ListDate date9. ArrementDate date9. UnconditionalDate date9.;
485 run;

NOTE: No observations in data set WORK.RESIDENTIALSALESAUCKLAND.
NOTE: PROCEDURE PRINT used (Total process time):
real time 0.00 seconds
cpu time 0.00 seconds

 

 

 

What's the problem from the original code and What do i have to do if i want to get rid of rows with missing values??

 

I will attach file how the first output looked like

(just to make it sure, the data is not 'txt.'file!! it's dat. file but since i couldn't upload this file directly on this post, so 

i changed it into txt. )

 

 

 

Super User
Posts: 23,663

Re: deleting row with missing observation

You have to read your files correctly FIRST. Garbage in, garbage out.
Contributor
Posts: 21

Re: deleting row with missing observation

That's why i asked question...!
What's the problem? And how can i fix this?
Why they can't read 'UnconditionalDate'?

Super User
Posts: 23,663

Re: deleting row with missing observation

And like I said last time, attach your file if you want help reading it otherwise we're running around in circles.
Valued Guide
Posts: 589

Re: deleting row with missing observation

Hi,

 

I suggest you to have delimiter in your ".dat" file as "," if possible, because it has many blanks and there is an issue with reading the file if you have consecutive blanks.

If you can have delimiter as "," then you can use "DSD" to read the data if there is consecutive missing values.

Thanks,
Suryakiran
Valued Guide
Posts: 589

Re: deleting row with missing observation

Hi @glee217

 

What happening actually in your code is in row 13 you have missing value for ListDate and values for ArrementDate, UnconditionalDate. When SAS is reading the variables in INPUT its is not taking missing value for ListDate, it is taking ArrementDate value. So, here it is ignoring the missing values. That's why it's always better to have a delimiter.

If you can't have a delimiter for the ".DAT" file then read the dates as character with appropriate column pointer and then convert them to dates using INPUT function.

If you want the IF statement to work then you must have the statement after the data is in PDV (Memory). You can use double trailing ("@@") to hold the data and check for a condition and then output it.

 

See my code for your reference:

data ResidentialSalesAuckland(Drop=Listdate_ ArrementDate_ UnconditionalDate_);
infile 'E:\SAS_Sandbox\SAS Community\ResidentialSalesAuckland.txt'  ;
format Listdate mmddyy10. ArrementDate mmddyy10. UnconditionalDate mmddyy10.;
input Roadname & $27. Suburb $28-55 Townorcity $56-65 TLA $66-80 District $81-90
ListPrice 91-99 SalePrice 100-107  ListDate_ $108-118  ArrementDate_ $119-128  UnconditionalDate_ $129-138  @139 DaysOnMarket 7.
Bedrooms 147 LandArea 6. New $2. Valuation 157-165 ValuationYear 4. @@;
if MISSING(ListPrice) then delete;
ListDate=INPUT(ListDate_,mmddyy10.);
ArrementDate=INPUT(ArrementDate_,mmddyy10.);
UnconditionalDate=INPUT(UnconditionalDate_,mmddyy10.);
run;
Thanks,
Suryakiran
Contributor
Posts: 21

Re: deleting row with missing observation

Posted in reply to SuryaKiran
Hey Thank you so much!!
Trusted Advisor
Posts: 1,837

Re: deleting row with missing observation

@glee217, I see some issues in your first step:

   

data ResidentialSalesAuckland;
infile 'H:\stats301\A2\ResidentialSalesAuckland.dat';

if ListPrice =. then delete;

input Roadname $27. Suburb $28-55 Townorcity $56-65 TLA $66-80 District $81-90
ListPrice 91-99 SalePrice 100-107 @108 ListDate ArrementDate UnconditionalDate @139 DaysOnMarket 7.
Bedrooms 147 LandArea 6. New $2. Valuation 157-165 ValuationYear 4.;

Informat Listdate mmddyy10. ArrementDate mmddyy10. UnconditionalDate mmddyy10.;
run;

 

(1) Your statements are not in the right order. You are checking LISTPRICE before you read it.
     that may cose geting 0 observations as result.

     Generally I prefer, though it is not a must, to put the INFORMAT statement before reading.

     Thus the code should be:

   

data ResidentialSalesAuckland;
infile 'H:\stats301\A2\ResidentialSalesAuckland.dat';

Informat Listdate mmddyy10. ArrementDate mmddyy10. UnconditionalDate mmddyy10.;

input Roadname $27. Suburb $28-55 Townorcity $56-65 TLA $66-80 District $81-90
ListPrice 91-99 SalePrice 100-107 @108 ListDate ArrementDate UnconditionalDate @139 DaysOnMarket 7.
Bedrooms 147 LandArea 6. New $2. Valuation 157-165 ValuationYear 4.;

if ListPrice =. then delete;

run;

 

(2) You are reading 3 dates in :

     

@108 ListDate ArrementDate UnconditionalDate

    each date informat is defined as mmddyy10. , but looking at the dump of your input row, in the log, there are two dates only.

    That is the reason you have a lot of messages saying: NOTE: Invalid data for UnconditionalDate in line 

    as your input, starting at position 108, is like: 4/29/2005 5/16/2005 3 N 169   (info was taken from row 13)

 

(3) Are you sure that your input statemant fits the data format ?

   

input Roadname   $27. 
      Suburb     $28-55 
	  Townorcity $56-65 
	  TLA        $66-80 
	  District   $81-90
          ListPrice   91-99 
	  SalePrice  100-107 
	  @108 ListDate ArrementDate UnconditionalDate 
	  @139 DaysOnMarket 7.
          Bedrooms   147 
	  LandArea    6. 
	  New        $2. 
	  Valuation  157-165 
	  ValuationYear 4.
  ;

     

 

 

 

 

Super User
Posts: 13,498

Re: deleting row with missing observation

This seems to work for me:

 

filename toread 'd:\ResidentialSalesAuckland.txt'; 
data ResidentialSalesAuckland;
   infile toread;
   Informat Listdate mmddyy10. ArrementDate mmddyy10. UnconditionalDate mmddyy10.;
   format Listdate mmddyy10. ArrementDate mmddyy10. UnconditionalDate mmddyy10.;


   input 
      Roadname $27. 
      Suburb $28-55 
      Townorcity $56-65 
      TLA $66-80 
      District $81-90
      ListPrice 91-99 
      SalePrice 100-107 
      @108 ListDate 
      @118 ArrementDate 
      @128 UnconditionalDate 
      @139 DaysOnMarket 7.
      Bedrooms 147 
      LandArea 6. 
      New $2. 
      Valuation 157-165 
      ValuationYear 4.
   ;
 
run;

The filename is a style choice. The issues are coming from not providing example data earlier. It appears that your dat is fixed column and so the change to list for the date varaibles caused and issue when the dates for one of the variables was missing. So the second date value was read into the first, the third date value into the second and the next thing encountered wer the values for Bedrooms and landarea with and hence not mmddyy10. data.

 

Ask a Question
Discussion stats
  • 8 replies
  • 459 views
  • 0 likes
  • 5 in conversation