SAS Programming

DATA Step, Macro, Functions and more
BookmarkSubscribeRSS Feed
David_Billa
Rhodochrosite | Level 12

Assume I'm reading a .csv file to create one dataset. It has variables like ID,Sex and Age. Now I want to perform checks like ID cannot be missing, Sex should not be other than 'M' or 'F' and Age value should not be character. In case if the CSV file has data which doesn't comply with the above rules then I've to capture that record into another dataset.

 

Assume I've the CSV file like below.

 

IDSexAge
1M21
.F         23
        23MH

 

In the above example, second and third record doesn't comply with the above said rules, so I want to capture those records in WANT dataset as follows. Data in faulty record should be seperated by spaces.

 

Err_MessageFaulty_Record
ID variable cannot be missing. F 23
Age variable cannot be character23 M H

 

Can anyone provide me some guidance on this exercise?

3 REPLIES 3
maguiremq
SAS Super FREQ

Are you looking for a data set or a report? I'm pretty sure that SAS would import the age column as a character variable if it detects an "H" in it, so your second error message wouldn't occur. I imported a .csv with the same data and that's what happened--SAS imported the field as character.

 

Other than that, you can do this?

 

data have;
input id $ sex $ age $;
datalines;
1 M 21
. F 23
23 M H
;
run;

data want;
set have;
if id = " " then do;
	err_message = "ID Variable cannot be missing";
end;
run;

proc print data=want;
where err_message ^= " ";
run;

 This obviously provides both a data set and a report.

andreas_lds
Jade | Level 19

What should happen if a record has more than one error? The following code writes one obs for each error:

 

filename bob "%sysfunc(pathname(work))\test.csv";

data _null_;
   file bob;
   infile datalines;
   input;
   put _infile_;
   datalines;
ID,Sex,Age
1,M,21
.,F,23
23,M,H
;
run;


data work.cleandata(keep=Id Sex Age)
   work.errors(keep=Err_Message Faulty_Record)
;
   length 
      Id 8 Sex $ 1 Age 8
      Err_Message $ 100
      Faulty_Record $ 250
   ;

   infile bob firstobs=2;
   input;

   Id = input(scan(_infile_, 1, ','), ?? best.);
   Sex = scan(_infile_, 2, ',');
   Age = input(scan(_infile_, 3, ','), ?? best.);

   if missing(Id) then do;
      Faulty_Record = _infile_;
      Err_Message = 'ID variable cannot be missing';
      output work.errors; 
   end;

   if Sex not in ('M', 'F') then do;
      
   end;

   if missing(Age) then do;
      Faulty_Record = _infile_;
      Err_Message = 'Age is not numeric';
      output work.errors;
   end;

   if missing(Faulty_Record) then do;
      output work.cleandata;
   end;
run;
ballardw
Super User

I might suggest incorporating the line number of the file read into the error record data set as well. Usually the automatic variable _n_ helps provide that.

sas-innovate-wordmark-2025-midnight.png

Register Today!

Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.


Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 3 replies
  • 798 views
  • 2 likes
  • 4 in conversation