BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
Sandeep77
Lapis Lazuli | Level 10

Hi all,

I am trying to find the missing information using case when statement. But when there are more than one missing information, it shows only the first missing information. For e.g. if  name and date of birth is missing, it is showing only the name. I want it to show all the missing information like if name, dob, address, postcode is missing then it should show all the four in the output. Is it possible or should I do each information separately?

Here is the sample dataset:

Data Missing_info;
infile cards expandtabs;
input debt_code ad_title $ ad_inits $ ad_name $ ad_dob ad_address1 $ ad_address2 $ ad_address3 $ ad_address4 $ ad_address5 $ ad_postcode $ ad_type $
;
datalines ;
431982404 Mr Michael Esposito . 2St.AubynsCottages ColeLaneOckley Dorking Surrey . RH55SX DB
426336236 Miss Joanne O'brien . DeerePlace Cardiff . . . CF54NL DB
418126603 Mrs Sarah Weston . 1FalaVillage . . . . EH375SY DB
433349933 . Hmp Park . Gloucestershire . . . . . DB
422054429 Mrs . McCormick . 52FoxStreet Stockport Cheshire . . SK39JY DB
;
run;

Proc sql;
create table Missing_info as 
select *,
case when  missing(ad_inits) then 'Missing_firstname'
when  missing(ad_name) then 'Missing_Surname'
when  missing(ad_DOB) then 'Missing_Dateofbirth'
when  missing('ad_address##1'n) then 'Missing_Address_line1'
when  missing('ad_address##2'n) then 'Missing_Address_line2'
when  missing('ad_address##3'n) then 'Missing_Address_line3'
when  missing(ad_postcode) then 'Missing_PostCode'
else 'Not_Missing'
end as Missing_status
from Full_ad_details;
quit;
1 ACCEPTED SOLUTION

Accepted Solutions
Tom
Super User Tom
Super User

No.  But CATX() can.

create table Missing_info as 
select *
     , catx(' '
       , case when missing(ad_inits) then 'Missing_firstname' else ' ' end
       , case when missing(ad_name) then 'Missing_Surname' else ' ' end
       , case when missing(ad_DOB) then 'Missing_Dateofbirth' else ' ' end
       , case when missing('ad_address##1'n) then 'Missing_Address_line1' else ' ' end
       , case when missing('ad_address##2'n) then 'Missing_Address_line2' else ' ' end
       , case when missing('ad_address##3'n) then 'Missing_Address_line3' else ' ' end
       , case when missing(ad_postcode) then 'Missing_PostCode' else ' ' end
       , case when 0=cmiss(ad_inits,ad_name,ad_DOB,ad_postcode,'ad_address##1'n,'ad_address##2'n,'ad_address##3'n)
              then 'Not_Missing' else ' ' end
       ) as Missing_status
from Full_ad_details
;
quit;

View solution in original post

3 REPLIES 3
AMSAS
SAS Super FREQ

Not sure if you can do it from SQL but here's some data step that handles it using arrays:

/* Create Sample Data */
Data have;
infile cards expandtabs;
input debt_code ad_title $ ad_inits $ ad_name $ ad_dob ad_address1 $ ad_address2 $ ad_address3 $ ad_address4 $ ad_address5 $ ad_postcode $ ad_type $
;
datalines ;
431982404 Mr Michael Esposito . 2St.AubynsCottages ColeLaneOckley Dorking Surrey . RH55SX DB
426336236 Miss Joanne O'brien . DeerePlace Cardiff . . . CF54NL DB
418126603 Mrs Sarah Weston . 1FalaVillage . . . . EH375SY DB
433349933 . Hmp Park . Gloucestershire . . . . . DB
422054429 Mrs . McCormick . 52FoxStreet Stockport Cheshire . . SK39JY DB
;
run;

/* Debugging option */
options symbolgen ;

/* Set the lenght of the error msssage variable */
/* Do this to handle when the addative error messages exceed the length of the error message variable */
%let errorMsgLength=50 ;

data want ;
	/* assing length of errormsg variable */
	length 
		errormsg $&errorMsgLength ;
	set have ;
	/* create array to hold error messages */
	array msg(8) $ ("Initials" "Surname" "Address1" "Address2" "Address3" "Address4" "Address5" "PostCode") ;
	/* create arrary to hold character input variables that you want to check */
	/* note you will need to do something similar for numeric variables */
	array values(8) $ ad_inits ad_name ad_address1-ad_address5 ad_postcode ;
	/* Assign iniital value to errorMsg */
	errormsg="Missing" ;
	/* Flag to capture when more errors are added to errorMsg than it can handle */
	additionalErrorsFlag="N" ;
	/* text to insert into errorMsg when it exceeds the maximum it can handle */
	exceedErrorMsg="*** ERRORS EXCEEDED ***" ;
	/* Loop through array */
	do i=1 to dim(values) ;
		/* Check if element is empty */
		if values(i)="" then do ;
			/* Check if adding an error msg will exceeed the lenght of errorMsg */
			if length(errormsg)+length(msg(i))>&errorMsgLength then do ;
				/* bump i so we drop out of the loop early */
				/* no point in checking additional fields as the errorMsg is now full */
				i=dim(msg) ;
				/* set the additional errors flag */
				additionalErrorsFlag="Y" ;
				/* overwrite the end of the error message with a note indicating additional errors not listed */
				substr(errormsg,&errorMsgLength-length(exceedErrorMsg)) = exceedErrorMsg ;
			end ;
			/* if errorMsg has room for addtional error then add it to the end */
			else do ;
				errormsg=cats(errormsg,"-",msg(i)) ;
			end ;
		end ;
	end ;
run ;
PaigeMiller
Diamond | Level 26

In this simplified example, I only check to see if two variables are missing. Since you have more than 2 variables, you can modify the SQL to handle all of your variables.

 

proc sql;
create table Missing_info as 
select *,
case when missing(ad_address4) then 'Address4' else ' ' end as missing_address4,
case when missing(ad_inits) then 'Firstname' else ' ' end as missing_inits,
catx(' ',calculated missing_inits, calculated missing_address4) as which_missing length=200
from Full_ad_details;
quit;

 

 

I question why such a text string is needed, as it seems as if there would be easier ways to account for which variables are missing.

--
Paige Miller
Tom
Super User Tom
Super User

No.  But CATX() can.

create table Missing_info as 
select *
     , catx(' '
       , case when missing(ad_inits) then 'Missing_firstname' else ' ' end
       , case when missing(ad_name) then 'Missing_Surname' else ' ' end
       , case when missing(ad_DOB) then 'Missing_Dateofbirth' else ' ' end
       , case when missing('ad_address##1'n) then 'Missing_Address_line1' else ' ' end
       , case when missing('ad_address##2'n) then 'Missing_Address_line2' else ' ' end
       , case when missing('ad_address##3'n) then 'Missing_Address_line3' else ' ' end
       , case when missing(ad_postcode) then 'Missing_PostCode' else ' ' end
       , case when 0=cmiss(ad_inits,ad_name,ad_DOB,ad_postcode,'ad_address##1'n,'ad_address##2'n,'ad_address##3'n)
              then 'Not_Missing' else ' ' end
       ) as Missing_status
from Full_ad_details
;
quit;

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
Mastering the WHERE Clause in PROC SQL

SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 3 replies
  • 459 views
  • 2 likes
  • 4 in conversation