urban58
Quartz | Level 8

I have a numeric variable, BLANK, that is supposed to be missing if there is data for a particular TEST; otherwise BLANK should hold a code for why the test was not done. There are many cases where BLANK is missing and there is also no data for the test.

 

A tiny fragment of my data (there are hundreds of variables):

id BLANK ced abc def olp fft aaa_right fyu_right rst_right aaa_left fyu_left rst_left
1
2 2 6 2 4 1 2.855 1.615 1.91 2.085 2.195 4.11
3 7 2 3 2 1 3.87 3.075 2.035 1.265 8.425 2.28
4 2 0 0 2.76 2.535 3.13 1.65 2.31 3.515
5 1
6 1 0 1 1 1 2.05 2.99 1.655 3.425 1.97 5.015
7 1 3 2 3 1 1.265 0.825 2.43 6.015 7.945 3.29
8
9 3 7 2 6 0 1.65 2.165 6.305 3.075 2.9
;

I have two questions I would appreciate help with:
1. How can I format missing for the BLANK variable? I tried the format below with no success.

value tt
. =0:Test done
1 =1:Refusal
2 =2:Split exam
3 =3:xxxx
4 =4:Proxy
5 =5:Other;

 

2. To try and tease out why BLANK is missing when there is no test data, is it possible to combine BLANK with other variables? The test has questionnaire and physical test components, both of which have a bunch of variables.

 

missq =cmiss(of ced -- fft);
if missq=0 then QUEST_nd=0; else
if missq=(1:5) then QUEST_nd=1; else
if missq=5 then QUEST_nd=2; else;


if missp =cmiss(of aaa_right -- rst_left);
if missp =0 then PHY_nd=0; else
if missp=(1:5) then PHY_nd=1; else
if missp =6 then PHY_nd=2;


How do I combine QUEST_nd/PHY_nd and BLANK to get a variable and format that I can use to find out if I have test data or not?

Thanks in advance,
Margaret

2 REPLIES
jimbarbour
Meteorite | Level 14

@urban58,

 

Formats are great, but an alternative might be to count how many vars are populated, as in the code below. You can set the threshold any way you like; just as an example, I set it to 2. In other words, if more than two vars (counting id) are populated, then you have data.

 

Pardon me if you already know this, but:

I'm using a range to specify the vars. It doesn't matter how many vars there are, so long as they are contiguous in the data set and all numeric. All I have to do is specify the first and the last with two dashes in between. If the vars are not all contiguous, I can specify several groups of contiguous vars like this: N(of A -- B  F -- T  V -- Z);

 

Results are shown below the code.

 

Jim

 

DATA	Test_Data;
	INFILE	Datalines	MISSOVER;
	INPUT
	id BLANK ced abc def olp fft aaa_right fyu_right rst_right aaa_left fyu_left rst_left
	;
	
	IF	N(of id -- rst_left)	>	2	THEN
		I_Have_Data				=	1;
	ELSE
		I_Have_Data				=	0;

Datalines;
1
2 2 6 2 4 1 2.855 1.615 1.91 2.085 2.195 4.11
3 7 2 3 2 1 3.87 3.075 2.035 1.265 8.425 2.28
4 2 0 0 2.76 2.535 3.13 1.65 2.31 3.515
5 1
6 1 0 1 1 1 2.05 2.99 1.655 3.425 1.97 5.015
7 1 3 2 3 1 1.265 0.825 2.43 6.015 7.945 3.29
8
9 3 7 2 6 0 1.65 2.165 6.305 3.075 2.9
;
RUN;

Results:

[Screenshot of the output: jimbarbour_0-1601665337470.png]

 

ballardw
Super User

How exactly did you "test" your format?

Admittedly I don't like skipping the quotes around value strings just in case. This will display the values as formatted:

proc format ;

value tt
. ="0:Test done"
1 ="1:Refusal"
2 ="2:Split exam"
3 ="3:xxxx"
4 ="4:Proxy"
5 ="5:Other"
;
run;

data example;
   do value=.,1 to 5 ;
   output;
   end;
run;

proc print data=example;
   var value;
   format value tt.;
run;

If you use a procedure like Proc Freq to see how many of each value there are, including the formatted value for missing, you need to include the /missing option on the tables statement.

You do have to make sure the format is available in the current session as well.
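As a rough sketch of the Proc Freq point, the step below reuses the tt format and the example data set from above and adds the /missing option so that the missing value is counted and shown with its formatted label. Nothing here is specific to the real data; it only illustrates the option.

```sas
/* Define the format in the current session so it is available */
proc format;
   value tt
      . = "0:Test done"
      1 = "1:Refusal"
      2 = "2:Split exam"
      3 = "3:xxxx"
      4 = "4:Proxy"
      5 = "5:Other"
   ;
run;

/* One row per value, including a missing value */
data example;
   do value = ., 1 to 5;
      output;
   end;
run;

/* MISSING makes the missing level appear as a row in the table */
proc freq data=example;
   tables value / missing;
   format value tt.;
run;
```

Without /missing, the missing level is left out of the table body and only reported as a frequency-missing count underneath it.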

 

 


@urban58 wrote:

 

2. To try and tease out why BLANK is missing and no test data, is it possible to combine BLANK with other variables (the test has questionnaire and physical test
components), both of which have a bunch of variables

 

missq =cmiss(of ced -- fft);
if missq=0 then QUEST_nd=0; else
if missq=(1:5) then QUEST_nd=1; else
if missq=5 then QUEST_nd=2; else;


if missp =cmiss(of aaa_right -- rst_left);
if missp =0 then PHY_nd=0; else
if missp=(1:5) then PHY_nd=1; else
if missp =6 then PHY_nd=2;


How do I combine QUEST_nd/PHY_nd and BLANK to get a variable and format that I use to find out if I have test data or not?

Thanks in advance,
Margaret


I think you need to provide an example with values and what you expect/want this snippet to tell you. Since the

if missq=(1:5)

throws errors, I am going to guess that you intend to use the IN operator to check if MISSQ is one of the integer values 1 through 5 as:

if missq in (1:5)

But that would also be true for missq=5, so in attempting

if missq in (1:5) then QUEST_nd=1; else
if missq=5 then QUEST_nd=2  <= would never happen because it was set to 1 before the else.
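One possible way to restructure the chain, assuming the intent is 0 = all five questionnaire items present, 1 = some missing, 2 = all missing (that coding is a guess, not something confirmed in the question). The three data rows are made up purely for illustration and use only the five questionnaire variables.

```sas
data quest_check;
   infile datalines missover;
   input id ced abc def olp fft;

   /* Number of missing values among the five questionnaire items */
   missq = cmiss(of ced -- fft);

   if missq = 0 then QUEST_nd = 0;             /* assumed: all items present */
   else if missq in (1:4) then QUEST_nd = 1;   /* assumed: partially missing */
   else if missq = 5 then QUEST_nd = 2;        /* assumed: all items missing */
datalines;
1 . . . . .
2 6 2 4 1 2
3 2 . 2 1 .
;
run;

proc print data=quest_check;
run;
```

Because the partial range stops at 4, the last branch can be reached when all five items are missing. The same pattern would apply to the six physical test variables with the range (1:5) and a final check for 6.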

 
