BookmarkSubscribeRSS Feed
J_Moose
Calcite | Level 5

Hello,

 

I am stuck on step 3 of the following problem on a practice exam:

 

    • Step 1:
      • create a temporary data set, cleandata36.
      • In this data set, convert all group values to upper case.
      • Then keep only observations with group equal to 'A' or 'B'.
    • Step 2:
      • Determine the MEDIAN value for the Kilograms variable for each group (A,B) in the cleandata36 data set. Round MEDIAN to the nearest whole number.
    • Step 3:
      • create results.output36 from cleandata36
      • Ensure that all values for variable Kilograms are between 40 and 200, inclusively.
      • If the value is missing or out of range, replace the value with the MEDIAN Kilograms value for the respective group (A,B) calculated in step 2.

I was able to solve the first two steps, however, I am unable to solve the third based on the medians I have found. This is the code I have thus far:

 

Data cleandata36;
set cert.input36;

Group=upcase(Group);

if Group='A' or Group='B' then output;

Run;

Proc means data= cleandata36 MEDIAN maxdec=0;
class group;
var Kilograms;

Run;

Data results.output36;
set cleandata36;

 

Any help is appreciated, thank you!

6 REPLIES 6
Jagadishkatam
Amethyst | Level 16

please try the proc stdsize procedure, here the missing values within the group will be replaced with the method specified. Since you have the method as median, the missing values within the group will be replaced with the median of that group as we used by statement as group. The reponly option does help in replacing only the missing values with median. A very good procedure perfect for your requirement and it helps you avoid the step2.

 

PS it is an untested code as i am not sure of your, so please test and let me know if it is working. Per your response I am assuming that we need to keep only kilograms between 40 and 200 inclusive and then get the median value and then replace the missing values with median.

 

data nonmissing missing;
set cleandata36;
if 40<=kilograms<=200 then output nonmissing;
if missing(kilograms) then output missing;
run;

data cleandata36_2;
set nonmissing missing;
run;

proc sort data=cleandata36_2;
by group;
run;

proc stdize data=cleandata36_2 out=results.output36 method=median reponly;
var kilograms;
by group;
run;

Thanks,
Jag
himself
Pyrite | Level 9
Hi @Jagadishkatam, i have never come across proc stdize, Thank for sharing
PaigeMiller
Diamond | Level 26

Excellent suggestion from @Jagadishkatam .  It doesn't get much simpler!

--
Paige Miller
ed_sas_member
Meteorite | Level 14

Hi @J_Moose 

I think you could add an output statement in the PROC MEANS as follows:
 
proc means data=cleandata36 median maxdec=0;
var Kilograms;
class group;
ways 1;
output out=cleandata36_median (drop=_:) median=median; /*create a dataset that contains the median for each group*/
run;
Then you could merge your two datasets as follows to retrieve the value of the median for each observation according to its group, and then be able to input its value to "kilograms" when the condition is met.
 
proc sort data=cleandata36; by group; run;

data results.output36;
merge cleandata36 cleandata36_median;
by group;
if Kilograms < 40 or Kilograms > 200 then Kilograms = median;
run;
Another possibility would be to hardcode the value of the median:
if Kilograms < 40 or Kilograms > 200 AND group="A" then Kilograms=xxx;
else Kilograms < 40 or Kilograms > 200 AND group="B" then Kilograms=yyy;
 
But it is definitely a bad habit, and makes your code less "dynamic".
 
Best,
Akash7210
Calcite | Level 5

data result.output36;
set cleandata36;
if group='A' then do;
if kilograms LT 40 or kilograms GT 200 then kilograms="79";
end;
else if group='B' then do;
if kilograms LT 40 or kilograms GT 200 then kilograms='89';
end;
run;
proc print data=result.output36;
run;

simhayagnam
Calcite | Level 5

here is the code may helpful

data work.cleandata36;

set cert.input36;

group=upcase(group);

if group in ('A','B');

run;

 

proc means data=work.cleandata36 median;

class group; var kilograms;

run;

 

data results.output36;

set cleandata36;

if Kilograms < 40 or Kilograms > 200 then do;

if group='A' then kilograms=79;

else kilograms=89;

end;

run;

 

proc contents data=results.output36; run;

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 6 replies
  • 3174 views
  • 4 likes
  • 7 in conversation