BookmarkSubscribeRSS Feed
molla
Fluorite | Level 6

I have the following data
code price share
------------------
AMGN 67.66 100
DELL 24.60 200
GE 34.50 100
HPQ 32.32 120
IBM 82.25 50
MOT 30.24 100

I need to calculate the average of the price and share in data step without using procedures

AMGN 67.66 100
DELL 24.60 200
GE 34.50 100
HPQ 32.32 120
IBM 82.25 50
MOT 30.24 100
45.26 111.6

8 REPLIES 8
RW9
Diamond | Level 26 RW9
Diamond | Level 26

Is this another "interview" question of which we are seeing several recently?  If so its ridiculous, perhaps you should not use your fingers to type in, or use words to communicate.  

As for datastep:

data want;
  set have end=last;
  retain cnt tot_price tot_share;
  cnt=sum(cnt,1);
  tot_price=sum(tot_price,price);
  tot_share=sum(tot_share,share);
  if last then do;
    mean_price=tot_price/cnt;
    mean_share=tot_share/cnt;
    output;
  end;
run;

However, maybe you would be better not using SAS at all, get yourself an assembler and do it that way?

PeterClemmensen
Tourmaline | Level 20

Like this? 🙂

 

data have;
input code $ price share;
infile datalines;
datalines;
AMGN 67.66 100
DELL 24.60 200
GE 34.50 100
HPQ 32.32 120
IBM 82.25 50
MOT 30.24 100
;

data want;
   set have end = eof;
   cumprice = cumprice + price;
   cumshare = cumshare + share;

   if eof = 1 then do;
      meanprice = cumprice / _N_;
      meanshare = cumshare / _N_;
   end;

   retain cumprice cumshare 0;
   drop cumprice cumshare;
run;
Shmuel
Garnet | Level 18

data _NULL_;

 set have end=eof; 

      retain tot1 tot2 count;

      tot1+price;

      tot2+share;

      count+;

    if eof then do;

        avg_price  = tot1/count;

       avg_share = tot2/count;

       put avg_price= avg_share=;

    end;

   format avg_price avg_share 5.2;

run;

Rick_SAS
SAS Super FREQ

If this is an interview question, I would tell the person that the code should handle missing values. That would separate the wheat from the chaff.

rogerjdeangelis
Barite | Level 11
Calculating the mean with and without missing values

You can cut and paste the R code into IML/R or
you can just use IML.

HAVE
====

Up to 40 obs from sd1.have total obs=6

Obs    PRICE    SHARE

 1       1        1
 2       2        3
 3       3        .
 4       3        .
 5       2        1
 6       1        3

WANT
====

  AVG_PRICE=2.00
  AVG_SHARE=2.00

BUT I GET
=========

  AVG_PRICE=2.00
  AVG_SHARE=.

WORKING CODE FOR ADJUSTING FOR MISSINGS
=======================================

   R
   colMeans(have, na.rm=TRUE);  (adjust for missing)
   colMeans(have, na.rm=FALSE); (do not adjust for missing)

  SAS/WPS
   tot2=sum(tot2,share);
   cnt2=ifn(share ne .,cnt2+1,cnt2);

FULL SOLUTION
=============

* create some data;
options validvarname=upcase;
libname sd1 "d:/sd1";
data sd1.have(drop=code);
input price share;
cards4;
1 1
2 3
3 .
3 .
2 1
1 3
;;;;
run;quit;

* DO NOT ADJUST FOR MISSING;
data want;
   set sd1.have end = eof;
   cumprice = cumprice + price;
   cumshare = cumshare + share;

   if eof = 1 then do;
      meanprice = cumprice / _N_;
      meanshare = cumshare / _N_;
   end;

   retain cumprice cumshare 0;
   drop cumprice cumshare;
run;

* ADJUST FOR MISSING;
%utl_submit_wps64('
libname sd1 "d:/sd1";
data _NULL_;
 set sd1.have end=eof;
      retain tot1 tot2 cnt1 cnt2 0;
      tot1=sum(tot1,price);
      tot2=sum(tot2,share);
      cnt1=ifn(price ne .,cnt1+1,cnt1);
      cnt2=ifn(share ne .,cnt2+1,cnt2);
    if eof then do;
        avg_price  = tot1/cnt1;
        avg_share = tot2/cnt2;
        put avg_price= avg_share=;
    end;
   format avg_price avg_share 5.2;
run;
');

* R SOLUTION;
%utl_submit_wps64('
  options set=R_HOME "C:/Program Files/R/R-3.3.1";
  proc r;
  submit;
  library(haven);
  have=read_sas("d:/sd1/have.sas7bdat");
  have;
  colMeans(have, na.rm=TRUE);
  colMeans(have, na.rm=FALSE);
  endsubmit;
');

snoopy369
Barite | Level 11

If I didn't want the job very badly I'd give them something like this...

%macro get_means(data=,var=, class=, classval=);
  proc means data=&data.;
    %if not(%sysevalf(%superq(class)=,boolean)) %then %do;
      class &class.;
      ways 1;
    %end;
    var &var.;
    output out=_temp_means mean(&var.)=;
  run;

  data _null_;
    set _temp_means;
    %if not(%sysevalf(%superq(class)=,boolean)) %then %do;
      where &class. = &classval.;
    %end;
    call symputx("mean_&var.",&var.,'g');
  run;

  proc delete data=_temp_means;
  run;
%mend get_means;

data want;
  set sashelp.class;
  rc = dosubl('%get_means(data=sashelp.class,var=height,class=sex,classval="'||sex||'")');
  height_mean = symget('mean_height');
run;
Rick_SAS
SAS Super FREQ

@rogerjdeangelis's code is close, but doesn't handle the boundary case. 

 

rogerjdeangelis
Barite | Level 11
Good Point Rosk

Missings are the bane of programming. However  my R code does cover this case.

I suspect this is the case and I need to chck for 0/0 in the SAS code.


# A tibble: 6 × 2
  PRICE SHARE
  <dbl> <dbl>
1     1    NA
2     2    NA
3     3    NA
4     3    NA
5     2    NA
6     1    NA

PRICE SHARE
    2   NaN          ==> 0/0

PRICE SHARE
    2    NA

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 8 replies
  • 1335 views
  • 2 likes
  • 7 in conversation