BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
RobertNYC
Obsidian | Level 7

Hello--

I have some tricky numeric transformations I’m having trouble figuring out. I'm finding them tricky because I have to start by basing the scores off to values in one variable. 

The data set below are the variables I start with plus the variables I need to calculate.

Start With Variables which need to be calculated
ABCDEFGHIJ
Office CertificationTotal_of_Homes ClosedCompliance_RateHome_Type_ProportionPoints_AvailableHome_Points_EarnedTotal _PointsTotal_Score
Office1 12 -month1731%99%88%8.88.89.898%
Office1 kinship 90 days2313%87%12%1.21.0

  • I need to start by calculating a Home_Type_Proportion for the values of “12-month” andkinship 90 days” so the calculations look like this: 
  • “12-month”           =  C4/(C4+C5)    =  Home_Type_Proportion
  • “kinship 90 days”  =  C5/(C4+C5)    =  Home_Type_Proportion

   

  • Then I need to calculate Points_Available:

    • “12-month”            Home_Type_Proportion   =  F4*10    = Points_Available
    • “kinship 90 days”  Home_Type_Proportion   =  F5*10    = Points_Available

   

  • Then I need to calculate Home_Points_Earned:

    • “12-month”         Points_Available  =  G4*E4  =  Home_Points_Earned
    • “kinship 90 days” Points_Available  =  G5*E5  =  Home_Points_Earned

   

  • Then I need to Combine “12-month”  and “kinship 90 days” to get:
    • Total _Points =H4+H5

  • And last  Total_Score = I4/10 or Home_Points_Earned / 10


Can you see why this may be a little tricky to figure out? Any help is greatly appreciated!  Thanks

1 ACCEPTED SOLUTION

Accepted Solutions
art297
Opal | Level 21

@Robert: I used Total_Homes and tot_points as placeholders so that the total values of each could be accumulated during the loop.  Both are dropped by the time the file is actually written.  To understand what is actually happening through the two loops, it is easiest (I think) to add a couple of putlog _all_; statements in the code.  e.g.,

data have;

  informat office $8.;

  informat certification $20.;

  informat closed compliance_rate percent3.;

  input Office Certification & Total_of_Homes Closed Compliance_Rate;

  cards;

Office1  12-month  173  1%  99%

Office1  kinship 90 days  23  13%  87%

;

data want (drop=Total_Homes tot_points);

  do until(last.office);

    set have;

    by office;

    if first.office then Total_Homes=Total_of_Homes;

    else Total_Homes+Total_of_Homes;

    putlog _all_;

  end;

  do until(last.office);

    set have;

    by office;

    Home_Type_Proportion=Total_of_Homes/Total_Homes;

  Points_Available=Home_Type_Proportion*10;

    Home_Points_Earned=Points_Available*Compliance_Rate;

    if first.office then tot_points=Home_Points_Earned;

    else do;

      tot_points+Home_Points_Earned;

      Total_Points=tot_points;

      Total_Score=Total_Points*10;

    end;

    output;

    putlog _all_;

  end;

run;

View solution in original post

7 REPLIES 7
Cynthia_sas
SAS Super FREQ

Hi, I see your references to Row 4 and Row 5 (C4, C5), but if your column header cells are in Row 1, I only see Row 2 and Row 3. (see screenshot) So, can you explain what data is in Row 4 and Row 5???

And, what is the structure of your SAS dataset? You show an Excel worksheet -- is your data already in SAS form, or does information need to be read into or imported into SAS?

cynthia


confusion_row_nums.png
RobertNYC
Obsidian | Level 7

Hi Cynthia--

Yes, so sorry the row numbers are off they should be C2 and C3 the row with letter values were just for illustration purposes. Yes, the data is already in SAS format. I just used excel for the example.

Thanks.

art297
Opal | Level 21

If you are working with a SAS dataset, you could use a dow loop to achieve the result you want.  I didn't bother to do any of the implied rounding:

data have;

  informat office $8.;

  informat certification $20.;

  informat closed compliance_rate percent3.;

  input Office Certification & Total_of_Homes Closed Compliance_Rate;

  cards;

Office1  12-month  173  1%  99%

Office1  kinship 90 days  23  13%  87%

;

data want (drop=Total_Homes tot_points);

  do until(last.office);

    set have;

    by office;

    if first.office then Total_Homes=Total_of_Homes;

    else Total_Homes+Total_of_Homes;

  end;

  do until(last.office);

    set have;

    by office;

    Home_Type_Proportion=Total_of_Homes/Total_Homes;

    Points_Available=Home_Type_Proportion*10;

    Home_Points_Earned=Points_Available*Compliance_Rate;

    if first.office then tot_points=Home_Points_Earned;

    else do;

      tot_points+Home_Points_Earned;

      Total_Points=tot_points;

      Total_Score=Total_Points*10;

    end;

    output;

  end;

run;

RobertNYC
Obsidian | Level 7

Hi Arthur--

I am so glad you write this in Base SAS as opposed to SQL. I am in the process of trying to learn by group processing and do loops. I have a few questions. I am not to sure whats happening in the parts of the code in bold  of the code and I can't seem to see what the significance of the variable tot_points

.

data want  (drop=Total_Homes tot_points);

   do until(last.office);

    set have;

    by office;

    if first.office then Total_Homes=Total_of_Homes;

    else Total_Homes+Total_of_Homes;

  end;

  run;

  do until(last.office);

    set have;

    by office;

    Home_Type_Proportion=Total_of_Homes/Total_Homes;

    Points_Available=Home_Type_Proportion*10;

    Home_Points_Earned=Points_Available*Compliance_Rate;

    if first.office then tot_points=Home_Points_Earned;

    else do;

      tot_points+Home_Points_Earned;

      Total_Points=tot_points;

      Total_Score=Total_Points*10;

    end;

    output;

  end;

run;

THANKS!!!!!

art297
Opal | Level 21

@Robert: I used Total_Homes and tot_points as placeholders so that the total values of each could be accumulated during the loop.  Both are dropped by the time the file is actually written.  To understand what is actually happening through the two loops, it is easiest (I think) to add a couple of putlog _all_; statements in the code.  e.g.,

data have;

  informat office $8.;

  informat certification $20.;

  informat closed compliance_rate percent3.;

  input Office Certification & Total_of_Homes Closed Compliance_Rate;

  cards;

Office1  12-month  173  1%  99%

Office1  kinship 90 days  23  13%  87%

;

data want (drop=Total_Homes tot_points);

  do until(last.office);

    set have;

    by office;

    if first.office then Total_Homes=Total_of_Homes;

    else Total_Homes+Total_of_Homes;

    putlog _all_;

  end;

  do until(last.office);

    set have;

    by office;

    Home_Type_Proportion=Total_of_Homes/Total_Homes;

  Points_Available=Home_Type_Proportion*10;

    Home_Points_Earned=Points_Available*Compliance_Rate;

    if first.office then tot_points=Home_Points_Earned;

    else do;

      tot_points+Home_Points_Earned;

      Total_Points=tot_points;

      Total_Score=Total_Points*10;

    end;

    output;

    putlog _all_;

  end;

run;

Reeza
Super User

Here's a two step SQL solution.

You could probably do it in one but I didn't feel like working out the math Smiley Happy

data have;

    input office certification $ total_of_homes close compliance_rate;

cards;

1 12M     173 .01 .99

1 Kinship 23  .13 .87

;

proc sql;

    create table step1 as

    select *, sum(total_of_homes) as n_homes format=8.,

        total_of_homes/calculated n_homes as home_type_proportion format=percent8.,

        calculated home_type_proportion*10 as points_available format=8.1,

        calculated points_available*compliance_rate as home_points_earned format=8.1

    from have

    group by office;

    create table step2 as

    select *, sum(home_points_earned) as total_points format=8.1,

            sum(home_points_earned)/10 as total_score format=percent8.

        from step1

group by office;

quit;

PGStats
Opal | Level 21

Assuming you want this calculation for each Office and the two Certifications are present for each office This SQL query will do the trick :

data have;
length certification $20;
input office $ Certification $ Total_of_Homes Closed :percent3. Compliance_Rate :percent3.;
datalines;
Office1 12-month 173 1% 99%
Office1 kinship_90_days 23 13% 87%
;


proc sql;
create table want as
select office, Certification, Total_of_Homes, Closed, Compliance_Rate,
     Total_of_Homes/sum(Total_of_Homes) as Home_Type_Proportion format=percentn5.0,
     10 * calculated Home_Type_Proportion as Points_Available format=4.1,
     Compliance_Rate * calculated Points_Available as Home_Points_Earned format=4.1,
     10 * sum(Total_of_Homes*Compliance_Rate) / sum(Total_of_Homes) as Total_Points format=4.1,
     calculated Total_Points/10 as Total_Score  format=percentn5.0
from have
group by office;


select * from want;
quit;

PG

PG

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 7 replies
  • 829 views
  • 0 likes
  • 5 in conversation