Table look up

Div_sas · Posted 08-29-2018 08:43 AM

Hello all,

I am trying to calculate percentiles using two datasets, one of which is a standard dataset and the other dataset has systolic and diastolic bp values for each subject .I want to check each observation in the subject dataset against the standard dataset to find a match for the percentile:

Standard dataset:

age	percentile	val	ht
1	n1	45	50.5
1	n2	55	50.5
1	n3	65	50.5
1	n4	75	50.5
1	n5	85	50.5
1	n1	49	55.8
1	n2	54	55.8
1	n3	66	55.8
1	n4	75	55.8
1	n5	88	55.8

subject dataset example:

id	ht	val	age
xyz	52	50	1
abc	56	71	1

I want :

id	ht	val	age	percentile
xyz	52	50	1	n1
abc	56	71	1	n3

Because the height for subject xyx is 52 which is less than 55.8 from the standard table it would fall under the 50.5 category as I need to consider the lower bound values and the val for that subject is 50 which fall under 45 as it is less than 55 it would be categorized under n1. The actual data consists of many ages with varying heights. Any help would be appreciated. Thank you.

RW9 · Posted 08-29-2018 09:10 AM

Post test data in the form of a datastep!!

First up, the standard dataset, if you have an upper and lower bound, then it makes sense to keep these in one row:

age percentile val lower_ht upper_ht

1 n1 ? 50.5 55.8

1 n2 ...

That is a sensible model and will then allow you to merge onto your data based on age=age and lower_ht <= ht <= upper_ht.

The problem lies in the data modelling aspect of the "standard dataset", but you can sort this by sorting it by age, percentile, ht, then datastep first obs=lower, second=higher.

Div_sas · Posted 08-29-2018 09:27 AM

sorry about that. Here's the data in datastep:

data std;
input age percentile $ val ht;
datalines;
1 n1 45 50.5 
1 n2 55 50.5 
1 n3 65 50.5 
1 n4 75 50.5 
1 n5 85 50.5 
1 n1 49 55.8 
1 n2 54 55.8 
1 n3 66 55.8 
1 n4 75 55.8 
1 n5 88 55.8 
;
run;

data subj;
input id $ ht val age;
datalines;
xyz 52 50 1 
abc 56 71 1 
;
run;

I will need upper and lower bound values for height and val as well. Thank you!

RW9 · Posted 08-29-2018 09:53 AM

Well, it doesn't make sense, as all the ranges are the same? Maybe it is the val part I am missing? Anyways some code towards what you want:

data std;
  input age percentile $ val ht;
datalines;
1 n1 45 50.5 
1 n2 55 50.5 
1 n3 65 50.5 
1 n4 75 50.5 
1 n5 85 50.5 
1 n1 49 55.8 
1 n2 54 55.8 
1 n3 66 55.8 
1 n4 75 55.8 
1 n5 88 55.8 
;
run;

proc sort data=std;
  by age percentile ht;
run;

data std_upd (keep=age percentile lower higher);
  set std;
  by age percentile;
  retain lower higher;
  if first.percentile then lower=ht;
  if last.percentile then do;
    higher=ht;
    output;
  end;
run;
  
data subj;
  input id $ ht val age;
datalines;
xyz 52 50 1 
abc 56 71 1 
;
run;

proc sql;
  create table want as 
  select a.*,
         b.percentile,
         b.lower,
         b.higher
  from   subj a
  left join std_upd b
  on     a.age=b.age
  and    b.lower <= a.ht <= b.higher;
quit;

Table look up

Re: Table look up

Re: Table look up

Re: Table look up

Table look up

Re: Table look up

Re: Table look up

Re: Table look up

The 2025 SAS Hackathon has begun!

SAS Training: Just a Click Away