Hello,
I have to recode variables from dataset_1 where each value is represented as a number that ranges from 1 to 10 (quantile group). I have a dataset_2 with 10 rows where the variables are the same. Now I want to replace the group number in dataset_1 by the value [row_i, var_j] in dataset_2. I was considering using a two dimensional array but I'm still stuck with the problem. Does anyone have an idea on how to achieve this ? Thanks in advance !
O.D.
E.g.,
Dataset_1
height weight time
1 2 2
1 3 2
2 2 2
Dataset_2
height weight time
height<10 weight<15 time<1
10<=height<20 15<=weight<30 1<=time<5
20<=height 30<=weight 5<=time
Want
height weight time
height<10 15<=weight<30 1<=time<5
height<10 30<=weight 1<=time<5
10<=height<20 15<=weight<30 1<=time<5
Formats are often used to replace values, unfortunately dataset_2 is hardly usable as source for formats. Can you post the dataset in usable form, so that we have something to work with?
Here is a sample of variables from the format table. As you can see, the number of groups by variable can be lower than 10.
data format_table;
informat avg_trans_value $50. avg_val_day $50. avg_val_day_mcc $50. avg_val_day_pos $50. avg_vol_day $50. avg_vol_day_mcc $50.;
infile datalines delimiter='|';
input avg_trans_value avg_val_day avg_val_day_mcc avg_val_day_pos avg_vol_day avg_vol_day_mcc;
datalines;
avg_trans_value < 13 |avg_val_day < 4.83 |avg_val_day_mcc < 19.98 |avg_val_day_pos < 35.43 |avg_vol_day < 2, _MISSING_ |avg_vol_day_mcc < 1.93
13 <= avg_trans_value < 17 |4.83 <= avg_val_day < 9 |19.98 <= avg_val_day_mcc < 24.54 |35.43 <= avg_val_day_pos < 38.77, _MISSING_ |2 <= avg_vol_day < 3 |1.93 <= avg_vol_day_mcc < 2.27
17 <= avg_trans_value < 19 |9 <= avg_val_day < 10.51 |24.54 <= avg_val_day_mcc < 25.84 |38.77 <= avg_val_day_pos < 66.05 |3 <= avg_vol_day < 4 |2.27 <= avg_vol_day_mcc < 2.32
19 <= avg_trans_value < 23 |10.51 <= avg_val_day < 16.79 |25.84 <= avg_val_day_mcc < 26.52 |66.05 <= avg_val_day_pos |4 <= avg_vol_day < 6 |2.32 <= avg_vol_day_mcc < 2.48, _MISSING_
23 <= avg_trans_value < 27 |16.79 <= avg_val_day < 24.69, _MISSING_ |26.52 <= avg_val_day_mcc < 28.15 | |6 <= avg_vol_day |2.48 <= avg_vol_day_mcc < 2.65
27 <= avg_trans_value < 30 |24.69 <= avg_val_day < 37.96 |28.15 <= avg_val_day_mcc < 34.42 | | |2.65 <= avg_vol_day_mcc < 2.8
30 <= avg_trans_value < 36 |37.96 <= avg_val_day < 51.8 |34.42 <= avg_val_day_mcc < 36.4 | | |2.8 <= avg_vol_day_mcc < 2.93
36 <= avg_trans_value < 44 |51.8 <= avg_val_day < 83.42 |36.4 <= avg_val_day_mcc < 42.05 | | |2.93 <= avg_vol_day_mcc < 3.11
44 <= avg_trans_value < 78, _MISSING_ |83.42 <= avg_val_day |42.05 <= avg_val_day_mcc < 48.14 | | |3.11 <= avg_vol_day_mcc < 3.6
78 <= avg_trans_value | |48.14 <= avg_val_day_mcc, _MISSING_ | | |3.6 <= avg_vol_day_mcc
;
run;
Many of us, including me, will not download Excel (or other Microsoft Office) documents as they are a security threat. I personally will not download any attachments at all. Thus, we request that you provide data as WORKING SAS data step code (examples and instructions). Many people ignore this request — @odozzz4 please DO NOT IGNORE THIS REQUEST
Just one word for that "format table": Gaaack!
What are these supposed to mean. There are more. You have a single value in the table with, a range AND a value of _missing_? If you have a numeric value for avg_trans_value then I doubt you will see "_MISSING_" as a value for the variable and placing it in the middle of those definitions is definitely confusing.
44 <= avg_trans_value < 78, _MISSING_
16.79 <= avg_val_day < 24.69, _MISSING_
48.14 <= avg_val_day_mcc, _MISSING_
I would turn Dataset_2 into a CNTLIN data set for PROC FORMAT. Then you can use these formats to change the appearance of the values in dataset_1.
Agreeing with @andreas_lds that dataset_2 is in a poor layout, better it should be long and vertical rather than wide.
Example of creating a CNTLIN data set for PROC FORMAT:
https://documentation.sas.com/doc/en/pgmsascdc/9.4_3.4/proc/n1e19y6lrektafn1kj6nbvhus59w.htm
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 16. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.