Hello,
I have to recode variables from dataset_1 where each value is represented as a number that ranges from 1 to 10 (quantile group). I have a dataset_2 with 10 rows where the variables are the same. Now I want to replace the group number in dataset_1 by the value [row_i, var_j] in dataset_2. I was considering using a two dimensional array but I'm still stuck with the problem. Does anyone have an idea on how to achieve this ? Thanks in advance !
O.D.
E.g.,
Dataset_1
height weight time
1 2 2
1 3 2
2 2 2
Dataset_2
height weight time
height<10 weight<15 time<1
10<=height<20 15<=weight<30 1<=time<5
20<=height 30<=weight 5<=time
Want
height weight time
height<10 15<=weight<30 1<=time<5
height<10 30<=weight 1<=time<5
10<=height<20 15<=weight<30 1<=time<5
Formats are often used to replace values, unfortunately dataset_2 is hardly usable as source for formats. Can you post the dataset in usable form, so that we have something to work with?
Here is a sample of variables from the format table. As you can see, the number of groups by variable can be lower than 10.
data format_table;
informat avg_trans_value $50. avg_val_day $50. avg_val_day_mcc $50. avg_val_day_pos $50. avg_vol_day $50. avg_vol_day_mcc $50.;
infile datalines delimiter='|';
input avg_trans_value avg_val_day avg_val_day_mcc avg_val_day_pos avg_vol_day avg_vol_day_mcc;
datalines;
avg_trans_value < 13 |avg_val_day < 4.83 |avg_val_day_mcc < 19.98 |avg_val_day_pos < 35.43 |avg_vol_day < 2, _MISSING_ |avg_vol_day_mcc < 1.93
13 <= avg_trans_value < 17 |4.83 <= avg_val_day < 9 |19.98 <= avg_val_day_mcc < 24.54 |35.43 <= avg_val_day_pos < 38.77, _MISSING_ |2 <= avg_vol_day < 3 |1.93 <= avg_vol_day_mcc < 2.27
17 <= avg_trans_value < 19 |9 <= avg_val_day < 10.51 |24.54 <= avg_val_day_mcc < 25.84 |38.77 <= avg_val_day_pos < 66.05 |3 <= avg_vol_day < 4 |2.27 <= avg_vol_day_mcc < 2.32
19 <= avg_trans_value < 23 |10.51 <= avg_val_day < 16.79 |25.84 <= avg_val_day_mcc < 26.52 |66.05 <= avg_val_day_pos |4 <= avg_vol_day < 6 |2.32 <= avg_vol_day_mcc < 2.48, _MISSING_
23 <= avg_trans_value < 27 |16.79 <= avg_val_day < 24.69, _MISSING_ |26.52 <= avg_val_day_mcc < 28.15 | |6 <= avg_vol_day |2.48 <= avg_vol_day_mcc < 2.65
27 <= avg_trans_value < 30 |24.69 <= avg_val_day < 37.96 |28.15 <= avg_val_day_mcc < 34.42 | | |2.65 <= avg_vol_day_mcc < 2.8
30 <= avg_trans_value < 36 |37.96 <= avg_val_day < 51.8 |34.42 <= avg_val_day_mcc < 36.4 | | |2.8 <= avg_vol_day_mcc < 2.93
36 <= avg_trans_value < 44 |51.8 <= avg_val_day < 83.42 |36.4 <= avg_val_day_mcc < 42.05 | | |2.93 <= avg_vol_day_mcc < 3.11
44 <= avg_trans_value < 78, _MISSING_ |83.42 <= avg_val_day |42.05 <= avg_val_day_mcc < 48.14 | | |3.11 <= avg_vol_day_mcc < 3.6
78 <= avg_trans_value | |48.14 <= avg_val_day_mcc, _MISSING_ | | |3.6 <= avg_vol_day_mcc
;
run;
Many of us, including me, will not download Excel (or other Microsoft Office) documents as they are a security threat. I personally will not download any attachments at all. Thus, we request that you provide data as WORKING SAS data step code (examples and instructions). Many people ignore this request — @odozzz4 please DO NOT IGNORE THIS REQUEST
Just one word for that "format table": Gaaack!
What are these supposed to mean. There are more. You have a single value in the table with, a range AND a value of _missing_? If you have a numeric value for avg_trans_value then I doubt you will see "_MISSING_" as a value for the variable and placing it in the middle of those definitions is definitely confusing.
44 <= avg_trans_value < 78, _MISSING_
16.79 <= avg_val_day < 24.69, _MISSING_
48.14 <= avg_val_day_mcc, _MISSING_
I would turn Dataset_2 into a CNTLIN data set for PROC FORMAT. Then you can use these formats to change the appearance of the values in dataset_1.
Agreeing with @andreas_lds that dataset_2 is in a poor layout, better it should be long and vertical rather than wide.
Example of creating a CNTLIN data set for PROC FORMAT:
https://documentation.sas.com/doc/en/pgmsascdc/9.4_3.4/proc/n1e19y6lrektafn1kj6nbvhus59w.htm
Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.