BookmarkSubscribeRSS Feed
odozzz4
Calcite | Level 5

Hello,

I have to recode variables from dataset_1 where each value is represented as a number that ranges from 1 to 10 (quantile group). I have a dataset_2 with 10 rows where the variables are the same. Now I want to replace the group number in dataset_1 by the value [row_i, var_j] in dataset_2. I was considering using a two dimensional array but I'm still stuck with the problem. Does anyone have an idea on how to achieve this ? Thanks in advance !

 

O.D.


E.g.,
Dataset_1
height weight time
1 2 2
1 3 2
2 2 2

Dataset_2
height weight time
height<10 weight<15 time<1

10<=height<20 15<=weight<30 1<=time<5

20<=height 30<=weight 5<=time

 

Want

height weight time

height<10 15<=weight<30 1<=time<5

height<10 30<=weight 1<=time<5

10<=height<20 15<=weight<30 1<=time<5

5 REPLIES 5
andreas_lds
Jade | Level 19

Formats are often used to replace values, unfortunately dataset_2 is hardly usable as source for formats. Can you post the dataset in usable form, so that we have something to work with?

odozzz4
Calcite | Level 5

Here is a sample of variables from the format table. As you can see, the number of groups by variable can be lower than 10.

data format_table;
informat avg_trans_value $50. avg_val_day $50. avg_val_day_mcc $50. avg_val_day_pos $50. avg_vol_day $50. avg_vol_day_mcc $50.;
infile datalines delimiter='|';
input avg_trans_value avg_val_day avg_val_day_mcc avg_val_day_pos avg_vol_day avg_vol_day_mcc;
datalines;
avg_trans_value < 13	|avg_val_day < 4.83	|avg_val_day_mcc < 19.98	|avg_val_day_pos < 35.43	|avg_vol_day < 2, _MISSING_	|avg_vol_day_mcc < 1.93
13 <= avg_trans_value < 17	|4.83 <= avg_val_day < 9	|19.98 <= avg_val_day_mcc < 24.54	|35.43 <= avg_val_day_pos < 38.77, _MISSING_	|2 <= avg_vol_day < 3	|1.93 <= avg_vol_day_mcc < 2.27
17 <= avg_trans_value < 19	|9 <= avg_val_day < 10.51	|24.54 <= avg_val_day_mcc < 25.84	|38.77 <= avg_val_day_pos < 66.05	|3 <= avg_vol_day < 4	|2.27 <= avg_vol_day_mcc < 2.32
19 <= avg_trans_value < 23	|10.51 <= avg_val_day < 16.79	|25.84 <= avg_val_day_mcc < 26.52	|66.05 <= avg_val_day_pos	|4 <= avg_vol_day < 6	|2.32 <= avg_vol_day_mcc < 2.48, _MISSING_
23 <= avg_trans_value < 27	|16.79 <= avg_val_day < 24.69, _MISSING_	|26.52 <= avg_val_day_mcc < 28.15	|		|6 <= avg_vol_day	|2.48 <= avg_vol_day_mcc < 2.65
27 <= avg_trans_value < 30	|24.69 <= avg_val_day < 37.96	|28.15 <= avg_val_day_mcc < 34.42	|		|		|2.65 <= avg_vol_day_mcc < 2.8
30 <= avg_trans_value < 36	|37.96 <= avg_val_day < 51.8	|34.42 <= avg_val_day_mcc < 36.4	|		|		|2.8 <= avg_vol_day_mcc < 2.93
36 <= avg_trans_value < 44	|51.8 <= avg_val_day < 83.42	|36.4 <= avg_val_day_mcc < 42.05	|		|		|2.93 <= avg_vol_day_mcc < 3.11
44 <= avg_trans_value < 78, _MISSING_	|83.42 <= avg_val_day	|42.05 <= avg_val_day_mcc < 48.14	|		|		|3.11 <= avg_vol_day_mcc < 3.6
78 <= avg_trans_value	|		|48.14 <= avg_val_day_mcc, _MISSING_	|		|		|3.6 <= avg_vol_day_mcc
;
run;
PaigeMiller
Diamond | Level 26

Many of us, including me, will not download Excel (or other Microsoft Office) documents as they are a security threat. I personally will not download any attachments at all. Thus, we request that you provide data as WORKING SAS data step code (examples and instructions). Many people ignore this request — @odozzz4 please DO NOT IGNORE THIS REQUEST

--
Paige Miller
ballardw
Super User

Just one word for that "format table": Gaaack!

 

What are these supposed to mean. There are more. You have a single value in the table with, a range AND a value of _missing_? If you have a numeric value for avg_trans_value then I doubt you will see "_MISSING_" as a value for the variable and placing it in the middle of those definitions is definitely confusing.

44 <= avg_trans_value < 78, _MISSING_

16.79 <= avg_val_day < 24.69, _MISSING_

48.14 <= avg_val_day_mcc, _MISSING_

PaigeMiller
Diamond | Level 26

I would turn Dataset_2 into a CNTLIN data set for PROC FORMAT. Then you can use these formats to change the appearance of the values in dataset_1.

 

Agreeing with @andreas_lds that dataset_2 is in a poor layout, better it should be long and vertical rather than wide.

 

Example of creating a CNTLIN data set for PROC FORMAT: 

https://documentation.sas.com/doc/en/pgmsascdc/9.4_3.4/proc/n1e19y6lrektafn1kj6nbvhus59w.htm

--
Paige Miller

hackathon24-white-horiz.png

2025 SAS Hackathon: There is still time!

Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!

Register Now

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 5 replies
  • 1247 views
  • 1 like
  • 4 in conversation