BookmarkSubscribeRSS Feed
odozzz4
Calcite | Level 5

Hello,

I have to recode variables from dataset_1 where each value is represented as a number that ranges from 1 to 10 (quantile group). I have a dataset_2 with 10 rows where the variables are the same. Now I want to replace the group number in dataset_1 by the value [row_i, var_j] in dataset_2. I was considering using a two dimensional array but I'm still stuck with the problem. Does anyone have an idea on how to achieve this ? Thanks in advance !

 

O.D.


E.g.,
Dataset_1
height weight time
1 2 2
1 3 2
2 2 2

Dataset_2
height weight time
height<10 weight<15 time<1

10<=height<20 15<=weight<30 1<=time<5

20<=height 30<=weight 5<=time

 

Want

height weight time

height<10 15<=weight<30 1<=time<5

height<10 30<=weight 1<=time<5

10<=height<20 15<=weight<30 1<=time<5

5 REPLIES 5
andreas_lds
Jade | Level 19

Formats are often used to replace values, unfortunately dataset_2 is hardly usable as source for formats. Can you post the dataset in usable form, so that we have something to work with?

odozzz4
Calcite | Level 5

Here is a sample of variables from the format table. As you can see, the number of groups by variable can be lower than 10.

data format_table;
informat avg_trans_value $50. avg_val_day $50. avg_val_day_mcc $50. avg_val_day_pos $50. avg_vol_day $50. avg_vol_day_mcc $50.;
infile datalines delimiter='|';
input avg_trans_value avg_val_day avg_val_day_mcc avg_val_day_pos avg_vol_day avg_vol_day_mcc;
datalines;
avg_trans_value < 13	|avg_val_day < 4.83	|avg_val_day_mcc < 19.98	|avg_val_day_pos < 35.43	|avg_vol_day < 2, _MISSING_	|avg_vol_day_mcc < 1.93
13 <= avg_trans_value < 17	|4.83 <= avg_val_day < 9	|19.98 <= avg_val_day_mcc < 24.54	|35.43 <= avg_val_day_pos < 38.77, _MISSING_	|2 <= avg_vol_day < 3	|1.93 <= avg_vol_day_mcc < 2.27
17 <= avg_trans_value < 19	|9 <= avg_val_day < 10.51	|24.54 <= avg_val_day_mcc < 25.84	|38.77 <= avg_val_day_pos < 66.05	|3 <= avg_vol_day < 4	|2.27 <= avg_vol_day_mcc < 2.32
19 <= avg_trans_value < 23	|10.51 <= avg_val_day < 16.79	|25.84 <= avg_val_day_mcc < 26.52	|66.05 <= avg_val_day_pos	|4 <= avg_vol_day < 6	|2.32 <= avg_vol_day_mcc < 2.48, _MISSING_
23 <= avg_trans_value < 27	|16.79 <= avg_val_day < 24.69, _MISSING_	|26.52 <= avg_val_day_mcc < 28.15	|		|6 <= avg_vol_day	|2.48 <= avg_vol_day_mcc < 2.65
27 <= avg_trans_value < 30	|24.69 <= avg_val_day < 37.96	|28.15 <= avg_val_day_mcc < 34.42	|		|		|2.65 <= avg_vol_day_mcc < 2.8
30 <= avg_trans_value < 36	|37.96 <= avg_val_day < 51.8	|34.42 <= avg_val_day_mcc < 36.4	|		|		|2.8 <= avg_vol_day_mcc < 2.93
36 <= avg_trans_value < 44	|51.8 <= avg_val_day < 83.42	|36.4 <= avg_val_day_mcc < 42.05	|		|		|2.93 <= avg_vol_day_mcc < 3.11
44 <= avg_trans_value < 78, _MISSING_	|83.42 <= avg_val_day	|42.05 <= avg_val_day_mcc < 48.14	|		|		|3.11 <= avg_vol_day_mcc < 3.6
78 <= avg_trans_value	|		|48.14 <= avg_val_day_mcc, _MISSING_	|		|		|3.6 <= avg_vol_day_mcc
;
run;
PaigeMiller
Diamond | Level 26

Many of us, including me, will not download Excel (or other Microsoft Office) documents as they are a security threat. I personally will not download any attachments at all. Thus, we request that you provide data as WORKING SAS data step code (examples and instructions). Many people ignore this request — @odozzz4 please DO NOT IGNORE THIS REQUEST

--
Paige Miller
ballardw
Super User

Just one word for that "format table": Gaaack!

 

What are these supposed to mean. There are more. You have a single value in the table with, a range AND a value of _missing_? If you have a numeric value for avg_trans_value then I doubt you will see "_MISSING_" as a value for the variable and placing it in the middle of those definitions is definitely confusing.

44 <= avg_trans_value < 78, _MISSING_

16.79 <= avg_val_day < 24.69, _MISSING_

48.14 <= avg_val_day_mcc, _MISSING_

PaigeMiller
Diamond | Level 26

I would turn Dataset_2 into a CNTLIN data set for PROC FORMAT. Then you can use these formats to change the appearance of the values in dataset_1.

 

Agreeing with @andreas_lds that dataset_2 is in a poor layout, better it should be long and vertical rather than wide.

 

Example of creating a CNTLIN data set for PROC FORMAT: 

https://documentation.sas.com/doc/en/pgmsascdc/9.4_3.4/proc/n1e19y6lrektafn1kj6nbvhus59w.htm

--
Paige Miller

SAS Innovate 2025: Call for Content

Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 16. Read more here about why you should contribute and what is in it for you!

Submit your idea!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 5 replies
  • 626 views
  • 1 like
  • 4 in conversation