Hi I am very new to SAS and coding. I have a situation I have two datasets as below
1) Dataset 1:(with 86 variables and 7095 observation)
Var 1 | var 2 | WEEK | var 4 | STORE | var 5 |
a | f | 1 | 0 | 1 | x |
b | k | 6 | 0 | 53 | x |
c | z | 66 | 0 | 2 | x |
d | x | 50 | 0 | 53 | x |
e | y | 3 | 0 | 8 | x |
Here WEEK column above have values from week 1 to week 100
2) Dataset 2:(with 2 variables and 621 unique observation)
STORE | TIER |
1 | Tier A |
2 | Tier C |
3 | Tier B |
8 | Tier B |
9 | Tier C |
53 | Tier A |
18 | Tier C |
98 | Tier B |
Here Tier column has values as Tier A,B,C for all 621 unique store numbers
I want to add a new column in dataset 1 as store type with below logic:
Tier A | Tier B | Tier C | |
Pre-week (week 1 to Week 50) | 11 | 12 | 13 |
Post-week (week 51 to Week 100) | 21 | 22 | 23 |
So the new column will have numerically encoded values as 11,12,13,21,22,23 based on range of week from WEEK column in dataset 1 and mapped Tier from dataset 2.
Kindly help me with some sort of function code for this.
Hello,
Please show what the want dataset should look like. Does the following program
answer your question ?
data have;
input var1 $ var2 $ WEEK var4 STORE var5 $;
cards;
a f 1 0 1 x
b k 6 0 53 x
c z 66 0 2 x
d x 50 0 53 x
e y 3 0 8 x
;
run;
data tiers;
infile cards dlm=',' dsd;
input STORE TIER $;
cards;
1,Tier A
2,Tier C
3,Tier B
8,Tier B
9,Tier C
53,Tier A
18,Tier C
98,Tier B
;
run;
proc sort data=have; by STORE; run;
proc sort data=tiers; by STORE; run;
data want;
merge have (in=inhave) tiers;
by STORE;
if WEEK le 50 then newvar=10;
else newvar=20;
if upcase(TIER)="TIER A" then newvar=newvar+1;
else if upcase(TIER)="TIER B" then newvar=newvar+2;
else newvar=newvar+3;
if inhave;
run;
proc sort data=want; by var1; run;
Hello,
Please show what the want dataset should look like. Does the following program
answer your question ?
data have;
input var1 $ var2 $ WEEK var4 STORE var5 $;
cards;
a f 1 0 1 x
b k 6 0 53 x
c z 66 0 2 x
d x 50 0 53 x
e y 3 0 8 x
;
run;
data tiers;
infile cards dlm=',' dsd;
input STORE TIER $;
cards;
1,Tier A
2,Tier C
3,Tier B
8,Tier B
9,Tier C
53,Tier A
18,Tier C
98,Tier B
;
run;
proc sort data=have; by STORE; run;
proc sort data=tiers; by STORE; run;
data want;
merge have (in=inhave) tiers;
by STORE;
if WEEK le 50 then newvar=10;
else newvar=20;
if upcase(TIER)="TIER A" then newvar=newvar+1;
else if upcase(TIER)="TIER B" then newvar=newvar+2;
else newvar=newvar+3;
if inhave;
run;
proc sort data=want; by var1; run;
Hi,
Thanks for answering.
My apology for not being clear
The datasets I gave above are just part of actual data with dummy values.
as I mentioned in the problem statement the dataset 1 is having 86 variables and 7095 observation. I just showed some of them as
Var 1 | var 2 | WEEK | var 4 | STORE | var 6 | var 7…….var 86 |
a | f | 1 | 0 | 1 | x | …………………… |
b | k | 6 | 0 | 53 | x | …………………… |
c | z | 66 | 0 | 2 | x | …………………… |
d | x | 50 | 0 | 53 | x | …………………… |
e | y | 3 | 0 | 8 | x | …………………… |
…………. | …………. | …………. | …………. | …………. | …………. | …………………… |
…………. | …………. | …………. | …………. | …………. | …………. | …………………… |
this will continue for all 7095 observations....
Also the dataset 2 have 2 variables and 621 unique observations, for this also i have displayed just a part of the data. So using cards;
were you have shown whole data will become a tedious job.
STORE | TIER |
1 | Tier A |
2 | Tier C |
3 | Tier B |
8 | Tier B |
9 | Tier C |
53 | Tier A |
18 | Tier C |
98 | Tier B |
continued till 621 rows with 621 unique Store values....
To your question the final want dataset is an update on dataset 1 with a new_variable added to it. It will look like:
Var 1 | Var 2 | WEEK | Var 4 | STORE | Var 6 | Var 7 | var 8…………….var 85 | var 86 | new_variable |
a | f | 1 | 0 | 1 | x | y | …………………………….. | 7 j | 11 |
b | k | 6 | 0 | 53 | x | y | …………………………….. | j u 6 | 11 |
c | z | 66 | 0 | 2 | x | y | …………………………….. | next 7 | 23 |
d | x | 50 | 0 | 53 | x | y | …………………………….. | j e s t k #2 | 11 |
e | y | 3 | 0 | 8 | x | y | …………………………….. | l k j l_87 | 12 |
f | j | 51 | 0 | 9 | x | y | …………………………….. | j k j_769 | 23 |
g | m | 49 | 0 | 8 | x | y | …………………………….. | 78_65_k j u | 12 |
Here the new_variable is based on the logic as:
Tier A | Tier B | Tier C | |
Pre-week (week 1 to Week 50) | 11 | 12 | 13 |
Post-week (week 51 to Week 100) | 21 | 22 | 23 |
which means for a given store no.
IF (dataset 1.WEEK <=50) AND (dataset 2.Tier=="Tier A") THEN new_variable=11
ELSE IF (dataset 1.WEEK <=50) AND (dataset 2.Tier=="Tier B") THEN new_variable=12
ELSE IF (dataset 1.WEEK <=50) AND (dataset 2.Tier=="Tier C") THEN new_variable=13
ELSE IF (dataset 1.WEEK >50) AND (dataset 2.Tier=="Tier A") THEN new_variable=21
ELSE IF (dataset 1.WEEK >50) AND (dataset 2.Tier=="Tier B") THEN new_variable=22
ELSE IF (dataset 1.WEEK >50) AND (dataset 2.Tier=="Tier C") THEN new_variable=23
I hope this might give you some more clarity about the the problem and logic I am trying to apply. Kindly help me.
I think the program i proposed in my first answer does what you want.
Did you try it ?
Hi;
Thanks a lot. The code worked with just small changes. I appreciate your help..Just one thing the values for new_variable added are moving to the next row. and when I create the CSV file the values for new variable is coming below the 2nd variable. You can see the snapshot in the file attached. The new_variable is Named here as GRP_NUM. Can you help me with it
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.