BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Ronin
Obsidian | Level 7

Hi I am very new to SAS and coding. I have a situation I have two datasets as below

1) Dataset 1:(with 86 variables and 7095 observation)

Var 1var 2WEEKvar 4STOREvar 5
af101x
bk6053x
cz6602x
dx50053x
ey308x

 

Here WEEK column above have values from week 1 to week 100

2) Dataset 2:(with 2 variables and 621 unique observation)

STORETIER
1Tier A
2Tier C
3Tier B
8Tier B
9Tier C
53Tier A
18Tier C
98Tier B

Here Tier column has values as Tier A,B,C for all 621 unique store numbers

 

I want to add a new column in dataset 1 as store type with below logic:

 

 Tier ATier BTier C
Pre-week (week 1 to Week 50)111213
Post-week (week 51 to Week 100)212223

 

So the new column will have numerically encoded values as 11,12,13,21,22,23 based on range of week from WEEK column in dataset 1 and mapped Tier from dataset 2.

 

Kindly help me with some sort of function code for this.

1 ACCEPTED SOLUTION

Accepted Solutions
gamotte
Rhodochrosite | Level 12

Hello,

 

Please show what the want dataset should look like. Does the following program

answer your question ?

 

data have;
    input var1 $ var2 $ WEEK var4 STORE var5 $;
    cards;
a f 1 0 1 x
b k 6 0 53 x
c z 66 0 2 x
d x 50 0 53 x
e y 3 0 8 x
;
run;

data tiers;
    infile cards dlm=',' dsd;
    input STORE TIER $;
    cards;
1,Tier A
2,Tier C
3,Tier B
8,Tier B
9,Tier C
53,Tier A
18,Tier C
98,Tier B
;
run;

proc sort data=have; by STORE; run;
proc sort data=tiers; by STORE; run;

data want;
    merge have (in=inhave) tiers;
    by STORE;

    if WEEK le 50 then newvar=10;
    else newvar=20;

    if upcase(TIER)="TIER A" then newvar=newvar+1;
    else if  upcase(TIER)="TIER B" then newvar=newvar+2;
    else newvar=newvar+3;

    if inhave;
run;

proc sort data=want; by var1; run;

View solution in original post

4 REPLIES 4
gamotte
Rhodochrosite | Level 12

Hello,

 

Please show what the want dataset should look like. Does the following program

answer your question ?

 

data have;
    input var1 $ var2 $ WEEK var4 STORE var5 $;
    cards;
a f 1 0 1 x
b k 6 0 53 x
c z 66 0 2 x
d x 50 0 53 x
e y 3 0 8 x
;
run;

data tiers;
    infile cards dlm=',' dsd;
    input STORE TIER $;
    cards;
1,Tier A
2,Tier C
3,Tier B
8,Tier B
9,Tier C
53,Tier A
18,Tier C
98,Tier B
;
run;

proc sort data=have; by STORE; run;
proc sort data=tiers; by STORE; run;

data want;
    merge have (in=inhave) tiers;
    by STORE;

    if WEEK le 50 then newvar=10;
    else newvar=20;

    if upcase(TIER)="TIER A" then newvar=newvar+1;
    else if  upcase(TIER)="TIER B" then newvar=newvar+2;
    else newvar=newvar+3;

    if inhave;
run;

proc sort data=want; by var1; run;
Ronin
Obsidian | Level 7

Hi,

Thanks for answering.

My apology for not being clear
The datasets I gave above are just part of actual data with dummy values.

as I mentioned in the problem statement the dataset 1 is having 86 variables and 7095 observation. I just showed some of them as 

Var 1var 2WEEKvar 4STOREvar 6var 7…….var 86
af101x……………………
bk6053x……………………
cz6602x……………………
dx50053x……………………
ey308x……………………
………….………….………….………….………….………….……………………
………….………….………….………….………….………….……………………

this will continue for all 7095 observations....

 

Also the dataset 2 have  2 variables and 621 unique observations, for this also i have displayed just a part of the data. So using cards;

were you have shown whole data will become a tedious job.

 

STORETIER
1Tier A
2Tier C
3Tier B
8Tier B
9Tier C
53Tier A
18Tier C
98Tier B

continued till 621 rows with 621 unique Store values....

 

To your question the final want dataset is an update on dataset 1 with a new_variable added to it. It will look like:

Var 1Var 2WEEKVar 4STOREVar 6Var 7var 8…………….var 85var 86new_variable
af101xy……………………………..7 j11
bk6053xy……………………………..j u 611
cz6602xy……………………………..next 723
dx50053xy……………………………..j e s t k #211
ey308xy……………………………..l k j l_8712
fj5109xy……………………………..j k j_76923
gm4908xy……………………………..78_65_k j u12

 

 

Here the new_variable is based on the logic as:

 

 Tier ATier BTier C
Pre-week (week 1 to Week 50)111213
Post-week (week 51 to Week 100)212223

 

which means for a given store no.

IF (dataset 1.WEEK <=50) AND (dataset 2.Tier=="Tier A") THEN new_variable=11

ELSE IF (dataset 1.WEEK <=50) AND (dataset 2.Tier=="Tier B") THEN new_variable=12

ELSE IF (dataset 1.WEEK <=50) AND (dataset 2.Tier=="Tier C") THEN new_variable=13

ELSE IF (dataset 1.WEEK >50) AND (dataset 2.Tier=="Tier A") THEN new_variable=21

ELSE IF (dataset 1.WEEK >50) AND (dataset 2.Tier=="Tier B") THEN new_variable=22

ELSE IF (dataset 1.WEEK >50) AND (dataset 2.Tier=="Tier C") THEN new_variable=23

 

I hope this might give you some more clarity about the the problem and logic I am trying to apply. Kindly help me. 

gamotte
Rhodochrosite | Level 12

I think the program i proposed in my first answer does what you want.

Did you try it ?

Ronin
Obsidian | Level 7

Hi;

Thanks a lot. The code worked with just small changes. I appreciate your help..Just one thing the values for new_variable added are moving to the next row. and when I create the CSV file the values for new variable is coming below the 2nd variable.  You can see the snapshot in the file attached. The new_variable is Named here as GRP_NUM. Can you help me with it

 

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 4 replies
  • 5364 views
  • 1 like
  • 2 in conversation