BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Ronin
Obsidian | Level 7

Hi I am very new to SAS and coding. I have a situation I have two datasets as below

1) Dataset 1:(with 86 variables and 7095 observation)

Var 1var 2WEEKvar 4STOREvar 5
af101x
bk6053x
cz6602x
dx50053x
ey308x

 

Here WEEK column above have values from week 1 to week 100

2) Dataset 2:(with 2 variables and 621 unique observation)

STORETIER
1Tier A
2Tier C
3Tier B
8Tier B
9Tier C
53Tier A
18Tier C
98Tier B

Here Tier column has values as Tier A,B,C for all 621 unique store numbers

 

I want to add a new column in dataset 1 as store type with below logic:

 

 Tier ATier BTier C
Pre-week (week 1 to Week 50)111213
Post-week (week 51 to Week 100)212223

 

So the new column will have numerically encoded values as 11,12,13,21,22,23 based on range of week from WEEK column in dataset 1 and mapped Tier from dataset 2.

 

Kindly help me with some sort of function code for this.

1 ACCEPTED SOLUTION

Accepted Solutions
gamotte
Rhodochrosite | Level 12

Hello,

 

Please show what the want dataset should look like. Does the following program

answer your question ?

 

data have;
    input var1 $ var2 $ WEEK var4 STORE var5 $;
    cards;
a f 1 0 1 x
b k 6 0 53 x
c z 66 0 2 x
d x 50 0 53 x
e y 3 0 8 x
;
run;

data tiers;
    infile cards dlm=',' dsd;
    input STORE TIER $;
    cards;
1,Tier A
2,Tier C
3,Tier B
8,Tier B
9,Tier C
53,Tier A
18,Tier C
98,Tier B
;
run;

proc sort data=have; by STORE; run;
proc sort data=tiers; by STORE; run;

data want;
    merge have (in=inhave) tiers;
    by STORE;

    if WEEK le 50 then newvar=10;
    else newvar=20;

    if upcase(TIER)="TIER A" then newvar=newvar+1;
    else if  upcase(TIER)="TIER B" then newvar=newvar+2;
    else newvar=newvar+3;

    if inhave;
run;

proc sort data=want; by var1; run;

View solution in original post

4 REPLIES 4
gamotte
Rhodochrosite | Level 12

Hello,

 

Please show what the want dataset should look like. Does the following program

answer your question ?

 

data have;
    input var1 $ var2 $ WEEK var4 STORE var5 $;
    cards;
a f 1 0 1 x
b k 6 0 53 x
c z 66 0 2 x
d x 50 0 53 x
e y 3 0 8 x
;
run;

data tiers;
    infile cards dlm=',' dsd;
    input STORE TIER $;
    cards;
1,Tier A
2,Tier C
3,Tier B
8,Tier B
9,Tier C
53,Tier A
18,Tier C
98,Tier B
;
run;

proc sort data=have; by STORE; run;
proc sort data=tiers; by STORE; run;

data want;
    merge have (in=inhave) tiers;
    by STORE;

    if WEEK le 50 then newvar=10;
    else newvar=20;

    if upcase(TIER)="TIER A" then newvar=newvar+1;
    else if  upcase(TIER)="TIER B" then newvar=newvar+2;
    else newvar=newvar+3;

    if inhave;
run;

proc sort data=want; by var1; run;
Ronin
Obsidian | Level 7

Hi,

Thanks for answering.

My apology for not being clear
The datasets I gave above are just part of actual data with dummy values.

as I mentioned in the problem statement the dataset 1 is having 86 variables and 7095 observation. I just showed some of them as 

Var 1var 2WEEKvar 4STOREvar 6var 7…….var 86
af101x……………………
bk6053x……………………
cz6602x……………………
dx50053x……………………
ey308x……………………
………….………….………….………….………….………….……………………
………….………….………….………….………….………….……………………

this will continue for all 7095 observations....

 

Also the dataset 2 have  2 variables and 621 unique observations, for this also i have displayed just a part of the data. So using cards;

were you have shown whole data will become a tedious job.

 

STORETIER
1Tier A
2Tier C
3Tier B
8Tier B
9Tier C
53Tier A
18Tier C
98Tier B

continued till 621 rows with 621 unique Store values....

 

To your question the final want dataset is an update on dataset 1 with a new_variable added to it. It will look like:

Var 1Var 2WEEKVar 4STOREVar 6Var 7var 8…………….var 85var 86new_variable
af101xy……………………………..7 j11
bk6053xy……………………………..j u 611
cz6602xy……………………………..next 723
dx50053xy……………………………..j e s t k #211
ey308xy……………………………..l k j l_8712
fj5109xy……………………………..j k j_76923
gm4908xy……………………………..78_65_k j u12

 

 

Here the new_variable is based on the logic as:

 

 Tier ATier BTier C
Pre-week (week 1 to Week 50)111213
Post-week (week 51 to Week 100)212223

 

which means for a given store no.

IF (dataset 1.WEEK <=50) AND (dataset 2.Tier=="Tier A") THEN new_variable=11

ELSE IF (dataset 1.WEEK <=50) AND (dataset 2.Tier=="Tier B") THEN new_variable=12

ELSE IF (dataset 1.WEEK <=50) AND (dataset 2.Tier=="Tier C") THEN new_variable=13

ELSE IF (dataset 1.WEEK >50) AND (dataset 2.Tier=="Tier A") THEN new_variable=21

ELSE IF (dataset 1.WEEK >50) AND (dataset 2.Tier=="Tier B") THEN new_variable=22

ELSE IF (dataset 1.WEEK >50) AND (dataset 2.Tier=="Tier C") THEN new_variable=23

 

I hope this might give you some more clarity about the the problem and logic I am trying to apply. Kindly help me. 

gamotte
Rhodochrosite | Level 12

I think the program i proposed in my first answer does what you want.

Did you try it ?

Ronin
Obsidian | Level 7

Hi;

Thanks a lot. The code worked with just small changes. I appreciate your help..Just one thing the values for new_variable added are moving to the next row. and when I create the CSV file the values for new variable is coming below the 2nd variable.  You can see the snapshot in the file attached. The new_variable is Named here as GRP_NUM. Can you help me with it

 

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 4 replies
  • 4248 views
  • 1 like
  • 2 in conversation