BookmarkSubscribeRSS Feed
vnreddy
Quartz | Level 8

Hi,

 

Can someone help me to write below excel formula in SAS.

a2 has "First Number" and some other values like First, second, third, Third Number, etc.

b2 has some values like '1234' these are char format and Names a:b has '1234' and something like '1346'  chat format and 2 is Yes/No

b2 has some values like '1234' these are char format and Names a:b has '1234' and something like '1346'  chat format and 2 is Yes/No

 c2 is also char format and a values from matching sheet will matching with c2 values and 2 is Yes/No

 

=if(right(a2,6)="Number","no",
if(iferror(vlookup(b2,'Names'!$a:$b,2,false),"no")="yes","no",
if(iferror(vlookup(b2,'Location'!$a:$b,2,false),"no")="yes","no",
vlookup(c2,'Matching'!$a:$b,2,false))))

 

Main data

A B C
First 1234 1A
First Number 1348 11A
Second Number 1236 B4
Second 1345 B7

 

Names data

A B
1234 No
1348 Yes
1236 Yes
1345 No

 

Location data 

A B
1234 No
1348 Yes
1236 Yes
1345 No

 

Matching data

A B
1A Yes
11A No
B4 No
B7 Yes

 

Thanks,

vnreddy

9 REPLIES 9
LinusH
Tourmaline | Level 20

Can you please rephrase your Excel syntax into pseudo-code or business language?

Data never sleeps
vnreddy
Quartz | Level 8

 

We first check if the last 6 characters of the column a equal "Number".
If they do, the result is set to "No".
If not, we check if the result of the VLOOKUP function on the 'Names' dataset based on column B  & a is "Yes".
If it is, the result is set to "No".
If not, we check if the result of the VLOOKUP function on the 'Location' dataset based on column B  & a is "Yes".
If it is, the result is set to "No".
If neither of the above conditions is met, we perform another VLOOKUP on the 'Matching' dataset and set the result accordingly.

LinusH
Tourmaline | Level 20
Not gfiving you the full solution, but something to start with:
First join all datasets on
main.b = names.a and
main.b = location.a and
main.c = matching.a

The scan() function lets you search a string for the n-the word.
If you chose to use SQL then CASE statements will help.
In a data step use IF-statments.
Data never sleeps
Tom
Super User Tom
Super User

Just explain WHAT you are trying to do.  HOW to do it is what you are trying to learn.

 

Also if you want to describe the data accurately then share it as data step code that recreates the dataset.  That way the dataset names, variable names, variable types and whether any of the variables need special display formats attached is clear.  And use reasonable names for the variables. A, B and C are not good names as they impart no information about what the variable contains.  (I also do not understand what your datasets named LOCATION and NAMES really mean either.)

 

Sounds like you have three datasets. Perhaps something like this:

data HAVE;
  infile cards dsd truncover;
  length NAME $20 CODE $8 LOCATION $8 ;
  input NAME CODE LOCATION;
cards;
First,1234,1A
First Number,1346,11A
Second Number,1235,B4
Second,1345,B7
;

data names ;
  length CODE $8 NAME_STATUS $3 ;
  input CODE name_status;
cards;
1234 No
1348 Yes
1236 Yes
1345 No
; 

data location ;
  length CODE $8 LOCATION_STATUS $3 ;
  input code location_status;
cards;
1234 No
1348 Yes
1236 Yes
1345 No
;

Looks like they only have the CODE variable in common.

So try sorting by that variable and merging.

proc sort data=have; by code; run;
proc sort data=names; by code; run;
proc sort data=location; by code; run;

data want;
  merge have names location;
  by code;
run;

Result

                                            NAME_     LOCATION_
Obs    NAME             CODE    LOCATION    STATUS     STATUS

 1     First            1234      1A         No          No
 2     Second Number    1235      B4
 3                      1236                 Yes         Yes
 4     Second           1345      B7         No          No
 5     First Number     1346      11A
 6                      1348                 Yes         Yes

Is that what you are trying to do?

If not what is it you are doing?

 

 

 

vnreddy
Quartz | Level 8

sorry my bad, now i have corrected main dataset.

 

I need one new variable in main dataset with status Yes/No, if below conditions matches.

In my original dataset i have almost 40 variables.

 

We first check if the last 6 characters of the column a equal "Number".
If they do, the result is set to "No".
If not, we check if the result of the VLOOKUP function on the 'Names' dataset based on column B  & a is "Yes".
If it is, the result is set to "No".
If not, we check if the result of the VLOOKUP function on the 'Location' dataset based on column B  & a is "Yes".
If it is, the result is set to "No".
If neither of the above conditions is met, we perform another VLOOKUP on the 'Matching' dataset and set the result accordingly.

Tom
Super User Tom
Super User

If the size of the "VLOOKUP" "tables" are reasonable in size then just make them into formats.  So let's convert your example listings into PROC FORMAT code.   Then we can just code you IF/THEN sequence exactly as you have written it.

resetline;

data HAVE;
  infile cards dsd truncover;
  length NAME $20 CODE $8 LOCATION $8 ;
  input NAME CODE LOCATION;
cards;
First,1234,1A
First Number,1346,11A
Second Number,1235,B4
Second,1345,B7
;

proc format;
 value $name 
   '1234'='No'
   '1236'='Yes'
   '1345'='No'
   '1348'='Yes'
 ;

 value $location 
   '1234'='No'
   '1348'='Yes'
   '1236'='Yes'
   '1345'='No'
  ;

  value $matching
    '1A' ='Yes'
    '11A'='No'
    'B4' ='No'
    'B7' ='Yes'
  ;
run;

data want;
  set have;
  length result $3 ;
  if scan(NAME,-1,' ')='Number' then result='No';
  else if put(code,$name.)='Yes' then result='No';
  else if put(code,$location.)='Yes' then result='No';
  else result=put(location,$matching.) ;
run;

Results:

Obs    NAME             CODE    LOCATION    result

 1     First            1234      1A         Yes
 2     First Number     1346      11A        No
 3     Second Number    1235      B4         No
 4     Second           1345      B7         Yes

 

 

 

Note: Since you already have the lookups in a WORKSHEET you can look at the CNTLIN= option of PROC FORMAT in the documentation to learn how to create the formats from the existing dataset.

 

 

vnreddy
Quartz | Level 8

for sample i have provided 4 vlookup values, but in reality i end up having approximately, 1200 vlookup values to check against 20 thousand records from main dataset.

I think in this case i can't use format 

Tom
Super User Tom
Super User

1,200 values is not very many (its not 1977 any more).

 

Convert your "lookup" tables into datasets that can be used to create formats (or informats if you need) by PROC FORMAT with CNTLIN= option.

ghoststark
Calcite | Level 5

In SAS, you could use a combination of IF statements and PROC SQL for the lookups. Here's how you might structure it: data your_data;


set your_data;
if substr(a2, length(a2)-5, 6) = 'Number' then result = 'no';
else if (vlookup(b2, 'Names', 2) = 'yes') or (vlookup(b2, 'Location', 2) = 'yes') then result = 'no';
else result = vlookup(c2, 'Matching', 2);
run;


The vlookup function isn’t available directly in SAS, so you’ll likely need to use PROC SQL for those lookups or use an ARRAY structure if the lookup data is in a table format. I remember struggling with Excel to SAS translation, and what helped me was learning how to view formulas in Excel. It allowed me to break down the logic clearly before attempting the conversion, which saved me tons of time.

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 9 replies
  • 2435 views
  • 0 likes
  • 4 in conversation