Hi,
Can someone help me to write below excel formula in SAS.
a2 has "First Number" and some other values like First, second, third, Third Number, etc.
b2 has some values like '1234' these are char format and Names a:b has '1234' and something like '1346' chat format and 2 is Yes/No
b2 has some values like '1234' these are char format and Names a:b has '1234' and something like '1346' chat format and 2 is Yes/No
c2 is also char format and a values from matching sheet will matching with c2 values and 2 is Yes/No
=if(right(a2,6)="Number","no",
if(iferror(vlookup(b2,'Names'!$a:$b,2,false),"no")="yes","no",
if(iferror(vlookup(b2,'Location'!$a:$b,2,false),"no")="yes","no",
vlookup(c2,'Matching'!$a:$b,2,false))))
Main data
A | B | C |
First | 1234 | 1A |
First Number | 1348 | 11A |
Second Number | 1236 | B4 |
Second | 1345 | B7 |
Names data
A | B |
1234 | No |
1348 | Yes |
1236 | Yes |
1345 | No |
Location data
A | B |
1234 | No |
1348 | Yes |
1236 | Yes |
1345 | No |
Matching data
A | B |
1A | Yes |
11A | No |
B4 | No |
B7 | Yes |
Thanks,
vnreddy
Can you please rephrase your Excel syntax into pseudo-code or business language?
We first check if the last 6 characters of the column a equal "Number".
If they do, the result is set to "No".
If not, we check if the result of the VLOOKUP function on the 'Names' dataset based on column B & a is "Yes".
If it is, the result is set to "No".
If not, we check if the result of the VLOOKUP function on the 'Location' dataset based on column B & a is "Yes".
If it is, the result is set to "No".
If neither of the above conditions is met, we perform another VLOOKUP on the 'Matching' dataset and set the result accordingly.
Just explain WHAT you are trying to do. HOW to do it is what you are trying to learn.
Also if you want to describe the data accurately then share it as data step code that recreates the dataset. That way the dataset names, variable names, variable types and whether any of the variables need special display formats attached is clear. And use reasonable names for the variables. A, B and C are not good names as they impart no information about what the variable contains. (I also do not understand what your datasets named LOCATION and NAMES really mean either.)
Sounds like you have three datasets. Perhaps something like this:
data HAVE;
infile cards dsd truncover;
length NAME $20 CODE $8 LOCATION $8 ;
input NAME CODE LOCATION;
cards;
First,1234,1A
First Number,1346,11A
Second Number,1235,B4
Second,1345,B7
;
data names ;
length CODE $8 NAME_STATUS $3 ;
input CODE name_status;
cards;
1234 No
1348 Yes
1236 Yes
1345 No
;
data location ;
length CODE $8 LOCATION_STATUS $3 ;
input code location_status;
cards;
1234 No
1348 Yes
1236 Yes
1345 No
;
Looks like they only have the CODE variable in common.
So try sorting by that variable and merging.
proc sort data=have; by code; run;
proc sort data=names; by code; run;
proc sort data=location; by code; run;
data want;
merge have names location;
by code;
run;
Result
NAME_ LOCATION_ Obs NAME CODE LOCATION STATUS STATUS 1 First 1234 1A No No 2 Second Number 1235 B4 3 1236 Yes Yes 4 Second 1345 B7 No No 5 First Number 1346 11A 6 1348 Yes Yes
Is that what you are trying to do?
If not what is it you are doing?
sorry my bad, now i have corrected main dataset.
I need one new variable in main dataset with status Yes/No, if below conditions matches.
In my original dataset i have almost 40 variables.
We first check if the last 6 characters of the column a equal "Number".
If they do, the result is set to "No".
If not, we check if the result of the VLOOKUP function on the 'Names' dataset based on column B & a is "Yes".
If it is, the result is set to "No".
If not, we check if the result of the VLOOKUP function on the 'Location' dataset based on column B & a is "Yes".
If it is, the result is set to "No".
If neither of the above conditions is met, we perform another VLOOKUP on the 'Matching' dataset and set the result accordingly.
If the size of the "VLOOKUP" "tables" are reasonable in size then just make them into formats. So let's convert your example listings into PROC FORMAT code. Then we can just code you IF/THEN sequence exactly as you have written it.
resetline;
data HAVE;
infile cards dsd truncover;
length NAME $20 CODE $8 LOCATION $8 ;
input NAME CODE LOCATION;
cards;
First,1234,1A
First Number,1346,11A
Second Number,1235,B4
Second,1345,B7
;
proc format;
value $name
'1234'='No'
'1236'='Yes'
'1345'='No'
'1348'='Yes'
;
value $location
'1234'='No'
'1348'='Yes'
'1236'='Yes'
'1345'='No'
;
value $matching
'1A' ='Yes'
'11A'='No'
'B4' ='No'
'B7' ='Yes'
;
run;
data want;
set have;
length result $3 ;
if scan(NAME,-1,' ')='Number' then result='No';
else if put(code,$name.)='Yes' then result='No';
else if put(code,$location.)='Yes' then result='No';
else result=put(location,$matching.) ;
run;
Results:
Obs NAME CODE LOCATION result 1 First 1234 1A Yes 2 First Number 1346 11A No 3 Second Number 1235 B4 No 4 Second 1345 B7 Yes
Note: Since you already have the lookups in a WORKSHEET you can look at the CNTLIN= option of PROC FORMAT in the documentation to learn how to create the formats from the existing dataset.
for sample i have provided 4 vlookup values, but in reality i end up having approximately, 1200 vlookup values to check against 20 thousand records from main dataset.
I think in this case i can't use format
1,200 values is not very many (its not 1977 any more).
Convert your "lookup" tables into datasets that can be used to create formats (or informats if you need) by PROC FORMAT with CNTLIN= option.
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.