BookmarkSubscribeRSS Feed
veblen
Calcite | Level 5
I appear to having a mental block this afternoon...

What's the simplest way to create a var based on:

where dataset1 varA = dataset2 varA

then dataset1A newvar = dataset2 varB

Basically, I'm looking to add a new var in dataset1 that contains the value from dataset2 varB where dataset1 varA = dataset2 varA.

Thanks!!!
11 REPLIES 11
Reeza
Super User
Probably using Proc SQL

proc sql;
create table a as
select a.*, b.varb
from a
left join b
on a.vara=b.vara;
quit;

Problem is having multiples, then you need to figure out how to deal with those, depending on your data logic.

If you create an empty var in an earlier step you can update instead of creating a new table.
veblen
Calcite | Level 5
Thanks Reeza!!!

Is it better to add an empty var to dataset A and just update the data as you mention?

And if so, how would the code look?
Reeza
Super User
If the dataset is large and creating a second copy of the dataset would be cumbersome then this is a more efficient way. However if you have duplicate results in your join you can run into issues.
[pre]
data have1;
set sashelp.class;
*need to set space for it;
format empty_var $25.;
empty_var='';
run;

proc sql;
update have1 as a
set empty_var=(select b.name from sashelp.class as b where a.name=b.name);
quit;

[/pre]
veblen
Calcite | Level 5
Thanks Reeza!!!

I will let you know if I have any questions.
veblen
Calcite | Level 5
In my world I currently have two datasets. Can you please match these datasets and vars to your code. I am missing something when I get to the select line.

dataset1: contains vara (the dataset to add empty_var?)
dataset2: contains varb and the value to set empty_var to)

I see that have1 is the new dataset with the added var (empty_var). but whicj dataset is sashelp.class?

Thanks!!!
stateworker
Fluorite | Level 6
data set3;
merge dataset1 (in=a) dataset2 (in=b keep=var1 var2);
by var1;
if a;
run;

This will make a new data set (never overwrite your original data set when testing out code!) by merging your first and second data sets by your common variable. If dataset1 has var1's that aren't in dataset2, then set3.var2 will be empty. But where dataset1's var1 equals dataset2's var1, then set3.var2 will be the same as dataset2.var2.

However, if you need dataset2's var1's that aren't also in dataset1, then this won't work because it will only bring in those where the var1 is in common.
veblen
Calcite | Level 5
Thanks Stateworker!!!

but in your example, do the two datasets have to have an equal number of observations?

If so, it's not going to work since the two datasets are of different size.

Basically, I'm looking to add a new var in dataset1 that contains the value from dataset2 varB where dataset1 varA = dataset2 varA.

note: dataset1 and dataset2 do not contain an equal number of observations.
Ksharp
Super User
That is not issue.SAS will process two dataset with not matched observations.
But If your dataset is large ,I recommend you to use HashTable.


Ksharp
veblen
Calcite | Level 5
Thanks Ksharp!!!

What size of dataset do you consider large?

(currently, I am not sure of what my maximum number of observations will be, but will do some analysis.)

stateworker: Based on Ksharp's input, I will give your code a try. Thanks again!!! Message was edited by: veblen
veblen
Calcite | Level 5
the answer to this question should resolve the issue for me:

what is the proper syntax for defining the two datasets within the proc sql and data steps posted?

is it library.name?

for example: mydata.datasetA or sasdata.datasetB

or is it something else?

Once I understand the calling and indexing into datasets, I will be able to do what I need to do with the language.
Ksharp
Super User
Yes.You are right.
[pre]
Data data1;
input Date : yymmdd10. Exdate : yymmdd10. CP_FLAG $ Volum Strike_Price Settle_Price ;
format date yymmdd10. exdate :yymmdd10. ;
datalines;
2008/1/2 2008/1/16 C 200 27.0 23.15
2008/1/2 2008/1/16 P 300 27.0 23.15
2008/1/2 2008/1/16 C 552 22.5 23.15
2008/1/2 2008/1/16 P 561 22.5 23.15
2008/1/3 2008/1/16 C 756 25.0 23.42
2008/1/3 2008/1/16 P 654 25.0 23.42
2008/1/3 2008/1/16 C 783 22.5 23.42
2008/1/3 2008/1/16 P 150 22.5 23.42
2008/1/4 2008/1/16 C 276 22.5 27.00
2008/1/4 2008/1/16 P 246 22.5 27.00
2008/1/4 2008/1/16 C 276 25.0 27.00
2008/1/4 2008/1/16 P 221 25.0 27.00
2008/1/4 2008/1/16 C 344 27.5 27.00
2008/1/4 2008/1/16 P 476 27.5 27.00
2008/1/5 2008/1/16 C 222 22.5 24.00
2008/1/5 2008/1/16 P 343 22.5 24.00
2008/1/5 2008/1/16 C 768 25.0 24.00
2008/1/5 2008/1/16 P 343 25.0 24.00
2008/1/5 2008/1/16 C 213 27.5 24.00
2008/1/5 2008/1/16 P 343 27.5 24.00
;
run;
proc sql;
create table want as
select *
from data1
group by date
having abs(Strike_Price-Settle_Price) =
min(abs(Strike_Price-Settle_Price))
;
quit;
proc sort data=want nodupkey out=op1(keep=date strike_price);
by date strike_price;
run;
data op1;
set op1;
_strike_price=lag(strike_price);
drop strike_price;
run;
data temp(where=(strike_price=_strike_price));
merge work.data1 work.op1;
by date;
;[/pre]

the bold part is to stand for library name of table data1 and op1.
Namely,I merge the tables data1 and op1 which are all in WORK library.


Ksharp

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 11 replies
  • 1638 views
  • 0 likes
  • 4 in conversation