Dear Members,
I have data set A and data set B located in sasuser library. I want to multiply column A of data set A to column B of data set B to create a new variable column C. The multiply must take place when SS# is equal in both data sets. Data set A has more obs and columns than data set B. When I run the code below with an if condition, it does not work. I got 0 observation in new_data.
data new_data;
set sasuser.columnA sasuser.columnB;
If SS# = SS# then
column C = column A * column B;
run;
My log error is: Statement is not valid or s used out of proper order.
Not working so far. Would where a better option? Please help.
Thanks
If that's your actual code, then you can't use # in a variable name.
If that's not your actual code, then we need to see your actual code or we can't help you.
data new_data;
set sasuser.columnA sasuser.columnB;
If SS = SS then
column C = column A * column B;
run;
# is not the issue
A SET statement will append the two data sets, but to do your calculation you need the data side by side, more commonly referred to as a MERGE.
There's an entire chapter in the SAS Docs about how to combine data;
It's probably more complicated but should look something like this to start. However, you also need to be careful if there are duplicate entries for each SS and you do need to sort the data first.
data new_data;
merge data1 data2;
by SS;
columnC = ColumnA * ColumnB;
run;
Another link that may be helpful is a YouTube tutorial on merges:
If you're using SAS Studio or EG, consider using the GUI tools/Tasks or Snippets and then looking at the code generated.
Yes there are duplicates. data set A has 60000 observations but data set B has 6000.
@patrickchanou wrote:
Yes there are duplicates. data set A has 60000 observations but data set B has 6000.
I think my answer above should get you started. If you need further help please post some sample data - obviously not your real data, but similar enough to explain the issue. It should include both what you have and what you want. Include your current code and the log that shows whatever errors you're encountering.
Here's an example of how that looks:
If you ask a generic question, you get generic responses. If you need further help, you need to be more detailed.
Good Luck.
data new_data;
set sasuser.dataA sasuser.dataB;
If SS = SS then
column C = column A (from dataA) * column B (from dataB);
run;
dataA has 2 numeric variable columns grade and SS with 60000 obs
dataB has 2 numeric variable columns weight with SS with 6000 obs
SS is the column we want to use to sort and do the multiplication by. There are lots of duplicates in the column SS
Let me know if that explains better.
This is an example of taking all the records from column A data set and if there is a matching SSN in the column B dataset add the column b value to the data and multiply as described.
proc sql; create table want as select a.ssn, a.columna, b.columnb, a.columna*b.columnb as columnc from sasuser.columna as a left join sasuser.columnb as b on a.ssn=b.ssn ; quit;
The a. and b. are ways that Proc SQL uses as shorthand for the data sets the assignment of the short hand is in the FROM clause with optional as .
You did not indicate if any of the SSN values might be duplicated in either set or what to do with the duplicates. This multiplies all of the matches if there are duplicates. If the SSN in the ColumnA data set only appears once but is in ColumnB multiple times there will be one record for each match in ColumnB.
Million thanks ballardw. They are many duplicates. I want them multiplied also. I will run the code and advise further based on the results.
I did run the code. It generated 1 error. It is not finding Sampling_weight or columnb but Sampling_weight is located in data set b in sasuser library. See below.
It's saying you don't have a dataset named sample_weight, check for a typo in the name.
If you're certain you have it, please display a proc contents.
proc contents data=sasuser.sampling_weights;run;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.