I'm not sure what's going on.
I have these two datasets
data have1;
input id value1 $;
cards;
1 a
2 a
;
data have2;
input id zipcode;
cards;
1 88888
1 99999
2 88888
2 99999
;
And I want this
data want;
input id value1 $ zipcode;
cards;
1 a 88888
1 a 99999
2 a 88888
2 a 99999
;
I tried merging the two datasets by id, but it is not repeating the way i want. I feel like I'm missing something obvious.
proc sql;
create table working.p42 as
select a.*, b.zipcode
from working.ucmr3 as a
inner join working.ucmr3_zip as b
on a.pwsid = b.pwsid;
quit;
Chat GPT helped me with this. This code worked.
data want;
merge have1 have2;
by id;
run;
Show your merge code, please. When I ran this:
data want;
merge have1 have2;
by id;
run;
proc print;
run;
The result was this:
Obs | id | value1 | zipcode |
1 | 1 | a | 88888 |
2 | 1 | a | 99999 |
3 | 2 | a | 88888 |
4 | 2 | a | 99999 |
Isn't that exactly what you wanted?
For some reason that merge statement isn't doing what I want it to do.
I want each zipcode value to repeat for each pwsid entry, but it's not doing that.
proc sql;
create table working.p42 as
select a.*, b.zipcode
from working.ucmr3 as a
inner join working.ucmr3_zip as b
on a.pwsid = b.pwsid;
quit;
Chat GPT helped me with this. This code worked.
You would have to provide examples of the data sets and the code.
You don't even tell us exactly what is wrong with that "example" data.
IF your "merge" was by PWSID then to get all Zipcodes duplicated from a merged data set then PWSID could only appear once in the base set and then have the value of PWSID with each desired Zip code in the other set. Once you have duplicates of a BY variable in two or more sets in a data step merge the results often get much more problematic because of the way the Merge statement works. Did you read the log? Did it have anything about "more than one data set with repeats of BY values" ? That warning tells you that you likely don't have the desired result because if there are N duplicates of a By variable valie in one set and M duplicates of the same value in the other the resulting set will generally have one of N or M resulting by variable values depending on the data sets.
Example:
Data ex1; input id x $; datalines; 1 a 1 b 1 c 2 q ; data ex2; input id y $; datalines; 1 zz 1 yy 2 qq 2 hh ; data combined; merge ex1 ex2; by id; run;
Note that the output data set Combined will have 5 observation not the 8 you were wanting, if I understand: 3 values of Id=1 (not the 6 that a 3x2 cross would generate) and 2 values of Id=2.
Do note the LOG (emphasis added):
19 data combined; 20 merge ex1 ex2; 21 by id; 22 run; NOTE: MERGE statement has more than one data set with repeats of BY values. NOTE: There were 4 observations read from the data set WORK.EX1. NOTE: There were 4 observations read from the data set WORK.EX2. NOTE: The data set WORK.COMBINED has 5 observations and 3 variables. NOTE: DATA statement used (Total process time): real time 0.02 seconds cpu time 0.01 seconds
Also note that the single Id=2 did get both the other values added.
@JJ_83 wrote:
For some reason that merge statement isn't doing what I want it to do.
I want each zipcode value to repeat for each pwsid entry, but it's not doing that.
I don't need more feedback, it's been solved. Thank you.
So that picture looks nothing like your original example. Which variables corresponds with ID and VALUE1 that you had in your example?
What do you mean you want ZIPCODE to "repeat"? In your picture there are many values or ZIPCODE already shown. How is it not repeating?
Note that your example showed a 1 to N match. Each observation in HAVE1 was merged with zero or more observations in HAVE2. That is what a MERGE can handle.
If you instead want an N to M match then you cannot use a MERGE. Instead you will need to preform a cartesian product, like what the INNER JOIN of SQL produces.
I don't need more feedback, it's been solved. Thank you.
Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!
SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.
Find more tutorials on the SAS Users YouTube channel.