I have two datasets: lifeguards and weather.
The lifeguards set is workshift and name
day name
| Sun | Alice | 
| Sun | Bobby | 
| Sun | Charlie | 
| Fri | Danny | 
| Fri | Ellie | 
| Sat | Fran | 
| Sat | Gemma | 
The weather set is day and high temp
day Temp
| Sun | 39 | 
| Mon | 35 | 
| Tue | 36 | 
| Wed | 36 | 
| Thu | 40 | 
| Fri | 38 | 
| Sat | 39 | 
I want to match merge them so they look like
day name Temp
| Sun | Alice | 39 | 
| Sun | Bobby | 39 | 
| Sun | Charlie | 39 | 
| Mon | . | 35 | 
| Tue | . | 36 | 
| Wed | . | 36 | 
| Thu | . | 40 | 
| Fri | Danny | 38 | 
| Fri | Ellie | 38 | 
| Sat | Fran | 39 | 
| Sat | Gemma | 39 | 
So I wrote:
data want; merge lifeguards weather; by day; run;
But I did not get what I want. I got this instead:
day name Temp
| Sun | Alice | . | 
| Sun | Bobby | . | 
| Sun | Charlie | . | 
| Sun | . | 39 | 
| Mon | . | 35 | 
| Tue | . | 36 | 
| Wed | . | 36 | 
| Thu | . | 40 | 
| Fri | Danny | . | 
| Fri | Ellie | . | 
| Fri | . | 38 | 
| Sat | Fran | . | 
| Sat | Gemma | . | 
| Sat | . | 39 | 
Please help me figure out where I went wrong. I tried it with "in" assignments and using update instead of merge. I don't understand proc sql, so if that's part of your solution, please explain the syntax to me?
The bit with the Day not aligning likely means that there is actually some difference in the value stored between the lifeguard and weather set. Possible things to look for are leading spaces in one of the sets such as the value is " Sun" in one and the other is "Sun". Another possible issue is that the Format assigned to one of the sets is a $3. format but the length of the values is actually longer. So the value could actually be "Sunday" in one.
How to check if one of these cases may apply.
Run Proc contents on both sets. See if one has a shorter length defined than the other and/or the Format is much shorter than the defined length of the variable.
You can check for leading spaces in data step code such as this:
data example; x = " Sun"; lengthx = length(x); y = "Sun"; lengthy = length(y); run;
If you print that Example data set above you will see the output doesn't show the leading spaces. Most of the procedures will be display the values without leading spaces.
The fix would be to use a data step to make the values in one of the sets look like that of the other. If the issue is leading spaces only then the LEFT function may be appropriate. If the value is longer then the SUBSTR function can be used to shorten the longer values.
Does you log show anything about unequal lengths for Day?
The real common variable isn't actually a Char., it's a Date8., and it's Date8. in both tables. I am perplexed as this is not my first rodeo with merging and concatenating tables.
Hi:
I think you need more control over the output using IN=. You have 2 scenarios in what you want in your final output:
1) rows match in both weather and lifeguard tables
2) have a row in weather, but no lifeguard on that day
3) unlikely -- don't have a row in weather but do have a lifeguard for that day (probable error condition)
At any rate, I only accounted for #1 and #2 in this example using a numeric date variable:
Hope this helps,
Cynthia
I see what you're doing with that, and it obviously yielded the right table.
When I run the same code (variable names changed to protect the innocent), a peculiar thing happens: All the values from my second table get erased in the merge.
@suztang wrote:
The real common variable isn't actually a Char., it's a Date8., and it's Date8. in both tables. I am perplexed as this is not my first rodeo with merging and concatenating tables.
Print some of the date values with a non-date format. If some of the values have decimal portion this is could happen and will not be visible when a date format like DOWNAME3 used. Date8. doesn't display anything resembling Sun, Date8 would display something like 16Mar12 and be right justified in 8 characters making it appear to have a leading space. So why did you show values like Sun????? if that is the format
data example; x= 22222.01; y= 22222; put x= downame3. y= dowwname3. ;
put x= date8. y= date8.;
/* but obviously the values of x and y are different*/ run;
And not telling sufficient detail about your data does not let us help you.
I still say there is something different in the values.
Follow this thread to understand merging
Hope this helps
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.
