BookmarkSubscribeRSS Feed
ybz12003
Rhodochrosite | Level 12

Hello, experts:

 

I have two datasets.   Dataset1 contains 10K obs and 1000 variables, including IDs, Status, etc.   Dataset2 contains 100 obs and two variables, IDs and Status.   I would like to merge the updated 'Status' in Dataset2 to Dataset1, by same IDs.    The IDs are characters, and the Status is numeric.    I think I could use Proc SQL join to merge, but not sure which join is best and which dataset should put first?  Please advice.

3 REPLIES 3
Kurt_Bremser
Super User

With 100 obs, simply create an informat (since you need to create a numeric value) and apply it during any step where you process the large dataset.

novinosrin
Tourmaline | Level 20

Hi @ybz12003 

 

Thumb rule:

 

1. Check for duplicate ID occurances in both datasets

2. If one has many ID's your larger one, and the other has only unique occurances of ID, it's a one to many join. 

 

So as long as it's one to one or one to many, you can play with merge/join/hash etc.

 

If you determine, both having multiple ID occurrences, then post your sample data and your expected output. Somebody can advice you at best!

PGStats
Opal | Level 21

Do you need to create a new dataset? You could update the existing dataset with something like:

 

proc sql;
update dataset1 as a
set status = (select status from dataset2 where id=a.id)
where id in (select id from dataset2);
quit;

 

PG
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 3 replies
  • 845 views
  • 0 likes
  • 4 in conversation