BookmarkSubscribeRSS Feed
Mariloud
Obsidian | Level 7

Hi everyone,

I'm getting missing values after combining these 3 dataset as seen on the attached photos. There are no missing values in the key variables. I really appreciate  anyone's help!

 

IMG_4695.jpgIMG_4696.jpg

8 REPLIES 8
PaigeMiller
Diamond | Level 26

Several possibilities that I can think of

  1. Data set A (content_) and data set C (reaction_types) do not have matching values of variable named TYPE. Could be spelling mismatches, or capitalization mismatches, or other types of mismatches.
  2. Variables Sentiment and Score are missing in data set C

 

Naturally, we're just guessing. We wouldn't have to guess (a much better approach, IMHO) if you showed us parts of data set A (content_) and data set C (reaction_types) that you think should match.

--
Paige Miller
Mariloud
Obsidian | Level 7

Thanks for replying! just realized that Data set A (content_) and data set C (reaction_types) do not have matching values in type as you  mention. It's actually reaction_ and reaction_types that have matching values in type.  Any tips on how to proceed at this point? here is the other part of the dataetIMG_4736.jpg

PaigeMiller
Diamond | Level 26

@Mariloud wrote:

Thanks for replying! just realized that Data set A (content_) and data set C (reaction_types) do not have matching values in type as you  mention. It's actually reaction_ and reaction_types that have matching values in type.  Any tips on how to proceed at this point? here is the other part of the dataet


I guess you need to show us what should match. (And I don't really grasp what this screen capture has to do with the problem)

--
Paige Miller
Reeza
Super User

Variables you're joining on, content_id or type have records in one data set but not the other exclusively. Since you're showing Sentiment and Score that are missing that means that Type is likely not the same in one data set as the other. This could be due to variables not being exactly the same (eg. "001" is not the same as "0001" or "1") or just missing from one data set. 

Mariloud
Obsidian | Level 7

thanks for replying! Data set A (content_) and data set C (reaction_types) do not actually have matching, it's in reaction_ an reaction_types that have matching values in type. And I want to join them all 3.

ballardw
Super User

When you use a construct such as

 

From dataset1 as a

         left join

        dataset2 as b

      on a.variable=b.variable

 

you have requested ALL the records from A and data from B that passes the ON clause. So every record in A that does not have a match in B will have all the variable from B with missing values.

If you do not want that behavior you can either use a different type of Join or add a where clause to require output records to have something present in the B data set possibly something like

  

   Where not missing(b.variable)

 

Depends exactly on what you want in the final output. When you start left joining multiple data sets the conditions may get complicated depending on your join conditions.

Kurt_Bremser
Super User

Maxim 3: Know Your Data.

Check the key values from the resulting observations with missing values and inspect the lookup datasets for the keys you think should match, then decide how to proceed.

Mariloud
Obsidian | Level 7

Thanks for your reply! I am joining 3 dataset and just realized that Data set A (content_) and data set C (reaction_types) do not have matching, only reaction_ and reaction_types do in type.

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 8 replies
  • 2917 views
  • 0 likes
  • 5 in conversation