Nietzsche
Lapis Lazuli | Level 10

 

Hi, I am reading page 181 of the official specialist prep guide, the section on Using One-to-One Reading to Combine Data Sets.

 

Basically, I have two data sets, one with 11 observations and the other with 9, and I combine them with a simple DATA step.

 

data work.one2one;
set cert.patients;
set cert.measure;
run;

The result is the one on the right-hand side.

Nietzsche_0-1668204144992.png

 

As you can see, the ID column from the second data set overwrites the ID column from the first data set in the PDV, so each output observation combines data from mismatched observations.

So what is the point of one-to-one reading in data manipulation?

 

I have attached the two data sets.

SAS Base Programming (2022 Dec), Preparing for SAS Advanced Programming (Cancelled).
1 ACCEPTED SOLUTION

Tom
Super User

I don't think they really care about what is in the data. They are just using the two datasets to illustrate the point that a data step stops when it reads past the end of an input.

 

Most SAS data steps do not end at the last line of code in the data step.  Instead, they end when they read past the end of the input data (either an input dataset in a SET/MERGE/UPDATE statement or an input file in an INPUT statement).

 

Try it yourself to see:

data test;
  put 'BEFORE the SET statement ' _n_=  age= ;
  set sashelp.class ;
  put 'AFTER the SET statement ' _n_= age=;
run;
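
If you would rather not scroll through all of SASHELP.CLASS in the log, here is a smaller sketch of the same experiment. The dataset WORK.THREE and its values are made up for illustration; they do not come from the prep guide.

```sas
/* Hypothetical three-observation input, for illustration only */
data work.three;
  input age;
  datalines;
11
12
13
;
run;

/* _NULL_ avoids creating an output dataset; we only want the log */
data _null_;
  put 'BEFORE the SET statement ' _n_= age=;
  set work.three;   /* the step stops here once there is nothing left to read */
  put 'AFTER the SET statement ' _n_= age=;
run;
```

With three input observations, the BEFORE line should appear four times in the log and the AFTER line only three times. On the fourth iteration the SET statement finds no more observations, so the step stops right there and the second PUT never runs. Note also that age is missing on the first BEFORE line and afterwards still holds the value from the previous observation, because variables read by SET are retained across iterations.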

 

But the fact that the two datasets have variables in common is another important thing to learn.  For a variable that exists in both datasets, the value available in the data step is the one read most recently.  So if you change the order of the two SET statements, the output will have the same number of observations, but the values of the shared variables in each observation might be different, because the data was read in a different order.
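
To make that concrete, here is a minimal sketch using two made-up datasets. WORK.A, WORK.B, and their values are assumptions for illustration; they are not the CERT.PATIENTS and CERT.MEASURE data from the question.

```sas
/* Hypothetical inputs: A has three observations, B has two; both have ID */
data work.a;
  input id $ x;
  datalines;
A1 10
A2 20
A3 30
;
run;

data work.b;
  input id $ y;
  datalines;
B1 100
B2 200
;
run;

/* B is read last, so B's ID overwrites A's ID in each observation */
data work.a_then_b;
  set work.a;
  set work.b;
run;

/* Same inputs, opposite order: now A's ID is the last value read */
data work.b_then_a;
  set work.b;
  set work.a;
run;

proc print data=work.a_then_b; run;
proc print data=work.b_then_a; run;
```

Both outputs should have two observations, because the step stops as soon as either SET statement runs out of input. In WORK.A_THEN_B the ID values come from B (B1, B2); in WORK.B_THEN_A they come from A (A1, A2). X and Y are not shared between the inputs, so their values are the same in both outputs.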


