Solved: Re: Combine dataset question?

ybz12003 · Posted 02-18-2022 11:34 AM

Hello,

I am trying to find a way to combine two datasets efficiently.

Dataset A contains around 1000 variables (Numeric and Characters) and 20000 obs.
Dataset B contains around 50 variables (Numeric and Characters) and 5000 obs.

Both A and B have the same 5000 IDs. B has 49 columns with the same names as A, and one extra 'Record_updated' column. In addition, B has all the updated records in those 49 columns. I would like to combine A and B with a result, that contains 1001 columns and 20000 obs, including 49 updated B dataset columns, and a new 'Record_updated' column. Please advise how to approach it, thanks.

ballardw · Posted 02-18-2022 12:47 PM

Here is an example of a data step update with data set you should have that changes the values of some of the variables and adds an "updated" column, which you didn't bother to describe very well.

data work.base;
   set sashelp.class;
run;

data work.update;
   input name :$8. age height updated;
datalines;
Alfred   16  73 1
Jane     14  60.5 1
;

data new;
   update work.base
          work.update
   ;
   by name;
run;

Limitations on this approach:

The base data set can have exactly 1 record with the by variable value(s).

Both the base data set, work.class in this case, and the update set must be sorted in the order of the by variables as usual for the by to work.

Any new variables in the update set are added to the result. By default, missing values in the Update set do not replace non-missing values in the base data set. If you need that behavior use the option UPDATEMODE=NOMISSINGCHECK on the update statement.

If you data does not match the description above then you need to provide a better description of the data and possibly the expected result.

View solution in original post

ballardw · Posted 02-18-2022 11:39 AM

Data step or sql update

PaigeMiller · Posted 02-18-2022 11:41 AM

Both A and B have the same 5000 IDs.

How is this combine supposed to work? It sounds like you want a many-to-one combine based upon ID, but you haven't actually stated that. Its an assumption and I'd like to know if it is a correct assumption.

--
Paige Miller

ybz12003 · Posted 02-18-2022 12:06 PM

The A's 20000 IDs have B's 5000 IDs.

PaigeMiller · Posted 02-18-2022 12:49 PM

@ybz12003 wrote:
The A's 20000 IDs have B's 5000 IDs.

You have not answered my question. How is this "combine" supposed to happen? Is it many-to-one, or something else? Be specific. Provide details.

--
Paige Miller

ybz12003 · Posted 02-18-2022 12:12 PM

I expect the final combined dataset C = 20000 IDs (A) = 5000 IDs (B) + 15000 extra IDs

ybz12003 · Posted 02-18-2022 12:16 PM

Also,
the final combined dataset C = 1001 columns = 1950 old columns (A) + 49 updated infor. columns (B) + 1 'Record_updated' column (B)

ballardw · Posted 02-18-2022 12:47 PM

Here is an example of a data step update with data set you should have that changes the values of some of the variables and adds an "updated" column, which you didn't bother to describe very well.

data work.base;
   set sashelp.class;
run;

data work.update;
   input name :$8. age height updated;
datalines;
Alfred   16  73 1
Jane     14  60.5 1
;

data new;
   update work.base
          work.update
   ;
   by name;
run;

Limitations on this approach:

The base data set can have exactly 1 record with the by variable value(s).

Both the base data set, work.class in this case, and the update set must be sorted in the order of the by variables as usual for the by to work.

Any new variables in the update set are added to the result. By default, missing values in the Update set do not replace non-missing values in the base data set. If you need that behavior use the option UPDATEMODE=NOMISSINGCHECK on the update statement.

If you data does not match the description above then you need to provide a better description of the data and possibly the expected result.

ybz12003 · Posted 02-23-2022 12:47 PM

I didn't use your code but would like to try the 'Update' statement to compare mine to see if there is any difference. Although my codes had ten more steps, I feel safer getting a dataset I want.

Catch up on SAS Innovate 2026

SAS Training: Just a Click Away