BookmarkSubscribeRSS Feed
kas114
Calcite | Level 5

Hi all! I'm new to SAS (using UE) and am trying to take two datasets, merge them, and then create a new column of calculations. I have one data set (dist) that has 1 row/observation of data (the mean, percentile, and cutprobs) and another data set (brr_runs) that holds 16 rows/observations with nearly identical variables (mean, percentiles, and cutprobs but with "b" in the title to separate them from the initial run). I'd like to merge them and then calculate the squared difference between each estimate in the brr_runs set and the baseline parameters (located in dist). This is what I'm attempting:

 

data distall;
merge dist brr_runs;
array varo (*) mean_mc_t tpercentile1-tpercentile99 cutprob1-cutprob4;
array bvar (*) bmean_mc_t btpercentile1-btpercentile99 bcutprob1-bcutprob4;
array dsqr (*) dbmean_mc_t dbtpercentile1-dbtpercentile99 dbcutprob1-dbcutprob4;
do i=1 to dim(bvar);
dsqr[i]=(bvar[i]-varo[i])**2;
end;
run;

 

However, when I run this it creates a data set "distall" that has mostly missing values. I've tried manipulating it, but have run into the same issue. It also has created "distall" with 17 rows of identical data (as in, it just took one observation from the original datasets and recreated it 16 times).

 

Can anyone give me some guidance on this? Thanks!

2 REPLIES 2
ballardw
Super User

Example data similar to both starting sets and of the desired output for those sets helps provide better examples.

 

MERGE has several behaviors depending on whether you expect to match rows based on a common variable (or group of variables). Is that the case in your data? That would require the use of a BY statement with the variable to match. The data would also have to be sorted prior to the merge by the same variables.

 

Without BY merge tells SAS to put the first row of one set with the first of the other, 2nd to 2nd rows, 3rd to 3rd. If one set has fewer rows then there is no contributed data and hence missing.

 

MERGE seldom does what you want in the case of a one-to-many or many-to-many combination.

If you want to match ALL rows in set 1 with ALL rows in set 2 then what you are seeking is join (another word for combine but with a variety of types) and would be accomplished with Proc SQL. If that is what you want you might consider:

proc sql;
   create table distall as
   select dist.*, brr_runs.*
   from dist, brr_runs;
quit;
/* since arrays do not work in SQL it may be easier to
add the manipulation as*/
data distall;
   set distall;
   array varo (*) mean_mc_t tpercentile1-tpercentile99 cutprob1-cutprob4;
   array bvar (*) bmean_mc_t btpercentile1-btpercentile99 bcutprob1-bcutprob4;
   array dsqr (*) dbmean_mc_t dbtpercentile1-dbtpercentile99 dbcutprob1-dbcutprob4;
   do i=1 to dim(bvar);
      dsqr[i]=(bvar[i]-varo[i])**2;
   end;
run;

If that doesn't do what you need, see my comment about example data.

 

art297
Opal | Level 21

You can do it all in one data step:

data distall;
  array varo (*) mean_mc_t tpercentile1-tpercentile99 cutprob1-cutprob4;
  array bvar (*) bmean_mc_t btpercentile1-btpercentile99 bcutprob1-bcutprob4;
  array dsqr (*) dbmean_mc_t dbtpercentile1-dbtpercentile99 dbcutprob1-dbcutprob4;
  set brr_runs;
  if _n_ eq 1 then set dist;
  do i=1 to dim(bvar);
    dsqr[i]=(bvar[i]-varo[i])**2;
  end;
run;

Art, CEO, AnalystFinder.com

sas-innovate-wordmark-2025-midnight.png

Register Today!

Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.


Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 2 replies
  • 682 views
  • 0 likes
  • 3 in conversation