Hello, @Tom wrote the following code so that I could see how my MERGE statement had more than one data set with repeats of BY values: *MERGE statement has more than one data set with repeats of BY values;
proc sort data=paper.CSRP_annual_returns
out=paper.CSRP_annual_returns2
nodupkey;
by ticker;
run;
*Merge CEO data and firm data;
DATA paper.ceo_firm ;
length ticker $5 source $10;
MERGE
paper.CSRP_annual_returns2 (in=in1)
paper.compustat_execucomp4 (in=in2)
;
BY ticker;
if in1 and in2 then source='BOTH';
else if in1 then source='ANNUAL';
else source='EXECCOMP';
RUN; The output includes a variable that tells me which of the two data sets the observation is coming from. It also says which are coming from both. I would like to keep only those rows which have "BOTH" as a source. What is the best way to do this? Here is an example of what the current output looks like: row ticker source . . . . . . . . . . 408 ACO BOTH 2010 9.0442006703 Lawrence E. Washow 1 0 0 0 7456 AMCOL INTERNATIONAL CORP Chief Executive Officer, President, Director and Member of Executive Committee CEO 670 0 0 413.76 136 666.628 3163.88 5050.268 670 5050.268 5547.336 1498.699 4190.768 211.988 0.676 409 ACO BOTH 2011 9.0442006703 Ryan F. McKendrick 1 0 0 0 22282 AMCOL INTERNATIONAL CORP Chief Executive Officer, President and Director CEO 500 0 613.2 342.6 601.786 363.203 35.644 2456.433 500 1843.233 2609.998 1040.818 3000 201.949 0.632 410 ACO BOTH 2012 9.0442006703 Ryan F. McKendrick 1 0 0 0 22282 AMCOL INTERNATIONAL CORP Chief Executive Officer, President and Director CEO 525 0 0 370.8 496 653.54 36.011 2081.351 525 2081.351 1909.228 1090.818 3150 90.357 0.279 411 ACO BOTH 2013 9.0442006703 Ryan F. McKendrick 1 0 0 0 22282 AMCOL INTERNATIONAL CORP Chief Executive Officer, President, Director and Member of Executive Committee CEO 700 517 833.472 508.375 0 420.204 37.615 3016.666 1217 3016.666 1877.932 1436.183 4200 75.084 0.231 412 ACOM ANNUAL 2010 88.344227811 . . . . . . . . . . 413 ACOR BOTH 2010 7.2911181633 Ron Cohen, M.D. 1 0 0 0 46385 ACORDA THERAPEUTICS INC Founder, Chairman, Chief Executive Officer and President CEO 589.75 592.86 3817.344 4549.547 0 0 7.35 9556.851 1182.61 9557.298 5485.461 9276.502 9276.502 1149.837 2.941 414 ACOR BOTH 2011 7.2911181633 Ron Cohen, M.D. 1 0 0 0 46385 ACORDA THERAPEUTICS INC Founder, Chief Executive Officer, President and Director CEO 635 0 965.125 1605.196 557.92 0 7.35 3770.591 635 3770.583 5966.926 9085.989 9911.129 1164.359 2.904 415 ACOR BOTH 2012 7.2911181633 Ron Cohen, M.D. 1 0 0 0 46385 ACORDA THERAPEUTICS INC Founder, Chief Executive Officer, President and Director CEO 690 0 988.125 1590.684 571.2 0 7.35 3847.359 690 3847.425 2598.872 8466.185 9028.265 1309.41 3.251 416 ACOR BOTH 2013 7.2911181633 Ron Cohen, M.D. 1 0 0 0 46385 ACORDA THERAPEUTICS INC Founder, Chief Executive Officer, President and Director CEO 720.833 0 685.35 1976.389 527.8 0 7.65 3918.022 720.833 3918.022 12601.719 7101.215 7681.215 1300.266 3.115 417 ACOR BOTH 2014 7.2911181633 Ron Cohen, M.D. 1 0 0 0 46385 ACORDA THERAPEUTICS INC Founder, Chief Executive Officer, President and Director CEO 745.833 0 700.413 1939.685 651.6 0 7.65 4045.181 745.833 4045.181 3887.288 6710.938 7310.938 1389.888 3.244 418 ACOR BOTH 2015 7.2911181633 Ron Cohen, M.D. 1 0 0 0 46385 ACORDA THERAPEUTICS INC Founder, Chief Executive Officer, President and Director CEO 768.75 0 607.58 1510.794 553.728 0 7.95 3448.802 768.75 3448.802 4685.349 12235.274 12853.274 1236.761 2.68 419 ACOR BOTH 2016 7.2911181633 Ron Cohen, M.D. 1 0 0 0 46385 ACORDA THERAPEUTICS INC Founder, Chief Executive Officer, President and Director CEO 772.5 0 1046.501 1043.884 562.38 0 7.95 3433.215 772.5 3433.215 2660.472 3136.16 3818.432 1422.28 3.065 420 ACPW ANNUAL 2010 94.858409388 . . . . . . . . . . 421 ACRE ANNUAL 2012 0.7119757952 And here is the log: 1 OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK; 70 71 *MERGE statement has more than one data set with repeats of BY values; 72 proc sort data=paper.CSRP_annual_returns 73 out=paper.CSRP_annual_returns2 74 nodupkey; 75 by ticker; 76 run; NOTE: There were 53176 observations read from the data set PAPER.CSRP_ANNUAL_RETURNS. NOTE: 42661 observations with duplicate key values were deleted. NOTE: The data set PAPER.CSRP_ANNUAL_RETURNS2 has 10515 observations and 3 variables. NOTE: PROCEDURE SORT used (Total process time): real time 0.03 seconds user cpu time 0.01 seconds system cpu time 0.01 seconds memory 6126.06k OS Memory 36452.00k Timestamp 04/01/2018 07:47:46 PM Step Count 61 Switch Count 2 Page Faults 0 Page Reclaims 876 Page Swaps 0 Voluntary Context Switches 46 Involuntary Context Switches 0 Block Input Operations 0 Block Output Operations 528 77 78 *Merge CEO data and firm data; 79 DATA paper.ceo_firm ; 80 length ticker $5 source $10; 81 MERGE 82 paper.CSRP_annual_returns2 (in=in1) 83 paper.compustat_execucomp4 (in=in2) 84 ; 85 BY ticker; 86 if in1 and in2 then source='BOTH'; 87 else if in1 then source='ANNUAL'; 88 else source='EXECCOMP'; 89 RUN; NOTE: There were 10515 observations read from the data set PAPER.CSRP_ANNUAL_RETURNS2. NOTE: There were 13346 observations read from the data set PAPER.COMPUSTAT_EXECUCOMP4. NOTE: The data set PAPER.CEO_FIRM has 21807 observations and 109 variables. NOTE: DATA statement used (Total process time): real time 0.17 seconds user cpu time 0.02 seconds system cpu time 0.05 seconds memory 4222.00k OS Memory 35504.00k Timestamp 04/01/2018 07:47:47 PM Step Count 62 Switch Count 2 Page Faults 0 Page Reclaims 738 Page Swaps 0 Voluntary Context Switches 1472 Involuntary Context Switches 1 Block Input Operations 544 Block Output Operations 88840 90 91 OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK; 103 Thanks for the help!
... View more