BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
jen123
Fluorite | Level 6

Here is my code:

Data file_results;

Merge file_a file_b;

by variable_x;

run;

Issue:  There are 134 columns (variables) in file_a and file_b.  When I merge these two files together, the resultant dataset has 135 variables.  There are 2 varaibles added at the end....and neither are in file_a and file_b.

Any advice is appreciated!
Thanks,
Jenne

1 ACCEPTED SOLUTION

Accepted Solutions
art297
Opal | Level 21

: Just remove The $ sign.  e.g.: managed_function_L4=put(a_managed_function_L4,5.);

View solution in original post

7 REPLIES 7
art297
Opal | Level 21

I think you will have to post a copy of your log in order for anyone to help.

jen123
Fluorite | Level 6

Here is the log...thanks in advance for any help.

***************************************

1                                                          The SAS System                          21:55 Wednesday, January 16, 2013

1          ;*';*";*/;quit;run;

2          OPTIONS PAGENO=MIN;

3          %LET _CLIENTTASKLABEL='combined_gdw';

4          %LET _CLIENTPROJECTPATH='/ptr/ops/proj/jle/monthly_headcount_walk/monthly_headcount_walk.egp';

5          %LET _CLIENTPROJECTNAME='monthly_headcount_walk.egp';

6          %LET _SASPROGRAMFILE='/ptr/ops/proj/jle/monthly_headcount_walk/combined_gdw.sas';

7         

8          ODS _ALL_ CLOSE;

9          OPTIONS DEV=ACTIVEX;

NOTE: Procedures may not support all options or statements for all devices. For details, see the documentation for each procedure.

10         GOPTIONS XPIXELS=0 YPIXELS=0;

11         FILENAME EGSR TEMP;

12         ODS tagsets.sasreport12(ID=EGSR) FILE=EGSR STYLE=Analysis

12       ! STYLESHEET=(URL="file:///C:/Program%20Files/SAS/EnterpriseGuide/4.3/Styles/Analysis.css") NOGTITLE NOGFOOTNOTE

12       ! GPATH=&sasworklocation ENCODING=UTF8 options(rolap="on");

NOTE: Writing TAGSETS.SASREPORT12(EGSR) Body file: EGSR

13        

14         GOPTIONS ACCESSIBLE;

15         libname jen1 "/ptr/ops/proj/jle/monthly_headcount_walk/gdw_monthend_files/current_month";

NOTE: Libref JEN1 was successfully assigned as follows:

      Engine:        V9

      Physical Name: /ptr/ops/proj/jle/monthly_headcount_walk/gdw_monthend_files/current_month

16         libname jen2 "/ptr/ops/proj/jle/monthly_headcount_walk/gdw_monthend_files/prior_month";

NOTE: Libref JEN2 was successfully assigned as follows:

      Engine:        V9

      Physical Name: /ptr/ops/proj/jle/monthly_headcount_walk/gdw_monthend_files/prior_month

17        

18         /* CURRENT MONTH - Convert variables from numeric to character */

19        

20         data non_employee_headcount_a;

21              set jen1.non_employee_headcount_current1 (rename=(global_function=a_global_function));

22              global_function=put(a_global_function,$4.);

WARNING: Variable a_global_function has already been defined as numeric.

23              drop a_global_function;

24              run;

NOTE: There were 42844 observations read from the data set JEN1.NON_EMPLOYEE_HEADCOUNT_CURRENT1.

NOTE: The data set WORK.NON_EMPLOYEE_HEADCOUNT_A has 42844 observations and 134 variables.

NOTE: Compressing data set WORK.NON_EMPLOYEE_HEADCOUNT_A decreased size by 57.33 percent.

      Compressed is 762 pages; un-compressed would require 1786 pages.

NOTE: DATA statement used (Total process time):

      real time           4.86 seconds

      cpu time            4.25 seconds

     

25        

26         data jen1.non_employee_headcount_b;

27              set non_employee_headcount_a (rename=(managed_function_L4=a_managed_function_L4));

28              managed_function_L4=put(a_managed_function_L4,$5.);

WARNING: Variable a_managed_function_L4 has already been defined as numeric.

29              drop a_managed_function_L4;

30              run;

NOTE: There were 42844 observations read from the data set WORK.NON_EMPLOYEE_HEADCOUNT_A.

NOTE: The data set JEN1.NON_EMPLOYEE_HEADCOUNT_B has 42844 observations and 134 variables.

NOTE: Compressing data set JEN1.NON_EMPLOYEE_HEADCOUNT_B decreased size by 57.39 percent.

      Compressed is 761 pages; un-compressed would require 1786 pages.

NOTE: DATA statement used (Total process time):

2                                                          The SAS System                          21:55 Wednesday, January 16, 2013

      real time           3.83 seconds

      cpu time            3.51 seconds

     

31        

32         /* CURRENT MONTH - Convert variables from character to numeric */

33        

34         data jen1.direct_staff_headcount_b;

35              set jen1.direct_staff_headcount_current1 (rename=(business_contact_id=a_business_contact_id));

36              business_contact_id=input(a_business_contact_id,BEST12.);

37              drop a_business_contact_id;

38              run;

NOTE: There were 62934 observations read from the data set JEN1.DIRECT_STAFF_HEADCOUNT_CURRENT1.

NOTE: The data set JEN1.DIRECT_STAFF_HEADCOUNT_B has 62934 observations and 133 variables.

NOTE: Compressing data set JEN1.DIRECT_STAFF_HEADCOUNT_B decreased size by 54.69 percent.

      Compressed is 1097 pages; un-compressed would require 2421 pages.

NOTE: DATA statement used (Total process time):

      real time           5.92 seconds

      cpu time            5.59 seconds

     

39        

40        

41         /* CURRENT MONTH - Merge/Combine direct_staff_headcount and non_employee_headcount */

42        

43         Data jen1.combined_gdw_current_a;

44        

45         MERGE jen1.direct_staff_headcount_b

46         jen1.non_employee_headcount_b;

47         by GEID;

48        

49         run;

NOTE: There were 62934 observations read from the data set JEN1.DIRECT_STAFF_HEADCOUNT_B.

NOTE: There were 42844 observations read from the data set JEN1.NON_EMPLOYEE_HEADCOUNT_B.

NOTE: The data set JEN1.COMBINED_GDW_CURRENT_A has 105778 observations and 135 variables.

NOTE: Compressing data set JEN1.COMBINED_GDW_CURRENT_A decreased size by 55.35 percent.

      Compressed is 1817 pages; un-compressed would require 4069 pages.

NOTE: DATA statement used (Total process time):

      real time           8.95 seconds

      cpu time            8.76 seconds

     

50        

51         GOPTIONS NOACCESSIBLE;

52         %LET _CLIENTTASKLABEL=;

53         %LET _CLIENTPROJECTPATH=;

54         %LET _CLIENTPROJECTNAME=;

55         %LET _SASPROGRAMFILE=;

56        

57         ;*';*";*/;quit;run;

58         ODS _ALL_ CLOSE;

59        

60        

61         QUIT; RUN;

62        

                                                     

art297
Opal | Level 21

: You used a merge, but none of the GEID values matched across the two files.  As such, the resulting file contained one record for each record that existed in the two files you tried to merge.

It would have been more appropriate to combine them with a set statement.

One of the extra fields had to come from WORK.NON_EMPLOYEE_HEADCOUNT_A, as it had 134 variables.  I would think that there is one other variable in JEN1.DIRECT_STAFF_HEADCOUNT_B that doesn't have a match in the other file, thus accounting for the 135th variable that was added to the file.

Also, in your two early datasteps, you use put statements to convert a numeric variable, but use an character format.  You will eliminate the warning if you change those to represent numeric formats (e.g., 3.).

jen123
Fluorite | Level 6

I had use SET statement first and got the same result.  So I thought I used an incorrect stmt so I tried MERGE.  If I have to combine two files where the rows are unique, then is it best to use SET instead of MERGE?

RE:  "Also, in your two early datasteps, you use put statements to convert a numeric variable, but use an character format.  You will eliminate the warning if you change those to represent numeric formats (e.g., 3.)."   The original variable is NUMERIC and I want to convert it to CHARACTER, that is why I used a PUT stmt and character format.  Did I do this incorrectly?  What should I use if I want to convert the NUMERIC variable to CHARACTER?

Thanks for the help Arther!

art297
Opal | Level 21

: Just remove The $ sign.  e.g.: managed_function_L4=put(a_managed_function_L4,5.);

jen123
Fluorite | Level 6

Thank you for the help Arthur!

art297
Opal | Level 21

: Also, you could use something like the following to find out which variables were only in one of the files:

proc sql;

  create table differences as

    select name, count(*) as count

      from dictionary.columns

        where libname="JEN" and

              (memname="DIRECT_STAFF_HEADCOUNT_B" or

               memname="NON_EMPLOYEE_HEADCOUNT_B")

          group by name

            having count(*) eq 1

  ;

quit;

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

SAS Enterprise Guide vs. SAS Studio

What’s the difference between SAS Enterprise Guide and SAS Studio? How are they similar? Just ask SAS’ Danny Modlin.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 7 replies
  • 1161 views
  • 0 likes
  • 2 in conversation