Help using Base SAS procedures

Value not in original file appears in merged file. How?

Reply
Contributor
Posts: 32

Value not in original file appears in merged file. How?

I am perplexed to find a 'value' of a variable by ID (Lab_) in the merged file that does not exist in the original file.

In the merged file abc below, I see the value for a variable by a specific Lab_ that was not seen in the file4.

More precisely, the Lab_ 123213 did not show any value in file4 but now showing value in the merged file abc. I am wondering how to fix it.

Any suggetions would be highy useful. Thanks in advance.

data abc;

merge file4 (in=x) file1 file 2;

by Lab_;

in x;

run;

Sanjeeb

Trusted Advisor
Posts: 1,228

Re: Value not in original file appears in merged file. How?

Hi,

Not sure what is the main problem but see below what I found in your syntax does not look correct.

in x;  it should be  if x;

secondly file 2 is file2 right?

Thanks,

Naeem

Contributor
Posts: 32

Re: Value not in original file appears in merged file. How?

Thanks. Yes those were typos. When I wrote the code in this post to illustrate the example I miswrote those --yes, I did if x; and yes, file2.

But the main problem is that the merged file is showing a value by 'Lab_' which is not found in the original file.

I was wondering what could be the possibility for a value to appear in a merged file when by a specific Lab_ when such is not present in the original file. Kindly suggest. Thanks in advance.

Super User
Posts: 19,783

Re: Value not in original file appears in merged file. How?

Please post the exact code used.

Contributor
Posts: 32

Re: Value not in original file appears in merged file. How?

Hi Reeza,

Please find the codes I used as attached. My issue is that the yellow highlighted Lab_ 1556686 (in the attached code) doe not have a value for Analyte = "Total VOCs as Toluene" in the original file exclu4, but is showing a value in "data Toltole" for Analyte ="Total VOCs as Toluene". I must have done some error but do not know what. Thanks in advance for your help.

Sanjeeb

Attachment
Super User
Posts: 19,783

Re: Value not in original file appears in merged file. How?

This is expected SAS behaviour, assuming that one of your other data sets also has the variable Analyte, it will overwrite the variable in the first file. 

If you're merging files make sure the variables in each file, besides the merge variables, are unique.

Super User
Posts: 7,771

Re: Value not in original file appears in merged file. How?

I think that SAS should at least give a note when this happens.

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
Super User
Posts: 19,783

Re: Value not in original file appears in merged file. How?

Posted in reply to KurtBremser

It would be nice. Honestly I use SQL for all my merges.

Super User
Posts: 7,771

Re: Value not in original file appears in merged file. How?

Alas, the performance of SAS's SQL when doing large table joins has proven so inferior to proc sort/data step sequences (epsecially when several are running concurrently) that we only use SQL where absolutely required (eg cartesian join).

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
Contributor
Posts: 32

Re: Value not in original file appears in merged file. How?

Posted in reply to KurtBremser

Yes, I too was expecting a note or a suggestions from SAS and was wondering why it did not provide.

Contributor
Posts: 32

Re: Value not in original file appears in merged file. How?

Thank you for this.

PROC Star
Posts: 7,471

Re: Value not in original file appears in merged file. How?

You didn't ask to include it.  You stipulated "in x;", rather than what you wanted, namely:

if x;

Trusted Advisor
Posts: 3,212

Re: Value not in original file appears in merged file. How?

I see 1556686 in the datasets exclutwo and in exclu4. 
There are differences with SQL and SAS to know (SQL is designed for a DBMS goal administration.) those are:

- The missing concept of statistics is missing in SQL. it is present in SAS. Notice you can many values in SAS being classified as missing.

- The join/merge is different:

   + SAS dataset join/merge will propagate values in the coalesce way updating missings automatically in the resulting datasets combining all information

   + When an ID is having missing values SAS will not drop those handle them is being something to do with, but SQL is dropping them.

   + SQL is possible generating a Cartesian-product with SAS join merge that is more difficult.
      When you the message that Cartesian product has ben used... Get on your alarm bells when  you are having performance/big data relations


Kurt mentioned already the big data  performance issue. With SQL data is thought not to be ordered. It is possible to get those results random.

You could parallelize some processing that is done with threading. Most people have problems to think like this.
With SAS you are thinking the data is ordered and you process them ordered (the automatic OBS). Most people are thinking like this.       

Your result is exactly what you have code as should being to happen.

The in selection is only a selection should come from this source, all other rules as described are applicable (the missings being replaces out of the other dataset)

---->-- ja karman --<-----
Contributor
Posts: 32

Re: Value not in original file appears in merged file. How?

Thank you very much for taking a detail look. I will try this and see.

Ask a Question
Discussion stats
  • 13 replies
  • 434 views
  • 0 likes
  • 6 in conversation