Unable to update more than one fields using PROC SQL:

Reply
Contributor
Posts: 20

Unable to update more than one fields using PROC SQL:

Hi,

I am working in a DYL280 to SAS conversion project. I am facing few issues to complete this task. Please help me to overcome this. Thanks much in advance.

Requirement is:

First lookup to select NAME and loc:

Need to do a lookup between file1 and file2, whereas name was the key field. IF found between two files then update file1 by moving TLOC(in file2) TO ZIPCODE(in file1).

If not found then list 'NAME NOT FOUND' NAME in sas log(Note: value in the name field for that unmatched record should be written in the saslog as well) and reject that record .

And drop all unmatched records and update file1 with only matched records.

DYL280 code:

****SELECT NAME****

BINSEARCH FILE2 NAME

IFFOUND EQ 'Y' NEXT

ELSE LIST 'NAME NOT FOUND' NAME

GOTO NEXTRECORD ENDIF

MOVE TLOC TO ZIPCODE

Second lookup between file1 and file3, where as DEPT was the key field. IF found between two files AND TAREA1 NOT EQUAL TO 'AYA' then update file1 by moving TAREA2(in file2) TO TAREA(in file1) and AREADESC(in file2) TO LDESC(in file1).

If not found and TAREA1 EQ 'AYA'  then list 'DEPT NOT FOUND IN FILE3' DEPT in sas log.And drop all unmatched records and update file1 with only matched records.

****REJECT ANY DEPT CCS, GET CC DECS****

BINSEARCH FILE3 DEPT

IFFOUND EQ 'Y' NEXT

ELSE LIST 'DEPT NOT FOUND IN FILE3' DEPT

GOTO NEXTRECORD ENDIF

IF TAREA1 EQ 'AYA' GOTO READMAST ENDIF

MOVE TAREA2 TO TAREA

MOVE AREADESC TO LDESC

SAS code we tried for first logic: But unable to drop unmatched records and update more than one variable at a time. And unable to list down the nonmatched field value in SAS log.

proc sql;

   update file1 as A

   set zipcode = (select tloc from file2 as B

                       where b.Tname = A.Name)

  ;

quit;

Input file layout and sample records:

data file1;

input

@01 Name               $12.

@14 zipcode             05.

@20 Area1              $03.

@23 dept                04.

@28 ldper               02.

@30 ldesc               30.;

datalines;

JohnAbraham  80000 AY 1900 09

Madisson     75200 AY 1825 01

Murphy       95210 AY 1952 06

Donaldmathew 65214 NY 6854 18

Eric         85102 NA 7889 24

;

run;

data File2;

input

@01 TName           $12.

@13 Tloc            05.

@18 tarea1         $03.

@21 tdept1          04.;

datalines;

Mathews55589 LA 1980 10

Jonny  75200 LA 1825 01

David  87510 HT 1952 06

Edrick 65214 LO 6854 18

Monaay 74502 NA 7889 24

;

run;

data File3;

input

@01 Tloc1            05.

@07 tarea2         $03.

@10 tdept2          04.

@14 areadesc       30.;

datalines;

55589 LA 1980 10 SOUTHEREN CALIFORNIA

75200 LA 1825 01

87510 HT 1952 06 EASTERAN BAY

65214 LO 6854 18 NEWYORK

74502 NA 7889 24 WASHINGTON.;

RUN;

Occasional Contributor
Posts: 13

Re: Unable to update more than one fields using PROC SQL:

Posted in reply to Santhoshcsc

Here is one idea rather than the proc sql subquery update method you have above:

proc sql;

create table file_out as

select a.Name,

          case when b.tloc ^= . then b.tloc else a.zipcode end as zipcode,

          case when c.tedpt2 ^= . and b.tarea1 ^= 'AYA' then c.tarea2 else a.area1 end as area1,

          a.dept,

          a.ldper,

          case when c.tdept2 ^= . and b.tarea1 ^= 'AYA' then c.areadesc else a.ldesc end as ldesc,

          case when b.tloc = . then 1 else 0 end as miss_ind_21,

          case when c.tdept2 = . then 1 else 0 end as miss_ind_32

from file1 a left join file2 b on a.name = b.tname

                   left join file3 c on a.dept = c.tdept2;

data final_file(drop=op miss_ind_21 miss_ind_32);

     set file_out;

     length op $70;

     if miss_ind_21 = 1 then do;

           op = 'NAME NOT FOUND: '||name;

           put op;

           delete;

     end;

     if miss_ind_32 = 1 then do;

          op = 'DEPT NOT FOUND IN FILE3 '||dept;

          put op;

          delete;

     end;

run;

Respected Advisor
Posts: 3,156

Re: Unable to update more than one fields using PROC SQL:

Posted in reply to Santhoshcsc

This is a possible solution for your first lookup, the second one is similar and you are encouraged to give a shot yourself.

proc sort data=file1;

by name;

run;

proc sort data=file2;

by name;

run;

data file1;

merge file1(in=f1) file2(keep=name Tloc rename=(Tloc=zipcode) in=f2);

by name;

if not f2 then put 'NAME NOT FOUND: ' name;

if f1 and f2;

run;

Good Luck,

Haikuo

Contributor
Posts: 20

Re: Unable to update more than one fields using PROC SQL:

Hi,

I forgot to tell this. Records should not be sorted, since that file will be used by other projects as well. That's why I didn't tried merge concept.

Thank you.

Santhosh.

Respected Advisor
Posts: 3,156

Re: Unable to update more than one fields using PROC SQL:

Posted in reply to Santhoshcsc

If the original order needs to be kept, the SQL is the LAST thing you want to use. SQL only has two types of output: 1. Sorted (defined by 'order by'). 2.God knows what. I will lay out some idea, it may need some reading/learning, but it can be easily implemented. Take your first lookup for instance:

1. Use file1 to start a data step

2. upload needed components from file2 to a Hash table, use 'name' as key.

3. When lookup failed, print error log, else output to a new file1.

You certainly can upload file3 to another Hash table and finish both lookup in one pass.

The hash method will require your computer RAM is large enough to host either file2 or file3 or both if one pass. Although lookup can be done between many to many, it may not give your desired results, so file2/3 usually need to be unique on 'name'.

Haikuo

Contributor
Posts: 20

Re: Unable to update more than one fields using PROC SQL:

Posted in reply to Santhoshcsc

Hi Sonywell,

Thank you so much. I gave a try with your logic, but it didn't went well.Getting some unmatched records as well in the output.

Requirement is we need to do a lookup between two files using a key field, if found then move file2 value to file 1.(Here two fields values in file1 should be updated with file2 field if key matches) Else move 'DEPT NOT FOUND IN FILE3 '||dept in sas log.

Main thing is we should not sort the files, so as you mentioned we can use PROC SQL only for this task. But unfortunately its not working fine. Please guide me.

Super User
Super User
Posts: 7,039

Re: Unable to update more than one fields using PROC SQL:

Posted in reply to Santhoshcsc

It sounds like you have a master file (FILE1) and two transactions files (FILE2 and FILE3).

You should merge FILE1 with FILE2 using NAME to see which records in FILE2 are found in FILE1.  You can then use the matched records to update and the unmatched to report.

data update1 ;

  merge file2 (in=in1) file2 (keep=name in=in2);

  by name ;

if in1 ;

if not in2 then put name= :$quote. 'not found in FILE1.' ;

else output;

run;

proc sql noprint ;

  update FILE1 set ZIPCODE = (select TLOC from update1 where FILE1.name = update1.name)

  where name in (select name from update1)

  ;

quit;

Contributor
Posts: 20

Re: Unable to update more than one fields using PROC SQL:

Yes I can use merge concept, but in the requirement i should not sort the masterf file. Can we merge two files without sorting it?

Trusted Advisor
Posts: 3,211

Re: Unable to update more than one fields using PROC SQL:

Posted in reply to Santhoshcsc

I do not get the concepts of DYL-280. It looks to the question on file-record processing.
The @ usage in the samples is looking for me to be  on mainframe fixed records processing. I that correct?

  The remark as the files are used for other projects is pointing in that direction.is


In that case use data-step views to describe the records so you can process them afterwards.    This will avoid as much of the IO as possible.
A direct search with proc SQL on a datastep view can behave rather well.

As the SAS approach is normally updating complete tables I want to know about that binary search with DYL-280 and updating one record.
What are all used techniques used in for us understandable words? 

---->-- ja karman --<-----
Contributor
Posts: 20

Re: Unable to update more than one fields using PROC SQL:

Posted in reply to Santhoshcsc

Hi Jaap,

Bin search means in DYL280 means as follows, Need to do lookup between two flat files. If key is found then need to move field2 value to field1 value in File1, else need towrite hardcoded value into SAS log and process next records.

For example:

file1:

NAME

AREA

ZIPCODE

File2:

Name2

ARE2

ZIPCODE2.

Key field is NAME;

IF NAME field was found between two files then update AREA field in file1 with value in AREA2 field, else Write :Value not found " into SASLOG. and goto next record.

Note: Records in file2 should not be appened to file1 after lookup processing.

Please let me know, if I didn't make it clear.

Thank you

Trusted Advisor
Posts: 3,211

Re: Unable to update more than one fields using PROC SQL:

Posted in reply to Santhoshcsc

Ok reading and processing flat files not databases or sas-datasets. That is an important pre-req.

Processing a flat file (searching) is possible when you define a view to that SAS(R) 9.4 Language Reference: Concepts, Fifth Edition

Updating flat-files in place (record oriented) is possible. 24782 - Update an external file in place

Dropping records however is not possible with flat file processing. You can read it all and create a new one.  PDSE or using a GDG - batch processing.

What environment are you?

Is that a/ Mainframe and which b/ sas release you are using c/ using classical bound data set or a HFS approach?

What is the sizing of those flat files (no records/vars) , should it run interactively or only in batch

When rewriting the flat files and performance is not a problem you can read/convert them to SAS sort as you like with an ordering indicator (making  _n_  permanent) reorder back (sort) before writing out. That is the most easy understandable coding.

---->-- ja karman --<-----
Ask a Question
Discussion stats
  • 10 replies
  • 430 views
  • 0 likes
  • 5 in conversation