BookmarkSubscribeRSS Feed
franriv
Obsidian | Level 7
proc sort data=prolease&perishort. nodupkey;
by operacion database;
run; proc sort data=ext_new nodupkey dupout=check;
by operacion database;
run; data prolease&perishort.;
modify prolease&perishort. ext_new;
by operacion database;
run;

 

 

 

NOTE: Writing TAGSETS.SASREPORT13(EGSR) Body file: EGSR

22

23 GOPTIONS ACCESSIBLE;

SYMBOLGEN: Macro variable PERISHORT resolves to 1812

24

25 data prolease&perishort.;

26

SYMBOLGEN: Macro variable PERISHORT resolves to 1812

26 ! modify prolease&perishort. ext_new;

27 by operacion database;

28 run;

NOTE: The DATA step has been abnormally terminated.

NOTE: There were 1 observations read from the data set WORK.PROLEASE1812.

NOTE: The data set WORK.PROLEASE1812 has been updated. There were 2492 observations rewritten, 0 observations added and 0

observations deleted.

NOTE: There were 2493 observations read from the data set WORK.EXT_NEW.

NOTE: DATA statement used (Total process time):

real time 1:08.18

cpu time 1:07.47

 

 

Is not working as expected. I expect it to produce the same result as MERGE statement without overwriting with missing values.

 

MERGE statement does work:

23 GOPTIONS ACCESSIBLE;

SYMBOLGEN: Macro variable PERISHORT resolves to 1812

24 data prolease&perishort.;

25

SYMBOLGEN: Macro variable PERISHORT resolves to 1812

25 ! merge prolease&perishort. ext_new;

26 by operacion database;

27 run;

NOTE: There were 347852 observations read from the data set WORK.PROLEASE1812.

NOTE: There were 11066 observations read from the data set WORK.EXT_NEW.

NOTE: The data set WORK.PROLEASE1812 has 347853 observations and 177 variables.

NOTE: DATA statement used (Total process time):

real time 2.91 seconds

cpu time 2.57 seconds

 

MODIFY without transaction dataset works:

23 GOPTIONS ACCESSIBLE;

SYMBOLGEN: Macro variable PERISHORT resolves to 1812

24 data prolease&perishort.;

25

SYMBOLGEN: Macro variable PERISHORT resolves to 1812

25 ! modify prolease&perishort.;

26 extrest='test';

27 run;

NOTE: There were 347853 observations read from the data set WORK.PROLEASE1812.

NOTE: The data set WORK.PROLEASE1812 has been updated. There were 347853 observations rewritten, 0 observations added and 0

observations deleted.

NOTE: DATA statement used (Total process time):

real time 1.06 seconds

cpu time 1.06 seconds

5 REPLIES 5
PeterClemmensen
Tourmaline | Level 20

Please be more specific in your requirements.. Was all you did to change the word 'merge' to 'modify'? Can you provide some example data that shows what WORK.PROLEASE1812 looks like?

franriv
Obsidian | Level 7

after running:

data test;
	set prolease1812(keep=operacion operorigen database);
run;
proc sort data=test nodup; by _ALL_;
proc sort data=test nodupkey; by operacion database;

data test;
	modify test ext_new;
	by operacion database;
run;

(SAS block code in this web doesn't seem to preserve my indenting).

 

 

my tables look like this:

 

testextnew.jpg

 

 

proc contents data=test;
proc contents data=ext_new;

 

 

 

The CONTENTS Procedure

 

 

Data Set Name

WORK.TEST

Observations

347852

Member Type

DATA

Variables

3

Engine

V9

Indexes

0

Created

01/11/2019 09:13:44

Observation Length

44

Last Modified

01/11/2019 09:16:21

Deleted Observations

0

Protection

 

Compressed

NO

Data Set Type

 

Sorted

YES

Label

 

 

 

Data Representation

SOLARIS_X86_64, LINUX_X86_64, ALPHA_TRU64, LINUX_IA64

 

 

Encoding

latin1 Western (ISO)

 

 

 

 

Engine/Host Dependent Information

Data Set Page Size

65536

Number of Data Set Pages

235

First Data Page

1

Max Obs per Page

1484

Obs in First Data Page

1435

Number of Data Set Repairs

0

Filename

***

Release Created

9.0401M4

Host Created

Linux

Inode Number

18612318

Access Permission

rw-rw-r--

Owner Name

franriv

File Size

15MB

File Size (bytes)

15466496

 

 

Alphabetic List of Variables and Attributes

#

Variable

Type

Len

2

DataBase

Char

4

3

OperOrigen

Char

20

1

Operacion

Char

20

 

 

Sort Information

Sortedby

Operacion DataBase

Validated

YES

Character Set

ASCII

Sort Option

NODUPKEY

 

 

The CONTENTS Procedure

 

 

Data Set Name

WORK.EXT_NEW

Observations

11066

Member Type

DATA

Variables

3

Engine

V9

Indexes

0

Created

01/11/2019 09:09:13

Observation Length

44

Last Modified

01/11/2019 09:09:13

Deleted Observations

0

Protection

 

Compressed

NO

Data Set Type

 

Sorted

YES

Label

 

 

 

Data Representation

SOLARIS_X86_64, LINUX_X86_64, ALPHA_TRU64, LINUX_IA64

 

 

Encoding

latin1 Western (ISO)

 

 

 

 

Engine/Host Dependent Information

Data Set Page Size

65536

Number of Data Set Pages

8

First Data Page

1

Max Obs per Page

1484

Obs in First Data Page

1434

Number of Data Set Repairs

0

Filename

***

Release Created

9.0401M4

Host Created

Linux

Inode Number

18614142

Access Permission

rw-rw-r--

Owner Name

franriv

File Size

576KB

File Size (bytes)

589824

 

 

Alphabetic List of Variables and Attributes

#

Variable

Type

Len

Format

Informat

Label

3

database

Char

4

 

 

 

1

operacion

Char

20

$11.

$11.

OperDest

2

operorigen

Char

20

$11.

$11.

OperOri

 

 

Sort Information

Sortedby

operacion database

Validated

YES

Character Set

ASCII

Sort Option

NODUPKEY

 

 

This is just me testing to learn how to use the MODIFY statement; I am expecting the same output as a merge statement (provided operorigen is not null for any entry in the ext_new table), but faster execution.

 

merge statement does work

NOTE: Writing TAGSETS.SASREPORT13(EGSR) Body file: EGSR

22

23 GOPTIONS ACCESSIBLE;

24 data test2;

25 merge test ext_new;

26 by operacion database;

27 run;

NOTE: There were 347852 observations read from the data set WORK.TEST.

NOTE: There were 11066 observations read from the data set WORK.EXT_NEW.

NOTE: The data set WORK.TEST2 has 347853 observations and 3 variables.

NOTE: DATA statement used (Total process time):

real time 0.27 seconds

cpu time 0.26 seconds

 

but modify statement won't do it;

NOTE: Writing TAGSETS.SASREPORT13(EGSR) Body file: EGSR

22

23 GOPTIONS ACCESSIBLE;

24

25 data test;

26 modify test ext_new;

27 by operacion database;

28 run;

NOTE: The DATA step has been abnormally terminated.

NOTE: There were 1 observations read from the data set WORK.TEST.

NOTE: The data set WORK.TEST has been updated. There were 6563 observations rewritten, 0 observations added and 0 observations

deleted.

NOTE: There were 6564 observations read from the data set WORK.EXT_NEW.

NOTE: DATA statement used (Total process time):

real time 2:32.11

cpu time 2:30.86

 

 

ChrisNZ
Tourmaline | Level 20

> SAS block code in this web doesn't seem to preserve my indenting

 

That's the main reason to use spaces and not tabs for indentation: the code layout is safe.

There's a setting in EG for this purpose too.

 

ChrisNZ
Tourmaline | Level 20

1. Results

Here is an example:

data T1;
  A=1; B=1; output;
  A=2; B=2; output;
  A=3; B=3; output;
run;
data T2;
  A=1; B=0; output;
  A=2; B=.; output;
run;
    
data T3;
  merge T1 T2;
  by A;
run;
data T1;
  modify T1 T2;
  by A;
run;

and the result:

A B
1 0
2 .
3 3

 

A B
1 0
2 2
3 3

What is surprising?

 

2. Speed

MODIFY is much slower than MERGE.

The reason is that instead of bulk-building a new table, each record is painstakingly updated one at a time.

For example:

data T1 T2; 
  do A=1 to 1e4;
    B=A;
    output; 
  end;
run;

data T3;
  merge T1 T2;
  by A;
run;

data T1;
  modify T1 T2;
  by A;
run;

34 data T3;
35 merge T1 T2;
36 by A;
37 run;
NOTE: There were 10000 observations read from the data set WORK.T1.
NOTE: There were 10000 observations read from the data set WORK.T2.
NOTE: The data set WORK.T3 has 10000 observations and 2 variables.
NOTE: DATA statement used (Total process time):
real time 0.01 seconds
user cpu time 0.00 seconds
system cpu time 0.01 seconds
memory 1098.56k
OS Memory 17760.00k
Timestamp 01/15/2019 11:44:54 AM
Step Count 32 Switch Count 42


39 data T1;
40 modify T1 T2;
41 by A;
42 run;

NOTE: There were 1 observations read from the data set WORK.T1.
NOTE: The data set WORK.T1 has been updated. There were 10000 observations rewritten, 0 observations added and 0 observations
deleted.
NOTE: There were 10000 observations read from the data set WORK.T2.
NOTE: DATA statement used (Total process time):
real time 20.06 seconds
user cpu time 14.54 seconds
system cpu time 2.84 seconds
memory 1009.00k
OS Memory 17500.00k
Timestamp 01/15/2019 11:45:14 AM
Step Count 33 Switch Count 87

 

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 5 replies
  • 958 views
  • 1 like
  • 3 in conversation