TheNovice
Quartz | Level 8

Hi all,

 

I have daily SAS data sets that I need to combine into one.

Naming convention is: TRMS_20210421

 

I tried: 

 

DATA Test;
MERGE TTT.TRMS:;
RUN;

 

I get the log below, but the final output does not contain all the records. What could be the problem?

 

I even tried saving it to my personal SAS folder, which has a higher space allowance.

Is there a better way to do this?


NOTE: There were 12085 observations read from the data set TTT.TRMS_20210411.
NOTE: There were 9076 observations read from the data set TTT.TRMS_20210412.
NOTE: There were 51500 observations read from the data set TTT.TRMS_20210413.
NOTE: There were 47332 observations read from the data set TTT.TRMS_20210414.
NOTE: There were 33850 observations read from the data set TTT.TRMS_20210415.
NOTE: There were 35832 observations read from the data set TTT.TRMS_20210416.
NOTE: There were 42361 observations read from the data set TTT.TRMS_20210417.
NOTE: There were 10540 observations read from the data set TTT.TRMS_20210418.
NOTE: There were 7956 observations read from the data set TTT.TRMS_20210419.
NOTE: There were 60246 observations read from the data set TTT.TRMS_20210420.
NOTE: There were 52694 observations read from the data set TTT.TRMS_20210421.
NOTE: There were 37624 observations read from the data set TTT.TRMS_20210422.
NOTE: There were 31465 observations read from the data set TTT.TRMS_20210423.
NOTE: There were 33209 observations read from the data set TTT.TRMS_20210424.
NOTE: There were 9406 observations read from the data set TTT.TRMS_20210425.
NOTE: There were 6829 observations read from the data set TTT.TRMS_20210426.

 

NOTE: The data set Test has 83573 observations and 4 variables.

 

 

1 ACCEPTED SOLUTION

Accepted Solutions
mkeintz
PROC Star

You did not run out of space. You produced only 83,573 observations because you used MERGE instead of SET. Without a BY statement, MERGE joins the first observation from each data set into a single output observation, then the second observation from each data set into another, and so on, while a SET statement would concatenate the observations.

However, I am surprised that you got 83,573. It should have been equal to the size of the largest input data set (looks like 60,246 obs from 20210420).
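
For reference, here is a minimal sketch of the SET version, using the TTT libref and TRMS prefix from the question. The small data sets A and B (and the outputs MERGED and STACKED) are made up purely to show how the observation counts differ between the two statements.

```sas
/* Concatenate every data set in TTT whose name starts with TRMS */
data test;
  set ttt.trms: ;
run;

/* Hypothetical toy data sets, only to compare the two statements */
data a; do x = 1 to 3; output; end; run;
data b; do y = 1 to 5; output; end; run;

/* MERGE without BY pairs observations by position: 5 observations */
data merged;
  merge a b;
run;

/* SET stacks the inputs end to end: 3 + 5 = 8 observations */
data stacked;
  set a b;
run;
```

With MERGE and no BY statement, the output has as many observations as the largest input. With SET it has the sum of all inputs.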

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------


TheNovice
Quartz | Level 8

Wow, thank you so much. I don't use the MERGE statement much. Bad mistake. The combine works correctly now.

As for the output I got, I had only posted a portion of the log for ease of reading.

Thank you so much.

mkeintz
PROC Star

BTW, why are you concatenating all the data sets into a data set file?  You could instead concatenate into a data set view.   The view would be nothing more than a just-in-time concatenation, and you would save disk space by avoiding needless duplication of the data (you were going to keep the daily files, right?).

 

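/* Define a DATA step view: nothing is read or copied until the view is used */
data ttrm_view / view=ttrm_view;
  /* OPEN=DEFER opens the TTT.TRMS: data sets one at a time */
  set ttt.trms: open=defer;
run;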

This will take about half a second to create, yet you can treat it just like the data set file you proposed to create, even though it adds almost nothing to your disk requirements. For example, you can apply PROC FREQ, PROC REG, etc. to ttrm_view.
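
A short sketch of using the view, assuming ttrm_view was created in the same session as shown above. Nothing here depends on the variable names in the daily files, since those were not posted.

```sas
/* List the variables the view exposes */
proc contents data=ttrm_view;
run;

/* Count rows through the view; the daily files are read at this point */
proc sql;
  select count(*) as n_obs
  from ttrm_view;
quit;
```

One caveat, as far as I know: OPEN=DEFER expects every daily file to have the same variables. The first file defines the variable list, and extra variables in later files produce a warning and are not added. If the daily layouts differ, dropping OPEN=DEFER is the safer choice.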

TheNovice
Quartz | Level 8
Oh, I didn't even know that was possible. I will do this instead. Can I still apply other SQL procedures to it?

The daily files are kept in the team's local folder.
mkeintz
PROC Star

  1. Yes. PROC SQL can read the view just like any other data set.

  2. If the "team's local folder" is local to your process, then fine. But if it is in a remote location, then there could be an advantage to making a local data set FILE instead of a data set view, due to the time penalty of accessing remote data every time the data set view is used.
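
If the remote-access penalty in point 2 turns out to matter, one option is to materialize the view once into a local file and run everything else against that copy. A minimal sketch, where WORK.TRMS_ALL is just a placeholder name:

```sas
/* Hypothetical output name; reads the daily files once through the view */
data work.trms_all;
  set ttrm_view;
run;
```

Later steps can then point at work.trms_all instead of ttrm_view, at the cost of the extra disk space the view was avoiding.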
TheNovice
Quartz | Level 8

Thank you again
