BookmarkSubscribeRSS Feed
najoie
Obsidian | Level 7

Hello sas community,

 

I hope you are doing well.

 

Please i wanna  optimise this part of code can you help because i dont use much the  if and do .... 

 


/* Union results base */
proc sql;
create table lib.globsin_rc_union_&suffixe as

select 2010 as vision, a.* from lib.glob_sin_rc_0910 as a /* A modifier */
union all
select 2011 as vision, a.* from lib.glob_sin_rc_0911 as a
union all
select 2012 as vision, a.* from lib.glob_sin_rc_0912 as a
union all
select 2013 as vision, a.* from lib.glob_sin_rc_0913 as a
union all
select 2014 as vision, a.* from lib.glob_sin_rc_0914 as a
union all
select 2015 as vision, a.* from lib.glob_sin_rc_0915 as a
union all
select 2016 as vision, a.* from lib.glob_sin_rc_0916 as a
union all
select 2017 as vision, a.* from lib.glob_sin_rc_0917 as a
union all
select 2018 as vision, a.* from lib.glob_sin_rc_0918 as a
union all
select 2019 as vision, a.* from lib.glob_sin_rc_0919 as a
union all
select 2020 as vision, a.* from lib.glob_sin_rc_0920 as a

order by vision,exe;
quit;

 

Thanks

 

regards

12 REPLIES 12
PaigeMiller
Diamond | Level 26

Before you try to optimize the code, you would want to optimize the data structure. Without commenting about your specific code, one of the problems is that the data structure is poor. Splitting data up like this, so you have a data set for each year, is in general a very poor practice that makes coding harder. So if you have any control whatsoever about how the data is supplied, then create one large data set that has all the years of interest. I am thankful that whoever created the databases that I work with did not try to separate the databases into yearly database tables.

 

As far as your code goes, this can't possibly work (I have only shown a fraction of the code)

select 2010 as vision, a.* from lib.glob_sin_rc_0910 as a /* A modifier */
union all
select 2011 as vision, a.* from lib.glob_sin_rc_0911 as a

You can't have multiple data sets identified "as a", each one has to have a different letter identifier. Select 2010 is also not valid SAS code, as after SELECT should be a variable name, and 2010 is not a variable name.

 

So, if I were you, I would fix the data structure first, and then coding is easy. If that's not possible, then you need to create working code for a simple case, such as just the three rows above covering two different years. Once you get that to work, then the rest ought to be easy.

 

Another alternative is to use something like a DATA step SET command, for example

 

set lib.glob_sin_rc_09:;

Whether or not that works for you is something I don't know, you'd have to try it.

 

--
Paige Miller
najoie
Obsidian | Level 7

@PaigeMiller  Hi, thank you for your reply with all details that you noticed !!! 

 

So actually juste to explain why do i need to create this table : 

 

I have to prepare the data for each year  fro example  from  [2010,2020] because i have to end for each year the data like this : 

for year = 2010 i have to end the data at 31/12/2010 .... year = 2019 i have to end the data at 31/12/2019.

Then i create a table to make all the base unified : at the end i should have all the exercices and vues in the same table ...

 

I was thinking about to make  %Do ... and make like variable for exercices to make it like a macro to add table after table until i have all the data and years on it .

 

That is my goal ... 

If you have any suggestion i am open for more details ...

regards 

PaigeMiller
Diamond | Level 26

My comment was to go back to the method that creates these input data sets, and restructure the input data so it is not split by years. If that is something you have control over. Then your coding is much simpler.

--
Paige Miller
najoie
Obsidian | Level 7
Ok thank you so much, i will try to follow these instruction to prepare the data
SASKiwi
PROC Star

If your tables have identical column names then PROC APPEND or PROC DATASETS with the APPEND statement will work far quicker than SQL or DATA steps.

najoie
Obsidian | Level 7
@SASKiwi, Hello !
Yes it is, i have identical column names and the same column for all tables i want that be unified in one table.
Ok i will try your solution thank you so much for yr reply....
regards 
najoie
Obsidian | Level 7

@PaigeMiller  ok thank you so much i will try what you advice me to do ... 

 

and if any new posts related to this case for sharing more advices  it is welcoming...

 

RichardDeVen
Barite | Level 11

Not sure why you choose to use the word "optimisation" in your title.

 

Regardless, the question is the same as one found on Stackoverflow "concatenating large number of tables in sas" with answers

 

[have your macro] do all stacking in a single SQL statement

and

 

Use DATA Step with SET

 

najoie
Obsidian | Level 7
Thank you for sharing.

Using " optimisation" mean that i want something making my code with less lines and with a spead result of my unified data.
ChrisNZ
Tourmaline | Level 20

Optimising code is generally not understood as reducing the volume of code.

The third factor to consider (after code performance and code size) is code legibility.

Of the 3, size is the lowest priority.

 

I do not necessarily believe that tables split by month are bad. It depends on the volume and the usage.

SAS is not always fast processing huge tables. Hundreds of variables and millions of rows are fine, but if that's your monthly snapshot, creating a single table over many years may be detrimental, especially if you seldom process all years together.

 

What you may want to add in the original tables are ID columns, for example CALENDAR_YEAR or/and FISCAL_YEAR.

If the monthly tables are sorted by EXE, the yearly sorted table can simply be:

  data YEARLY;

    set LIB.GLOB_SIN_RC_0910 - LIB.GLOB_SIN_RC_0920;

  run;

No need to sort the data, it is already sorted, huge resource savings there compared to SQL.

If you don't have ID variables, you should -as mentioned-, but as a workaround you can use option INDSNAME to derive it in the data step above.

 

Maybe you don't need to duplicate the data at all. If you don't use the view repeatedly, this might be enough:

  data YEARLY/view=YEARLY;

    set LIB.GLOB_SIN_RC_0910 - LIB.GLOB_SIN_RC_0920;

  run;

 

If you need the SORTED flag to be set (because, for example proc SQL is used later, and doesn't know the data is sorted and wants to sort it), you can add a non-validated SORTED flag:

  data YEARLY_V(sortedby=MONTH EXE)/view=YEARLY_V;

    set LIB.GLOB_SIN_RC_0910 - LIB.GLOB_SIN_RC_0920;

  run;

or you can validate the sorted order without re-sorting:

  data YEARLY_V/view=YEARLY_V;

    set  LIB.GLOB_SIN_RC_0910 - LIB.GLOB_SIN_RC_0920;

  run;

  proc sort data=YEARLY_V out=YEARLY presorted;

    by MONTH EXE;

  run;

 

 

 

 

 

 

 

najoie
Obsidian | Level 7
Thank you so much !!

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 12 replies
  • 912 views
  • 4 likes
  • 5 in conversation