JacobSimonsen
Barite | Level 11

I have thought about that solution (the hash and iterator). I think it will work in the vast majority of cases. The only problem is that it requires me to load all the data into a hash object, which in the worst case can cause memory problems. But it is not difficult to implement, so I think I will try it out.
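A minimal sketch of that trade-off, written in Python rather than SAS purely for illustration: every row is loaded into an in-memory table once, after which any observation can be visited by number as often as needed. The names `load_rows` and `get_obs` and the sample data are hypothetical, and this is not the SAS hash object API; the point is only that memory use grows with the number of rows loaded.

```python
def load_rows(source):
    """Load every row into memory, keyed by 1-based observation number."""
    return {n: row for n, row in enumerate(source, start=1)}


def get_obs(table, n):
    """Return observation n; it can be requested any number of times."""
    return table[n]


# Hypothetical source data: (id, value) pairs.
source = [(101, 2.5), (102, 4.0), (103, 1.5)]
table = load_rows(source)

# Visit observations in arbitrary order, with repeats.
visited = [get_obs(table, n) for n in (3, 1, 3, 2)]
```

Each lookup is cheap once the table is built, but the whole dataset has to fit in memory first, which is the worst-case concern mentioned above.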

If I understand FriedEgg's suggestion correctly, I need to create an index on my source data (which is not a problem), and key= then allows me to visit any observation as many times as I want. I don't think it will be efficient enough: my experience with indexes is that they become slow when I want to go through all observations in a large dataset. They are only efficient when I want to look up a few observations in a possibly large dataset.

Patrick
Opal | Level 21

I believe - only believe - that what FriedEgg suggests would require the Federation Server. Is this how you're accessing your data?

FriedEgg
SAS Employee

All I am suggesting is that it is possible to emulate the function of using set with the key= option by using the SQLSTMT package in DS2. The usefulness of this approach depends on the situation at hand and can lead to excellent performance, very poor performance, or anything in between. As mentioned, another option is to use the hash packages. However, neither of these choices directly mimics set point=, which certainly seems a beneficial functionality to have available.
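To illustrate the general idea of a keyed lookup through a parameterized SQL statement, here is a rough sketch in Python using the standard `sqlite3` module. It is not DS2 and does not use the SQLSTMT package; the table `source`, its columns, and the helper `fetch_by_key` are assumptions made up for the example.

```python
import sqlite3

# Hypothetical in-memory table standing in for the indexed source data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE source (id INTEGER PRIMARY KEY, value REAL)")
conn.executemany(
    "INSERT INTO source VALUES (?, ?)",
    [(101, 2.5), (102, 4.0), (103, 1.5)],
)

# One parameterized statement, executed once per key that is requested.
LOOKUP = "SELECT value FROM source WHERE id = ?"


def fetch_by_key(key):
    """Return the value stored for key, or None when the key is absent."""
    row = conn.execute(LOOKUP, (key,)).fetchone()
    return None if row is None else row[0]


# Keys can be requested in any order and as often as needed.
values = [fetch_by_key(k) for k in (103, 101, 103, 999)]
```

Every lookup is a separate query against the index, so this pattern tends to suit a handful of lookups better than a pass over every row.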

JacobSimonsen
Barite | Level 11

Thank you, both, for your suggestions.

However, I tend to conclude that DS2 at present would make my program more complex and/or less time-efficient than an ordinary datastep. Even though DS2 has its matrix package, it is easier to write the code in ordinary datastep language (using my homemade matrix package, https://communities.sas.com/ideas/1570). That is of course not true in general, just for my present problem. It will be interesting to see how DS2 develops in the future.

jakarman
Barite | Level 11

I am missing insight into programming language concepts in this discussion. These are part of applied computer science (see the Wikipedia article "Computer science"), as are many other specialized areas. The association with a physical piece of hardware, "the computer", may be confusing. In my language the word "informatica" is used, which is probably better, as the field is about the processing of information.
An array, a hash, and the like are data structures that a language can support (see the Wikipedia article "Data structure").

For transmitting data you need conventions for getting meaningful descriptions. JSON and XML are examples of those (see the Wikipedia article "JSON"). These are not the same kind of thing as data structures within a programming language. Getting the objects well described is a prerequisite for understanding them.

The SAS datastep language.

The qualification "datastep" is needed because SAS has many languages in its ecosystem. Every proc can be a language of its own; proc sql and proc ds2 are examples.

The datastep language is designed as a 4GL. It differs from Cobol, Fortran, REXX and many more in that it processes the records automatically, and for that it has a PDV (program data vector).

The C language is an odd one: because it is used to build operating systems it is more of a 2GL (processor level), yet it also uses the same style as Fortran, a 3GL.

That sequential approach of the datastep language has advantages and disadvantages.

Advantage: Sequential processing can be extremely efficient when all incoming and outgoing data is also sequential.

Disadvantage: Throughput and speed are limited because all actions form one sequential process, which misses the opportunity for parallel processing.
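As a small illustration of that contrast (in Python, not SAS, and with made-up data), the sketch below computes the same result once in a single sequential pass and once by splitting the rows into chunks handled by separate workers. The function `process` and the chunking scheme are hypothetical. It only shows that work which does not depend on row order can be partitioned; it makes no claim about actual speed.

```python
from concurrent.futures import ThreadPoolExecutor


def process(rows):
    """Per-row work that does not depend on row order: sum of squares."""
    return sum(v * v for v in rows)


rows = list(range(1, 11))

# Sequential: one pass over all rows, in order.
sequential_total = process(rows)

# Partitioned: split the rows into three chunks, process each chunk in
# its own worker, then combine the partial results.
chunks = [rows[i::3] for i in range(3)]
with ThreadPoolExecutor(max_workers=3) as pool:
    parallel_total = sum(pool.map(process, chunks))
```

Both totals are equal, which is what makes the partitioned form a valid replacement for the sequential one in this case.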

I mentioned REXX because it is as relaxed about type declarations as the SAS datastep, but it has no PDV. It has no automatic record advancing, and because of that it can interpret its own code in-stream.

The array and hash concepts do not belong to the 4GL level within the PDV but to the 3GL level (like Fortran/Cobol). Many find the difference between the 3GL and 4GL concepts difficult to understand.

The DS2 language

Proc DS2 is a language of its own. Some parts, such as methods and declarations, are based on SCL (SAS/AF). Others, such as the packages, use naming conventions copied from R.

It is a different approach, not based on the assumption of sequential processing but supporting multi-threading and more consistent DBMS data types.

We will see where it finds its place in practice.

---->-- ja karman --<-----
