Performance is fickle, so making broad generalizations is difficult. I deal with datasets of these sizes regularly (billions to hundreds of billions of records, mostly web event data), and I have run into situations that contradict all of the following, but these are usually the best starting points. I make these points assuming there is no massively parallel DBMS available (Oracle Exadata, Teradata, Netezza, SPDS, Aster nCluster, Greenplum, Vertica, etc.), since in the vast majority of cases such a system will beat anything a single-threaded SAS DATA step ever could.

An index will typically only benefit a merge if the small table makes up under 20-30% of the larger table.

Hash tables are very fast, but loading a file of these sizes into memory is highly unlikely unless the table contains very few variables and the machine has lots of RAM (my servers have 512GB and there are still many tables too large).

Formats are also a good approach for large table merges. As mentioned, you can use a format to go straight from an ID to a variable; a variation that has not been pointed out is creating a format from the merge key to the record number and then using SET with POINT=. Sometimes this is very quick. It behaves much like an index and breaks down as the percentage of records being read grows, and formats have memory limits as well, as has been pointed out.

An option that has not been mentioned yet is SPDE. If you have the I/O subsystem to support it, you can sometimes see large performance gains because it allows multi-threaded I/O into the DATA step, shifting more of the burden onto RAM and CPU. Index reads are also multi-threaded, which helps their performance too. SPDE additionally lets indexes and data be held in separate locations so each I/O system can be optimized for its type of usage: indexes on storage meant for intense IOPS and random access, data on disks geared for sequential read throughput. Also, for tables over 2 billion rows on a 32-bit operating system, you need SPDE anyway.

Finally, the best of these options is to split files of this size into logical groups of smaller files, perform the work on them in parallel, and then set the results back together at the end if necessary. Rough sketches of each of these approaches are at the end of this post.

On another, related note: I am very excited about big data and SAS with the new product SAS/ACCESS Interface to Hadoop!
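For the indexed-merge case, here is a minimal sketch of driving the read from the small table and doing keyed reads against the indexed big table. The table names (SMALL, BIG) and key (ID) are just illustrative placeholders:

/* Sketch: indexed lookup join, assuming BIG carries an index on ID
   and SMALL holds the (relatively small) set of keys of interest */
proc datasets lib=work nolist;
   modify big;
   index create id;
quit;

data joined;
   set small;                  /* drive with the small table */
   set big key=id / unique;    /* direct keyed read from the big table */
   if _iorc_ ne 0 then do;     /* key not found in BIG */
      _error_ = 0;             /* suppress the error flag/message */
      delete;
   end;
run;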
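If the lookup side does fit in memory, a hash-object merge looks roughly like the sketch below (names again illustrative; only the variables actually needed are loaded to keep the memory footprint down):

data joined;
   if _n_ = 1 then do;
      if 0 then set small(keep=id segment);      /* host variables for the hash */
      declare hash h(dataset: 'small(keep=id segment)');
      h.defineKey('id');
      h.defineData('segment');
      h.defineDone();
   end;
   set big;
   if h.find() = 0;   /* keep matches only; drop this IF for a left-join effect */
run;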
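The format-to-record-number trick would look something like this: build a numeric format that maps the merge key to the observation number in the lookup table, then use SET with POINT= for direct reads. LOOKUP, BIG, and a numeric key ID are assumptions for the sketch:

data cntlin;
   set lookup end=last;
   retain fmtname 'recno' type 'n';
   start = id;                 /* ID assumed numeric */
   label = put(_n_, 12.);      /* key -> observation number */
   output;
   if last then do;            /* keys not found in LOOKUP map to 0 */
      hlo = 'O';
      label = '0';
      output;
   end;
   keep fmtname type start label hlo;
run;

proc format cntlin=cntlin;
run;

data joined;
   set big;
   recno = input(put(id, recno.), 12.);
   if recno > 0 then do;
      set lookup point=recno;  /* direct-access read of the matching row */
      output;
   end;
   drop recno;
run;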
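For SPDE, a library definition might look like the sketch below; the paths are placeholders for however your metadata, data, and index storage is actually laid out:

libname bigdat spde '/spde/meta'
        datapath=('/spde/data1' '/spde/data2' '/spde/data3')
        indexpath=('/spde/idx1' '/spde/idx2');

/* Index creation and reads against BIGDAT members are multi-threaded;
   EVENTS and VISITOR_ID are example names */
proc datasets lib=bigdat nolist;
   modify events;
   index create visitor_id;
quit;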
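And for splitting, the idea is simply to break the big table into logical groups (here a 4-way split on a MOD of the key, which is only illustrative), run the heavy step against each piece, ideally in separate SAS sessions (e.g. MP CONNECT if it is licensed), and then set the results back together:

data part1 part2 part3 part4;
   set big;
   select (mod(id, 4));
      when (0) output part1;
      when (1) output part2;
      when (2) output part3;
      otherwise output part4;
   end;
run;

/* ...run the expensive merge/summary against each PARTn, ideally in
   parallel sessions, producing RESULT1-RESULT4... */

data combined;
   set result1 result2 result3 result4;
run;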