BookmarkSubscribeRSS Feed
DeepakSwain
Pyrite | Level 9

Hi there,

I am having a dataset which is updated every month with addition of million records. It is having a couple of variables containing free text with length upto 32767. While taking subset using PROC SQL, it takes couple of days. Is there any efficent way to decrease the processing time e.g. using DS2. Please let me. 

 

Thanks in advance. 

 

Swain
4 REPLIES 4
art297
Opal | Level 21

The title of your question, and specifics in your description, refer to two totally different operations. Are you taking a subset of a large dataset, or appending new data to an existing file?

 

If it's the latter, you might one to try proc append (see: http://support.sas.com/documentation/cdl/en/lrcon/65287/HTML/default/viewer.htm#n1tgk0uanvisvon1r26l... )

 

Also, have you ever done an analsis of the "couple of variables" that have lengths of 32767? You can probably shorten the time quite a bit of you can reduce the lengths of those variables.

 

Art, CEO, AnalystFinder.com

 

DeepakSwain
Pyrite | Level 9

Hi art297,

 

I am interested to take subset of an existing dataset. I believe that once the subset is taken, the variable having free text with length $32767, which can not be changed as tool dependent, may be dropped to improve the efficiency. 

 

Thank you in advance for your kind reply. 

 

Regards,

Swain
art297
Opal | Level 21

Is your desired subset just the new records that have been added (i.e., the last x records from a given point), or could they exist anywhere in the dataset? If the latter, I'd suggest using a where option in a set (datastep) or from (sql) statement. If it's the latter, I'd suggest saving the last record number to a data file, then using a firstobs dataset option.

 

Art, CEO, AnalystFinder.com

 

LinusH
Tourmaline | Level 20
If you are using a data step make sure to exclude this column via the drop data set option in the SET statement.
If your subset is relatively small investigate if you can have an index in place that supports your query.
Data never sleeps

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 4 replies
  • 2029 views
  • 3 likes
  • 3 in conversation