BookmarkSubscribeRSS Feed
Planck
Obsidian | Level 7

Hi,

 

I have a table A with a lot of column (something like 70) including this three ones:

 

* ID

* DURATION

* DATE

 

 I want to duplicate every row according to the value of DURATION, and recalculate DATE (one month more evry time), and then recalculate every other columns according to the new DATE value.

 

data B;
set A;

do i = 1 to DURATION;
DATE = INTNX('month', DATE, +1 , 'e');
[...];
output;
end;
run;

 

The problem is, the number of rows in A is 60K

DURATION is often very high, so the number of rows in B at the end of the duplication is something like 35M

 

Vrey long, and very huge volume. I would like to make it parallel. Divide in 6 part (or more if necessary) of 10K rows and run them in parallel, then only do a simple union.

 

Do you know how I can do this in SAS?

Thanks

7 REPLIES 7
art297
Opal | Level 21

You haven't said what you are going to do with the file after expanding it.

 

For most SAS procedures you can probably avoid expanding the file and simply use DURATION as a WEIGHT variable.

 

Art, CEO, AnalystFinder.com

Planck
Obsidian | Level 7

Thanks for the answer.

 

But I don't understand what do you mean by WEIGHT variables. Can you be more explicit?

 

After, with the table B, I need to extract specific rows to create a table C, and then export in Excel the table C...

art297
Opal | Level 21

May or may not be relevant to your needs. Here is a link that explains WEIGHT and FREQ statements in SAS procedures: http://support.sas.com/documentation/cdl/en/proc/61895/HTML/default/viewer.htm#a002473731.htm

 

Thus, for a proc freq, it would count each record as representing n (i.e., weight or freq) records without forcing you to build the large file you were describing.

 

Are you really planning on exporting a 60 million row file to Excel?

 

Art, CEO, AnalystFinder.com

 

Planck
Obsidian | Level 7

You are kidding not 60 millions ^^.

 

My first table A contains 60.000 rows. My table B contains 35 milions rows, and my table C is back to 60.000 rows: my table C will extract from B only a specific calculated DATE.

So I will export in Excel only 60.000 rows (table C)

 

 

But I can't understand the link between WEIGHT/FREQ and my needs. It is more for statical needs, me I just need to transform data...

 

 

 

art297
Opal | Level 21

Agree weight or freq probably wouldn't be relevant. However, couldn't something like the following serve as an alternative to creating the large file?

 

data have;
  informat date date9.;
  input ID DURATION DATE;
  cards;
1 10 1jan2016
2 3 5mar2016
3 15 8sep2016
;

data want;
  set have (where=('01jan2017'd between date and intnx('month', date, duration-1 )));
run;

Art, CEO, AnalystFinder.com

Planck
Obsidian | Level 7

Unfortunately not! 😞

 

I didn't go into details of all 70 columns in my explanation, but among them some are calculated in a complex way and I truely need to create the large table B, before reducing for table C.

The question is not about finding an alternative, but to accept it and find a solution parallelising it, because... I have no choice!

ballardw
Super User

@Planck wrote:

Unfortunately not! 😞

 

I didn't go into details of all 70 columns in my explanation, but among them some are calculated in a complex way and I truely need to create the large table B, before reducing for table C.

The question is not about finding an alternative, but to accept it and find a solution parallelising it, because... I have no choice!


Are you sure you can't post a little example data and then where things go after that to the final table C?

 

I have seen lots of processes that were either written before new procedures were available in SAS or were a direct translation from another data system and did not use SAS tools well. A frequent example are data steps that go through lots of loops and retaining variables to calculate common statistics such as mean and standard deviation that could be done with four or five lines of code using Proc Means or summary. Or forcing data into a specific layout because "that's the way the report looks" instead of using the SAS report procedures such as Proc Report or Tabulate.

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 7 replies
  • 821 views
  • 0 likes
  • 3 in conversation