Solved: Re: PDS Members Create for BY Variable

rkumar23 · Posted 11-24-2014 10:30 PM

Could somebody give me an idea of how to create separate PDS members for each value of a BY variable?

For example, based on GLIBSEQN below, I would like to create a different PDS member for each value.

DATA LAB1;
INFILE DATALINES ;
/* The colon modifier reads each blank-delimited field with the given informat */
INPUT
DAT : DATE9. TIM : TIME5. GLIBSEQN $ CLUSTER $ DLIB $ DISTNAME $ Deferr : $8.
threshold : $8. ;
DATALINES;
01OCT14 0:00 10000 04 10015 RVDRVA15 0:00 1:00
01OCT14 0:00 10000 02 10018 KCDRVA18 0:02 1:00
01OCT14 0:00 10000 05 10016 KCDRVA16 0:01 1:00
01OCT14 0:00 10000 00 10017 RVDRVA17 0:00 1:00
01OCT14 0:00 20000 03 1001A KCDRVA1A 0:02 1:00
01OCT14 0:30 20000 02 10018 KCDRVA18 0:02 1:00
01OCT14 0:30 20000 05 10016 KCDRVA16 0:01 1:00
01OCT14 0:30 20000 00 10017 RVDRVA17 0:00 1:00
;
proc print data=lab1;run;

jakarman · Posted 11-28-2014 05:21 AM

I don't have a mainframe at hand, but I can try to write machine-independent code that puts it all together.

With the parentheses used below, the members in a PDSE should be created as long as the data is clustered by the BY variable (add NOTSORTED to the BY statement if it is grouped but not sorted).

Putting the LIBNAME and FILENAME statements at the top is like having them in JCL. The added "A" prefix is needed to meet the z/OS member naming conventions.

42 ;

43 /* test in UE for PDSE */

44 libname test "/folders/myfolders/test";

NOTE: Libref TEST was successfully assigned as follows:

Engine: V9

Physical Name: /folders/myfolders/test

45 filename test "/folders/myfolders/test";

46

47

48 /* machine independent code create sas dataset */

49 DATA test.LAB1;

50 INFILE DATALINES ;

51 INPUT

52 DAT DATE9. TIM TIME5. GLIBSEQN $ CLUSTER $ DLIB $ DISTNAME $ Deferr $8.

53 threshold $8. ;

54 DATALINES;

NOTE: The data set TEST.LAB1 has 8 observations and 8 variables.

NOTE: DATA statement used (Total process time):

real time 0.03 seconds

cpu time 0.04 seconds

63 proc print data=test.lab1;run;

NOTE: There were 8 observations read from the data set TEST.LAB1.

NOTE: The PROCEDURE PRINT printed page 7.

NOTE: PROCEDURE PRINT used (Total process time):

real time 0.09 seconds

cpu time 0.10 seconds

64

65 DATA _null_ ;

66 Set test.LAB1;

67 by GLIBSEQN ;

68 length filevarpds $256 ;

69 retain filevarpds ;

70 if first.GLIBSEQN then do;

71 filevarpds="%sysfunc(pathname(test))"||"(A"||TRIM(GLIBSEQN)||")" ;

72 end;

73 file outpds filevar = filevarpds ;

74 put _all_ ;

75

76 run;

NOTE: The file OUTPDS is:

Filename=/folders/myfolders/test(A10000),

Owner Name=root,Group Name=root,

Access Permission=-rwxrwxrwx,

Last Modified=28 november 2014 05:27:08 uur

NOTE: The file OUTPDS is:

Filename=/folders/myfolders/test(A20000),

Owner Name=root,Group Name=root,

Access Permission=-rwxrwxrwx,

Last Modified=28 november 2014 05:27:08 uur

NOTE: 4 records were written to the file OUTPDS.

The minimum record length was 192.

The maximum record length was 192.

NOTE: 4 records were written to the file OUTPDS.

The minimum record length was 192.

The maximum record length was 195.

NOTE: There were 8 observations read from the data set TEST.LAB1.

NOTE: DATA statement used (Total process time):

real time 0.02 seconds

cpu time 0.04 seconds

77

78 /* http://support.sas.com/documentation/cdl/en/hosto390/67326/HTML/default/viewer.htm#p0agwv4mwi9x6xn17... */

79 /* http://support.sas.com/documentation/cdl/en/hosto390/67326/HTML/default/viewer.htm#n0e5aa1mquetdin1m... */

80

---->-- ja karman --<-----

Reeza · Posted 11-24-2014 10:52 PM

I don't know what you mean. Can you elaborate on your question?

Here are two links on how to ask a good question.

How do I ask a good question? - Help Center - Stack Overflow

rkumar23 · Posted 11-25-2014 01:48 AM

I have added the data; hopefully that explains what I was asking.

Reeza · Posted 11-25-2014 02:19 AM

What do you mean by different "PDS members"?

Are you trying to create separate data sets based on the variable GLIBSEQN? If so, why? SAS group by processing is quite effective.

Regardless, if you need that, see

Split Data into Subsets - sasCommunity

The bottom of that page has links to other such solutions, and searching this site for "split data into subsets" turns up a number of different approaches as well.
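
A minimal sketch of the BY-group processing mentioned above, assuming the LAB1 data set from the question; the output name LAB1_SORTED is made up for illustration. It reports on each GLIBSEQN group without splitting the data into separate data sets:

```sas
/* LAB1_SORTED is a hypothetical name; LAB1 is the data set from the question. */
proc sort data=lab1 out=lab1_sorted;
  by GLIBSEQN;
run;

/* One report section per GLIBSEQN value, with no physical split of the data. */
proc print data=lab1_sorted;
  by GLIBSEQN;
run;
```

Most procedures accept a BY statement in the same way, which is often enough when the goal is per-group output rather than per-group files.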

rkumar23 · Posted 11-26-2014 01:06 AM

By "different PDS members" I mean a unique member created in the PDS (partitioned data set) for each value of the BY variable. This is what my program looks like now:

DATA LAB1;
INFILE DATALINES ;
INPUT DATE : DATE9. TIME $ GLIBSEQN $ CLUSTER $ DLIB $
DISTNAME $ DEFERR $ THRESHOLD $ ;
DATALINES;
01OCT14 0:00 10000 04 10015 RVDRVA15 0:00 1:00
01OCT14 0:00 10000 02 10018 KCDRVA18 0:02 1:00
01OCT14 0:00 10000 05 10016 KCDRVA16 0:01 1:00
01OCT14 0:00 10000 00 10017 RVDRVA17 0:00 1:00
01OCT14 0:00 20000 03 1001A KCDRVA1A 0:02 1:00
01OCT14 0:30 20000 02 10018 KCDRVA18 0:02 1:00
01OCT14 0:30 20000 05 10016 KCDRVA16 0:01 1:00
01OCT14 0:30 20000 00 10017 RVDRVA17 0:00 1:00
;
/* The index on GLIBSEQN lets the BY statement below work without a sort */
DATA TEST(INDEX=(GLIBSEQN));SET LAB1;RUN;
/* Each DATA step iteration loads one BY group into a new hash object */
/* and writes that group out as data set C<GLIBSEQN> */
DATA KEEPTEST;
declare hash EDIT;
EDIT = _new_ hash(ordered: 'ascending');
EDIT.definekey('_unique_key');
EDIT.definedata('DATE', 'TIME', 'GLIBSEQN','CLUSTER','DISTNAME');
EDIT.definedone();
do until(last.glibseqn);set test;by glibseqn;_unique_key+1;
EDIT.ADD();
END;
EDIT.OUTPUT(DATASET:'C'||GLIBSEQN);
RUN;

The program above creates two output data sets. I may be lost among the options now, but do you have any thoughts on how I could write these to a physical data set? For example, we will have KEEPTEST.C10000 and KEEPTEST.C20000, and these need to go into the file C1000 ...

Reeza · Posted 11-26-2014 07:22 AM

There's a hash example in the link above.

Assuming the data is sorted, I usually use a data step with call execute instead:

proc sort data=sashelp.class out=class; by sex; run;

data _null_;
set class;
by sex;
if first.sex then do;
call execute("Data Sex_"||sex||"; Set class;"||
"Where Sex="||quote(sex)||";run;");
end;
run;

Ksharp · Posted 11-26-2014 09:25 AM

Assuming the data is sorted.

DATA LAB1;                                                      
INFILE DATALINES ;                                              
INPUT DATE : DATE9. TIME $ GLIBSEQN $ CLUSTER $ DLIB $             
     DISTNAME $ DEFERR $ THRESHOLD $ ;   
format DATE  DATE9.; 
DATALINES;                                                       
01OCT14  0:00 10000  04  10015  RVDRVA15      0:00     1:00      
01OCT14  0:00 10000  02  10018  KCDRVA18      0:02     1:00      
01OCT14  0:00 10000  05  10016  KCDRVA16      0:01     1:00      
01OCT14  0:00 10000  00  10017  RVDRVA17      0:00     1:00      
01OCT14  0:00 20000  03  1001A  KCDRVA1A      0:02     1:00      
01OCT14  0:30 20000  02  10018  KCDRVA18      0:02     1:00      
01OCT14  0:30 20000  05  10016  KCDRVA16      0:01     1:00      
01OCT14  0:30 20000  00  10017  RVDRVA17      0:00     1:00      
;               
run; 
data _null_;
 if _n_ eq 1 then do;
 if 0 then set LAB1;
  declare hash h(multidata:'y');
  h.definekey('GLIBSEQN');
  h.definedata('DATE','TIME','GLIBSEQN','CLUSTER','DLIB','DISTNAME','DEFERR','THRESHOLD');
  h.definedone();
 end;
set LAB1;
by GLIBSEQN;
h.add();
if last.GLIBSEQN then do;h.output(dataset:cats('C',GLIBSEQN));h.clear();end;
run;

Xia Keshan

rkumar23 · Posted 11-26-2014 09:04 PM

Yes, that is how I have been able to create the output data sets. However, the objective is to create the PDS member names dynamically from those output data sets. From the above we will have C10000 and C20000; these should be created as members, and the corresponding observations then need to be copied into them. So basically:

PDS = xxxxxx should have members C10000 and C20000 created, and the observations from WORK.C10000 and WORK.C20000 should be copied over to the corresponding members. Again, I cannot hard-code the names, since the above is just an example and the values can change every time.

Thanks for your help in advance...

Reeza · Posted 11-26-2014 09:36 PM

The code creates the work.c10000 and work.c20000 data sets.

I'm lost as to what you mean by PDS = xxxxxx.

Does SAS support PDS? I see SPD, but that doesn't seem to be what you're referring to.

rkumar23 · Posted 11-26-2014 09:38 PM

Yes, PDS stands for partitioned data set. These are mainframe files, and the term is easily understood by mainframe people.

Reeza · Posted 11-26-2014 09:41 PM

So once you have the work files what's the next step, uploading them to a mainframe?

rkumar23 · Posted 11-26-2014 09:43 PM

This whole process runs on the mainframe. Once the work files are created, I have to use their names (C10000 and C20000 in the example above) to create members in the partitioned data set (PDS), and then copy the corresponding observations from the work files (i.e. work.c10000 and work.c20000) into those members.

Reeza · Posted 11-26-2014 09:47 PM

I don't understand, but then I don't work on mainframes.

Perhaps consider itemizing your to-do list. I'm sure the code Xia has presented and mine accomplish a portion of what you need. I'm still uncertain as to what the remaining part is. You can copy files over using proc copy or proc datasets.
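
A minimal sketch of the PROC COPY route, assuming the split data sets already exist in WORK. The libref TARGET and its path are hypothetical placeholders. Note that this copies SAS data sets between SAS libraries; it does not by itself create text members in a PDS.

```sas
/* TARGET and its path are hypothetical; point the libref at your own library. */
libname target '/hypothetical/target/library';

/* Copy every WORK data set whose name starts with C (C10000, C20000, ...). */
/* The prefix also matches unrelated names such as CLASS, so narrow it      */
/* if WORK holds other data sets beginning with C.                          */
proc copy in=work out=target memtype=data;
  select c:;
run;
```

Using the name prefix in SELECT avoids hard-coding the individual data set names, which matters here because the GLIBSEQN values change from run to run.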

Tom · Posted 11-26-2014 09:51 PM

First let's assume that by PDS member you are using IBM mainframe terminology for text files and not SAS datasets.

You can use the FILEVAR option to dynamically generate the DSN. So to write to TOP.LEVEL.C10000 and TOP.LEVEL.C20000 etc. your program would look something like this.

data _null_;
set lab1 ;
length filename $200 ;
filename = catx('.','TOP.LEVEL',cats('C',GLIBSEQN)) ;
file dummy filevar=filename ;
put .... ;
run;

But I am not sure if you can do that for PDS members. You could try it by generating values for FILENAME that look like TOP.LEVEL.PDS(G10000) instead.
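
An untested sketch of that member-name variant, assuming LAB1 has the variables from the program posted earlier and that TOP.LEVEL.PDS (the placeholder name used above) is an existing PDS or PDSE. The names LAB1_SORTED and MEMBER and the choice of PUT variables are made up for illustration, and how the data set name is resolved on z/OS is site-dependent.

```sas
/* Group the rows so that each member is opened only once. When the   */
/* FILEVAR= value changes, SAS closes the current file and opens the  */
/* new one, so ungrouped data would reopen a member it wrote earlier. */
proc sort data=lab1 out=lab1_sorted;   /* LAB1_SORTED is a hypothetical name */
  by GLIBSEQN;
run;

data _null_;
  set lab1_sorted;
  length member $200;
  /* Builds values such as TOP.LEVEL.PDS(C10000) */
  member = cats('TOP.LEVEL.PDS(C', GLIBSEQN, ')');
  /* DUMMY is only a placeholder fileref; FILEVAR= supplies the real name */
  file dummy filevar=member;
  put DATE date9. +1 TIME GLIBSEQN CLUSTER DLIB DISTNAME DEFERR THRESHOLD;
run;
```

The "C" prefix keeps the member name starting with a letter, since a name that begins with a digit is not a valid z/OS member name.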

jakarman · Posted 11-27-2014 10:52 AM

I hope you are using the PDSE structure rather than the PDS one. The difference between the two is that a PDS needs to be compressed regularly and a PDSE does not.

Sometimes the PDSE structure is indicated in TSO (when allocating the space) as "library". A nice word, and a good source of unwanted confusion.

In a mainframe environment you should not put FILENAME and LIBNAME statements in your code. That habit is bad practice, and you will have to undo it once your code goes through a develop / test / acceptance / production release management approach. A DD statement in the JCL will be the requirement; that is your fileref/libref.

If you have a limited number of predefined names (a split into 5-10), you can code:

...
file <fileref>(membr1) ;
...
file <fileref>(membr2) ;
...

This approach can be followed on all type of machines (Unix/Windows/Mainframe).

The doc can be found at: http://support.sas.com/documentation/cdl/en/hosto390/67326/HTML/default/viewer.htm#n0rog60ordqc10n1i...

This is the example I referred to:

This next example assigns the fileref MYPDSE to the PDSE and then uses the fileref in a simple DATA step:

/* PDSE Example */
filename mypdse 'sales.div1.reg3' disp=shr;

data a;
x=1;
file mypdse(june97);
put x;
file mypdse(jul97);
put x;
run;

If you still need the full path in your program, you can retrieve it with:

%let fullname = %sysfunc(pathname(<fileref>)) ;

---->-- ja karman --<-----
