I'm in a mainframe environment and looking to read members of a PDS looking for certain values. The result from that are then merged with another sas dataset to create a csv file. For the most part I have something that works.
My issue seems to be around when my PDS member contains more than one of that values I'm looking for, AND this PDS member has more than one entry in my sas dataset to be merged with.
I'm not sure how to give a sample of my PDS within the SAS code that you could use as an example. I'll try to illustrate here what those PDS members look like –
Mem01
Command, value1
Mem02
Command,value1
Command,value2
Mem03
Command,value1
This is the SAS dataset which the above gets merged with –
job01 mem01
job02 mem02
job02 mem02
job03 mem02
job03 mem02
job04 mem03
So in this case I expect my output to look like –
Mem01,job01,command,value1
Mem02,job02,command,value1
Mem02,job02,command,value2
Mem02,job03,command,value1
Mem02,job03,command,value2
Mem03,job04,command,value1
However what I'm seeing is this –
Mem01,job01,command,value1
Mem02,job02,command,value1
Mem02,job02,command,value2
Mem02,job03,command,value2
Mem02,job03,command,value2
Mem03,job04,command,value1
For members with multiple commands, that are associated to multiple jobs, it seems the value retains the setting of the last occurrence from the first job. In this case, the first occurrence of mem02,job03,command,value2 should reflect value1, not value2.
This is what my code looks like:
data have;
infile cards missover;
input job $ 02-09
member $ 11-18
;
cards;
job01 mem01
job02 mem02
job02 mem02
job03 mem02
job03 mem02
job04 mem03
;
run;
proc sort data=have
OUT=have_NoDups
nodupkey ;
by member ;
run ;
data file2;
set have_NoDups;
infile 'aaaa.bbbbbb.cccc.pds' memvar=member end=done;
do until (done);
input msg $ 01-72 ;
if msg =: 'COMMAND' or msg =: 'FRED' then do;
parm1 = scan(msg,1,",") ;
parm2 = scan(msg,2,",") ;
memnam = member ;
output;
drop msg;
end;
end;
run;
/*********************************************************************/
/* Merge the jobs file with the commands file */
/*********************************************************************/
DATA file3;
Merge have file2 (rename=(memnam=member));;
by member;
proc print;
run;
Any thoughts as to what I'm missing here?
Hope that makes sense.
I not sure I am following this discussion, but it sounds to me like you have some metadata that lists which members (files) you want to read. Do you just have one PDS (folder) or are their multiple? Either way if the goal is to check only the files that need to be check you should first get the list of files. So if you start with something like this that maps members to jobs.
data job_members ;
input job :$10. member :$32. ;
cards;
job01 mem01
job02 mem02
job02 mem02
job03 mem02
job03 mem02
job04 mem03
;
You first want to reduce it to just a list of the members.
proc sort data=job_members(keep=member) out=members nodupkey;
by member;
run;
Then you can use that to drive the process of reading the text files.
Do you know the name of the PDS? Let's assume it is named PROD.PAYROLL.JCL like in this example:
So then your step to read the text files is something like:
data member_commands ;
set members ;
length dsn $80 ;
dsn=cats('PROD.PAYROLL.JCL(',member,')');
infile text filevar=dsn end=eof;
do while (not eof);
... here is you code to read the lines and figure out what you want ..
.. make sure to include an OUTPUT statement when you find something interesiting ...
end;
run;
Now you can re-combine the two datasets based on the value of MEMBER.
proc sql;
create table job_member_commands as
select *
from job_members A
inner join
member_commands B
on a.member = b.member
order by a.job,a.member
;
quit;
From a SAS processing perspective your PDS members are just like text files? And you input the text in PDS members and then get a SAS data set per member with two columns in the form {key,value}. Is that correct?
If you've solved reading the PDS members already then can you please post sample data how the resulting SAS table(s) look like? The ones which you then want to merge with your table have if I've got that right.
This is the output from a proc print for thw HAve &File2 datasets immediatley before the merge:
This is what the HAVE dataset looks like:
Obs job member
1 job01 mem01
2 job02 mem02
3 job02 mem02
4 job03 mem02
5 job03 mem02
6 job04 mem03
This is what the FILE2 dataset looks like:
Obs job command parm2
1 job01 COMMAND PARM1
2 job02 COMMAND PARM1
3 job02 COMMAND PARM2
4 job04 COMMAND PARM1
And this is what the output looks like after merging HAVE with FILE2.
Obs job member command parm2
1 job01 mem01 COMMAND PARM1
2 job02 mem02 COMMAND PARM1
3 job02 mem02 COMMAND PARM2
4 job03 mem02 COMMAND PARM2
5 job03 mem02 COMMAND PARM2
6 job04 mem03 COMMAND PARM1
Obs job member 1 job01 mem01 2 job02 mem02 3 job02 mem02 4 job03 mem02 5 job03 mem02
Your first dataset looks like a directory listing (members in PDS). If so then why is #3 a duplicate of #2.?
Not sure how you create this table but may-be a simple DISTINCT (or proc sort with nodupkey) will resolve the problem.
The first dataset(have) is built based on a cross reference file. Its indicating that those members are being called by those jobs. In this case its saying mem02 is being called twice by job02, and twice by jobs03. That is valid.
That’s why I had to include the proc sort nodupkey, to get rid of those duplicates. As without that the data file2 step stopped processing when it reached them. So I read the have_NoDups file into data file2 to allow it to process, but then I need to merge back with the original have file to ensure I don't drop records.
There isn't any parameter value for job03 in your sample data so not sure why it appears in your result set.
Have a look at below. Is that returning what you're after?
data job_member;
input (job member) ($);
datalines;
job01 mem01
job02 mem02
job02 mem02
job03 mem02
job03 mem02
job04 mem03
;
data file2;
input (job command parm) ($);
datalines;
job01 COMMAND PARM1
job02 COMMAND PARM1
job02 COMMAND PARM2
job04 COMMAND PARM1
;
proc sql;
/* create table want as*/
select distinct
f2.job,
jm.member,
f2.command,
f2.parm
from
job_member as jm
inner join
file2 as f2
on jm.job=f2.job
;
quit;
Job03 isn't appearing in my proc print of File2 as its getting dropped by the proc sort nodupkey. The key is on member and as mem02 is used in both Job02 and job03, its dropping job03 from this.
If I add Job03 to your file2 definition, it does work –
data file2;
input (job command parm) ($);
datalines;
job01 COMMAND PARM1
job02 COMMAND PARM1
job02 COMMAND PARM2
job03 COMMAND PARM1
job03 COMMAND PARM2
job04 COMMAND PARM1
;
Produces this –
job member command parm
-------------------------------------
job01 mem01 COMMAND PARM1
job02 mem02 COMMAND PARM1
job02 mem02 COMMAND PARM2
job03 mem02 COMMAND PARM1
job03 mem02 COMMAND PARM2
job04 mem03 COMMAND PARM1
My issue then is how do I produce my file2? I need to read the contents of each member to obtain the command and parm. Currently I'm passing the member name from the have dataset, though that requires me to use the proc sort nodupkey, which causes the issue with job03 being dropped.
By passing the member name from the have dataset into here, I'm trying to avoid reading the whole PDS, as it contains thousands of members I don't require. Though I may have to consider doing that now.
I not sure I am following this discussion, but it sounds to me like you have some metadata that lists which members (files) you want to read. Do you just have one PDS (folder) or are their multiple? Either way if the goal is to check only the files that need to be check you should first get the list of files. So if you start with something like this that maps members to jobs.
data job_members ;
input job :$10. member :$32. ;
cards;
job01 mem01
job02 mem02
job02 mem02
job03 mem02
job03 mem02
job04 mem03
;
You first want to reduce it to just a list of the members.
proc sort data=job_members(keep=member) out=members nodupkey;
by member;
run;
Then you can use that to drive the process of reading the text files.
Do you know the name of the PDS? Let's assume it is named PROD.PAYROLL.JCL like in this example:
So then your step to read the text files is something like:
data member_commands ;
set members ;
length dsn $80 ;
dsn=cats('PROD.PAYROLL.JCL(',member,')');
infile text filevar=dsn end=eof;
do while (not eof);
... here is you code to read the lines and figure out what you want ..
.. make sure to include an OUTPUT statement when you find something interesiting ...
end;
run;
Now you can re-combine the two datasets based on the value of MEMBER.
proc sql;
create table job_member_commands as
select *
from job_members A
inner join
member_commands B
on a.member = b.member
order by a.job,a.member
;
quit;
Hi Tom – I think what you have outlined is what I'm currently doing. There is just one PDS file to be read. The members to be read from that file are identified in the first step creating the have dataset. However by executing the proc sort nodupkey, its dropping some data. For example Job03 from your data job_members would get dropped here.
As a result Job03 would be missing when the proc sql is executed.
At this point I'm trying to understand if there is a method that would allow me to combine the data here without having to read every member from my PDS(as there are thousands of members there I'm not interested in).
Please read my previous answer more closely.
If you don't want PROC SORT NODUPKEY to eliminate data then make sure to specify a different output dataset than the input dataset.
Thanks Tom. I did have separate input & output datasets within my proc sort nodupkey. I've added the (keep=member) as you had, and dropped the distinct from my proc sql to match yours.
Its looking much better now.
Appreciate all the assistance.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.