saivenkat
Obsidian | Level 7
Member since
08-23-2018
- 16 Posts
- 6 Likes Given
- 2 Solutions
- 3 Likes Received
Activity Feed for saivenkat
- Got a Like for Re: Accessing db2 column that has name more than 18 characters. 11-23-2018 07:11 AM
- Posted Re: Accessing db2 column that has name more than 18 characters on Administration and Deployment. 11-23-2018 06:44 AM
- Posted Re: Accessing db2 column that has name more than 18 characters on Administration and Deployment. 11-22-2018 06:44 AM
- Posted Accessing db2 column that has name more than 18 characters on Administration and Deployment. 11-22-2018 03:41 AM
- Posted Re: How does a Linux batch job connect to metadata? on Administration and Deployment. 10-02-2018 04:24 PM
- Posted Re: Loop through datasets and variables in source file on New SAS User. 10-02-2018 07:52 AM
- Posted Re: Performance measurement on SAS Data Management. 10-01-2018 01:42 AM
- Got a Like for Re: Dealing with repeated observations for the same ID. 09-30-2018 10:51 PM
- Posted Re: Performance measurement on SAS Data Management. 09-30-2018 01:58 PM
- Posted Re: Dealing with repeated observations for the same ID on SAS Programming. 09-30-2018 12:08 PM
- Posted Re: Performance measurement on SAS Data Management. 09-28-2018 01:03 PM
- Posted Performance measurement on SAS Data Management. 09-28-2018 11:13 AM
- Liked Re: Drop variables contain specific string or missing values for Kurt_Bremser. 09-27-2018 04:09 AM
- Liked Re: Drop variables contain specific string or missing values for himalayan. 09-27-2018 04:09 AM
- Liked Re: Drop variables contain specific string or missing values for novinosrin. 09-27-2018 04:09 AM
- Liked Re: Drop variables contain specific string or missing values for novinosrin. 09-27-2018 04:09 AM
- Posted Re: Drop variables contain specific string or missing values on SAS Programming. 09-27-2018 04:06 AM
- Posted Re: Drop variables contain specific string or missing values on SAS Programming. 09-26-2018 02:27 PM
- Posted Re: Drop variables contain specific string or missing values on SAS Programming. 09-26-2018 02:19 PM
- Posted Drop variables contain specific string or missing values on SAS Programming. 09-26-2018 12:31 PM
11-23-2018
06:44 AM
2 Likes
The issue was resolved by upgrading the DB2 client on the machine where SAS Compute is installed.
11-22-2018
06:44 AM
Hi, The table CLAIMJOB_CCD has 59 columns. Only the column CAUSALPARTNOPREFIXWID exceeds the 18-character length, and another column named CAUSALPARTNOPREFIX also exists, so the access issue persists for this table. SQL pass-through works fine, but it drops the column CAUSALPARTNOPREFIXWID with the warning below:

proc sql;
connect to db2 as db1 (database=xx USER=xxx PASSWORD="{SAS002}xxx");
create table test as
select *
from connection to db1 (SELECT * FROM xxx.CLAIMJOB_CCD FETCH FIRST 3 ROWS ONLY);
disconnect from db1;
quit;

WARNING: Variable CAUSALPARTNOPREFIX already exists on file WORK.TEST.
NOTE: Table WORK.TEST created, with 3 rows and 58 columns.

However, the same table can be accessed directly from the DB2 client environment (both the client (SAS EG/DI) machine and the SAS Compute machine) without any issues. Another SAS environment (QA), which points to the same data source name, has no issue accessing the same table, so I suspect a problem with the production SAS environment, but I am not able to pinpoint where the problem is. Thanks,
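A possible workaround, sketched under the assumption that an explicit column list is acceptable: list the columns in the pass-through query and alias the long name to something unique within 18 characters, so the SAS-side normalized names no longer collide. The table, schema, and original column names come from the post; the alias CAUSALPREFIXWID is hypothetical.

```sas
/* Sketch: alias the >18-character column inside the pass-through
   SELECT so the truncated SAS names stay unique.
   CAUSALPREFIXWID is a hypothetical alias. */
proc sql;
  connect to db2 as db1 (database=xx USER=xxx PASSWORD="{SAS002}xxx");
  create table test as
  select *
  from connection to db1
    (SELECT CAUSALPARTNOPREFIX,
            CAUSALPARTNOPREFIXWID AS CAUSALPREFIXWID
     FROM xxx.CLAIMJOB_CCD
     FETCH FIRST 3 ROWS ONLY);
  disconnect from db1;
quit;
```

This sidesteps the name collision but does not explain why only the production environment is affected.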
11-22-2018
03:41 AM
Hi Everyone, The error below is thrown while accessing the DB2 table:

ERROR: This DBMS table or view cannot be accessed by the SAS System because it contains column names that are not unique when a SAS normalized (uppercased) compare is performed. See "Naming Conventions" in the SAS/ACCESS documentation.

Analysis: the DB2 table has two different columns named CAUSALPARTNOPREFIX and CAUSALPARTNOPREFIXWID. The name CAUSALPARTNOPREFIXWID exceeds 18 characters, so SAS reads only the first 18 characters. SAS read this column without any issue until last week; the error above started on 19 Nov 2018. We have two environments, QA and production, and the issue persists in production only, so I compared the environments using the steps below:

%macro cmp(command);
filename p pipe &command lrecl=32767;
data _null_;
infile p;
input;
put _infile_;
run;
%mend;
option LS=256;
%cmp("set");

I noticed that the environment variable IBM_DB_HOME was missing in production. I added it to the sasuser configuration and confirmed it appears in production again, but that didn't fix the issue. I then executed the following in both environments for comparison, with no success yet:

options sastrace=',,d,d' sastraceloc=saslog nostsuffix;
proc sql;
describe table tstpro.claimjob_ccd;
quit;

ACCESS ENGINE: Exiting dbidsci with ERROR, rc=0x80121023

The production environment shows the error above, but no details are available on the web to explore. Any tips to debug further would be appreciated. Thanks,
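A small sketch related to the environment comparison above: a variable set in the login shell may not propagate to a SAS session spawned by a server, so it can help to check from inside the session itself whether IBM_DB_HOME is visible to the SAS process. %SYSGET is a standard SAS macro function; the variable name comes from the post.

```sas
/* Print the value of IBM_DB_HOME as seen by this SAS process.
   An empty value here means the session never received the variable,
   regardless of what the login shell shows. */
%put NOTE: IBM_DB_HOME as seen by SAS = %sysget(IBM_DB_HOME);
```

If this prints empty in production but not in QA, the variable needs to be set where the workspace server is launched, not just in the interactive shell.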
10-02-2018
04:24 PM
Check the method below; it could be helpful in debugging:

proc metaoperate
server="sasmetadataservername"
port=8561
userid="id"
password="pwd"
action=status;
run;

We recently fixed an access issue for AD users using the links below; the method above was helpful for confirming that the SAS metadata configuration changes were effective:
http://support.sas.com/kb/39/891.html
http://support.sas.com/kb/51/911.html

Did you manage to find a fix?
10-02-2018
07:52 AM
The program below would be helpful for consolidating the variables that have only missing values across a library:

proc sql noprint;
create table tblscn as select memname, libname from dictionary.tables where libname='TMP';
quit;
%macro loop;
%local num i mv_memname mv_libname;
%let dsid = %sysfunc(open(work.tblscn));
%let num = %sysfunc(attrn(&dsid,nlobs));
%let rc = %sysfunc(close(&dsid));
%do i=1 %to &num.;
data _null_;
p = &i;
set work.tblscn point=p;
call symputx('mv_memname',memname);
call symputx('mv_libname',libname);
stop;
run;
data _null_;
set &mv_libname..&mv_memname. end=last;
array _a{*} $ _character_;
array _b{*} _numeric_;
length varname memname libname $32;
if _n_ = 1 then do;
call missing(var, varn, varname, memname, libname);
declare hash nmc (ordered:"a");
nmc.definekey("var");
nmc.definedata('var','varname','memname','libname');
nmc.definedone();
declare hiter iter('nmc');
do var=1 to dim(_a);
nmc.add(key:var,data:var,data:vname(_a[var]),data:"&mv_memname.",data:"&mv_libname.");
put _all_;
end;
declare hash nmn (ordered:"a");
nmn.definekey("varn");
nmn.definedata('varn','varname','memname','libname');
nmn.definedone();
declare hiter niter('nmn');
do varn=1 to dim(_b);
nmn.add(key:varn,data:varn,data:vname(_b[varn]),data:"&mv_memname.",data:"&mv_libname.");
put _all_;
end;
end;
rc=iter.first();
do while (rc = 0);
if _a[var] ne '' then do;
_x=var;
rc=iter.next();
nmc.remove(key:_x);
end;
else rc=iter.next();
end;
rc=niter.first();
do while (rc = 0);
if _b[varn] ne . then do;
_y=varn;
rc=niter.next();
nmn.remove(key:_y);
end;
else rc=niter.next();
end;
if last then do;
nmc.output(dataset:'mc_ds');
nmn.output(dataset:'mcn_ds');
end;
run;
proc append base=mc_ds data=mcn_ds(rename=varn=var) force; run;
proc append base=all_missing_var data=mc_ds force; run;
%end;
%mend loop;
%loop;
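For a single dataset, a much shorter alternative to the hash-based scan above is PROC FREQ with the NLEVELS option, whose ODS NLevels table reports per variable how many nonmissing levels exist; a variable with zero nonmissing levels is entirely missing. This is a sketch, not the poster's method, and the dataset name tmp.somedata is hypothetical; on very wide tables this can be expensive.

```sas
/* Sketch: find variables that contain only missing values. */
ods select none;                   /* suppress the printed tables  */
ods output nlevels=lvl;            /* capture the NLevels summary  */
proc freq data=tmp.somedata nlevels;
  tables _all_;
run;
ods select all;

proc print data=lvl;
  where NNonMissLevels = 0;        /* entirely-missing variables   */
run;
```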
10-01-2018
01:42 AM
The partitioning technique reduces the elapsed time by 50 to 75% (60 mins reduced to 15 mins in one place, 80 mins reduced to 25 mins in another). Most of the remaining time is consumed by overhead processes like the following:
1) preparing the source data (creating formats, sorting the source data, and determining the partition size when partitions are based on business keys)
2) consolidating the partitioned target data into a single dataset
3) cleaning up the intermediate tables

We have 1 processor chip; the other CPU resources are listed below:

cpuinfo: GenuineIntel Intel(R) Xeon(R) CPU E5-4655 v3 @ 2.90GHz
cpuinfo: Hz=3126.335 bogomips=5808.16
cpuinfo: ProcessorChips=1 PhyscalCores=6
cpuinfo: Hyperthreads =2 VirtualCPUs =48
# of CPUs: 48

Can you highlight the risks of load failures that we may expect in future if the CPU utilization goes beyond 95%? I am more interested in knowing whether pushing CPU utilization up to 100% during off-peak hours (for batch loads), when it is the only process executing, can be considered an optimal way of utilizing the resources. I had seen issues with work tables and views (and applied a fix by replacing these with permanent tables): SAS had difficulties closing the sessions when it had created work tables for processing, as highlighted in my previous post. I have executed it several times and confirmed no errors are thrown. I want to be ready with fixes in advance, if any issue can be expected in future.
09-30-2018
01:58 PM
Thank you very much for the detailed response! I would like a few more insights about memory and CPU utilization. If I allow the SAS programs to process a couple of billion records, free memory varies between 0 and 1% until the process completes, but free CPU never drops below 95%. I applied the partitioning technique for the same amount of data using the dataset options firstobs and obs, and allowed all the partitions to run concurrently. The jobs completed without any issue, and both memory and CPU free percent stayed below 1% until the process completed. Can I consider pushing the CPU utilization beyond 95% an optimal way of processing the data? Do you foresee any risks of load failures in such conditions? I had to ensure the conditions below to make the partitioning technique successful:
1) No temporary datasets in the SAS program
2) No views in the SAS program
3) Partitions distributed uniformly, either by business key or by total obs
Thanks again!
09-30-2018
12:08 PM
1 Like
data want(drop=lID lCatg_1 lCatg_2);
set have;
by id;
lID=lag(ID);
lCatg_1=lag(Catg_1);
lCatg_2=lag(Catg_2);
if lid eq id then do;
if lCatg_1 eq Catg_1 and lCatg_2 eq Catg_2 then delete;
if lCatg_1 ne Catg_1 and lCatg_2 eq Catg_2 then Catg_2='';
if lCatg_1 eq Catg_1 and lCatg_2 ne Catg_2 then Catg_1='';
if Catg_1 eq '' and Catg_2 eq '' then delete;
/* if lCatg_2 eq Catg_2 then delete; */
end;
if not (first.id and last.id) and Catg_1 eq '' and Catg_2 eq '' then delete;
run;
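The thread does not show the input data, so here is a hypothetical have dataset to illustrate the step above; the variable names come from the step, and the input must be sorted by id for the BY statement to work.

```sas
/* Hypothetical sample input for the lag-based dedup step above. */
data have;
  input id Catg_1 $ Catg_2 $;
datalines;
1 A X
1 A X
1 B X
2 A Y
;
run;
proc sort data=have; by id; run;
```

With this input, the exact duplicate for id 1 should be deleted, and the row where only Catg_1 changed should keep Catg_1 but have Catg_2 blanked.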
09-28-2018
01:03 PM
I wanted to compare the IO rate between the Linux dd or cp commands and SAS procedures or data steps. When I executed the step below, it took 18 minutes to process datasets have_a and have_b, where each dataset is about 195 GB, so can I consider the SAS IO rate here to be 368 MB/s? That looks optimal compared to processing in Linux, where SAS Compute is installed.

data work.censusmerge;
set have_a have_b;
by key;
run;

NOTE: There were 2176165646 observations read from the data set have_a.
NOTE: There were 2176165646 observations read from the data set have_a.
NOTE: The data set WORK.CENSUSMERGE has 4352331292 observations and 18 variables.
NOTE: DATA statement used (Total process time):
real time 18:05.02
user cpu time 9:59.07
system cpu time 8:05.97
memory 24984.06k
OS Memory 45516.00k
Timestamp 09/28/2018 09:27:49 PM
Step Count 6 Switch Count 1492
Page Faults 0
Page Reclaims 102891
Page Swaps 0
Voluntary Context Switches 5048
Involuntary Context Switches 1313
Block Input Operations 785244120
Block Output Operations 817162272
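The 368 MB/s figure above can be checked with a quick back-of-the-envelope calculation; note it counts only the read volume, ignoring the roughly 390 GB the step also writes out.

```sas
/* Two ~195 GB inputs read in 18:05 of real time. */
data _null_;
  gb_read  = 2 * 195;          /* total GB read            */
  seconds  = 18*60 + 5;        /* 18:05.02 real time       */
  mb_per_s = gb_read * 1024 / seconds;
  put mb_per_s= 8.1;           /* roughly 368 MB/s         */
run;
```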
09-28-2018
11:13 AM
Hello Everyone! I copied a SAS file in Linux using the command

dd if=have.sas7bdat of=test.sas7bdat

to measure the IO throughput rate of the environment, and noticed the speed varies from 250 MB/s to 400 MB/s:

209194057728 bytes (209 GB) copied, 834.332 s, 251 MB/s

Whereas when I sort the data file using

proc sort data=have; by key; run;

it sorted the data in 621 seconds. The table is about 195 GB, so shall I consider the SAS IO rate to be 321 MB/s? Am I measuring the performance the right way? Can you shed some light on measuring SAS performance in a standard manner, like using the dd command above in a Linux environment? Thanks in advance!
09-27-2018
04:06 AM
data _null_;
set have end=last;
array _a{*} $ _character_;
length name $12;
if _n_ = 1 then do;
call missing(idx, name);
declare hash miss (ordered:"a");
miss.definekey("idx");
miss.definedata('idx','name');
miss.definedone();
declare hiter iter('miss');
do idx=1 to dim(_a);
miss.add(key:idx,data:idx,data:vname(_a[idx]));
end;
end;
rc=iter.first();
do while (rc = 0);
if _a[idx] ne '?' and _a[idx] ne '' then do;
_x=idx;
rc=iter.next();
miss.remove(key:_x);
end;
else rc=iter.next();
end;
if last then miss.output(dataset:'drop');
run; I wanted to drop the columns _a1.._a3 from the sample dataset "have".. since the columns _a1.._a3 has got the values as "x" across the table.. changing the line as if _a[idx] ne 'x' then do; worked fine.. and processing time saved more than 50%. Master piece... Thank you very much!
09-26-2018
02:27 PM
The step should get rid of a variable if it has only missing values or a specific string like '?' across the whole table.
09-26-2018
02:19 PM
It didn't work; can you revisit the steps? The processing time reduced from 16 mins to 4 mins, but the output did not list the columns that have only the string '?'.
09-26-2018
12:31 PM
Hello Everyone, I see many posts around this concept, but I have a unique case: a table with approximately 6000 variables and 2 million records, sizing up to 100 GB. I want to drop a column if it has either only missing values or a single value like '?'; 1000+ of the 6000 columns contain only missing values or '?'. I have been using the following program, which isn't quite optimal. As an alternative, I have been trying to prevent these columns from accumulating in the first place. Can you suggest better options, if you think of any?

data _null_;
set have end=last;
if _n_ = 1 then do;
declare hash cnt();
rc = cnt.definekey('nmc');
rc = cnt.definedata('nmc');
rc = cnt.definedone();
end;
array _a{*} $ _character_;
do i=1 to dim(_a);
if _a{i} ne '?' and _a{i} ne '' then do;
nmc=vname(_a{i});
rc = cnt.find();
if rc then do;
cnt.add();
end;
if last then
rc = cnt.output(dataset: "nmc_bo");
end;
end;
run;
proc sql noprint;
select name into :mv_mc separated by ',' from dictionary.columns where libname='WORK' and memname='HAVE' and type='char' and name not in
(select nmc from nmc_bo);
quit;
%put Note: Columns that will be discarded are &mv_mc;
proc sql noprint;
alter table have drop &mv_mc.;
quit;
proc datasets lib=work nolist;
delete nmc_bo;
quit;
09-24-2018
03:14 PM
I have 2 different tables, and both tables have a couple of billion records. The requirement is to fetch the records from table1 if the record doesn't exist in table2. I felt it is best to find the missing values in the key range that belongs to table2, and finally fetch the records from table1 if they match the missing values or fall beyond that range. Thanks for your prompt response!