Solved: Re: BASE SAS

molla · Posted 07-23-2017 03:03 PM

Hi,

Below attached file is the interview question which I faced,below is the code which I have developed for the same,let me know if that code is correct or any changes are required ,pld help me out regarding this.

data test1;
input CustomerId $1-5 DomicileCountryCode $7-9
IncorporationCountryCode $11-13 ParentCustomerId $15-20 CustomerType $23-27;
cards;
A1111 SG X1111 BANK
B2222 HK X1111 BANK
X1111 US US
X2222 TW TW
C3333 CN X2222 CORP
;
run;

data test2;
input CustomerId $ ScoreCardId CustomerCreditGrade $
ScoreCardDate ScoreCardApprovedFlag $;
informat ScoreCardDate ddmmyy10.;
format ScoreCardDate ddmmyy10.;
cards;
X1111 32104 AA+ 31/12/2016 Y
X1111 32105 AAA 31/03/2017 N
X1111 32103 AA 30/6/2016 Y
X1111 32102 AA- 31/03/2016 Y
X2222 40321 BBB 31/12/2016 N
X2222 40322 BBB+ 31/12/2016 Y
;
run;

proc sql;
create table test3 as
select CustomerId,ScoreCardId,CustomerCreditGrade,ScoreCardDate,
ScoreCardApprovedFlag
from test2 where ScoreCardApprovedFlag='Y' ;
quit;

proc sql;
create table test4 as
select CustomerId,ScoreCardId,CustomerCreditGrade,ScoreCardDate,
ScoreCardApprovedFlag
from test3 group by CustomerId
having ScoreCardDate=max(ScoreCardDate) ;
quit;

proc sql;
create table test5 as
select b.* ,a.CustomerType
from test1 a,test4 b where a.ParentCustomerId=b.CustomerId;
quit;

proc sql;
create table test6 as
select b.* ,a.IncorporationCountryCode
from test1 a,test5 b where a.CustomerId=b.CustomerId;
quit;

proc sql;
create table final as
select distinct a.CustomerId,b.ScoreCardId,b.CustomerCreditGrade,a.DomicileCountryCode,
b.IncorporationCountryCode,b.CustomerType
from test1 a right join test6 b
on a.ParentCustomerId=b.CustomerId
or a.CustomerId=b.CustomerId group by b.CustomerCreditGrade ;
quit;

gamotte · Posted 07-24-2017 04:35 AM

Hello,

I did not check your code but maybe the fact that you use many intermediary datasets influenced the decision.

Using too many steps to perform one task make the code difficult to read and maintain because it makes it

hard to keep track of the contents of each dataset. Also, naming your datasets test1 and test2 is not a good idea.

Try to use variable and dataset names that reflects their actual contents.

proc sql noprint;
	CREATE TABLE want AS
	SELECT a.CustomerId, 
		   c.ScoreCardId, c.CustomerCreditGrade, 
           COALESCE(a.DomicileCountryCode,b.DomicileCountryCode) AS DomicileCountryCode, 
           COALESCE(a.IncorporationCountryCode,b.IncorporationCountryCode) AS IncorporationCountryCode
	FROM Customer a
	LEFT JOIN Customer b
	ON a.ParentCustomerId=b.CustomerId
	LEFT JOIN ScoreCard (WHERE=(ScoreCardApprovedFlag="Y")) c
	ON strip(c.CustomerId)=strip(COALESCE(a.ParentCustomerId, a.CustomerId))
	GROUP BY a.CustomerId
	HAVING ScoreCardDate=max(ScoreCardDate);
quit;

View solution in original post

Reeza · Posted 07-23-2017 05:34 PM

Are you being asked to solve these independently as part of a job interview?

As someone who hires and administers these types of test I know what would happen if saw the questions/posts on here.

@molla wrote:

Hi,

Below attached file is the interview question which I faced,below is the code which I have developed for the same,let me know if that code is correct or any changes are required ,pld help me out regarding this.

data test1;
input CustomerId $1-5 DomicileCountryCode $7-9
IncorporationCountryCode $11-13 ParentCustomerId $15-20 CustomerType $23-27;
cards;
A1111 SG X1111 BANK
B2222 HK X1111 BANK
X1111 US US
X2222 TW TW
C3333 CN X2222 CORP
;
run;

data test2;
input CustomerId $ ScoreCardId CustomerCreditGrade $
ScoreCardDate ScoreCardApprovedFlag $;
informat ScoreCardDate ddmmyy10.;
format ScoreCardDate ddmmyy10.;
cards;
X1111 32104 AA+ 31/12/2016 Y
X1111 32105 AAA 31/03/2017 N
X1111 32103 AA 30/6/2016 Y
X1111 32102 AA- 31/03/2016 Y
X2222 40321 BBB 31/12/2016 N
X2222 40322 BBB+ 31/12/2016 Y
;
run;

proc sql;
create table test3 as
select CustomerId,ScoreCardId,CustomerCreditGrade,ScoreCardDate,
ScoreCardApprovedFlag
from test2 where ScoreCardApprovedFlag='Y' ;
quit;

proc sql;
create table test4 as
select CustomerId,ScoreCardId,CustomerCreditGrade,ScoreCardDate,
ScoreCardApprovedFlag
from test3 group by CustomerId
having ScoreCardDate=max(ScoreCardDate) ;
quit;

proc sql;
create table test5 as
select b.* ,a.CustomerType
from test1 a,test4 b where a.ParentCustomerId=b.CustomerId;
quit;

proc sql;
create table test6 as
select b.* ,a.IncorporationCountryCode
from test1 a,test5 b where a.CustomerId=b.CustomerId;
quit;

proc sql;
create table final as
select distinct a.CustomerId,b.ScoreCardId,b.CustomerCreditGrade,a.DomicileCountryCode,
b.IncorporationCountryCode,b.CustomerType
from test1 a right join test6 b
on a.ParentCustomerId=b.CustomerId
or a.CustomerId=b.CustomerId group by b.CustomerCreditGrade ;
quit;

molla · Posted 07-24-2017 02:27 AM

Hi,When I faced the interview this is the question which they asked,the interview process is completed,I was not selected ,am trying to solve this questions and trying to knw the mistakes whcih i have did,so that the same is not repeated further,trying to improve my programming skills also.

Kurt_Bremser · Posted 07-24-2017 02:12 AM

Using "BASE SAS" as the title of a post in the Base SAS Programming forum on the SAS communities website is not a sign of overwhelming creativity.

When you posted your question, you saw this (among other useful hints):

✔ Have a descriptive subject line, i.e., How do I ‘XYZ’?

Please take not of that.

An interview is made to get a picture of the candidate's capabilities of the subject in question. The fact that you need to go here to solve that interview question is a clear sign that you are not yet ready for the job offered.

Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
The macro for direct download as ZIP
How to post code
Please vote for Provide Sequential Search Capability for Hash Objects
How to deal with locked files on UNIX

molla · Posted 07-24-2017 02:37 AM

I was not selected, am trying to solve that,I have written the code,but am not sure whether that is the correct one or not,so I have posted that so that I can know whther that code is correct or not and If wrong am able to know in waht way that is wrong ,that may be helpful for me in further interviews.

Shmuel · Posted 07-24-2017 04:35 AM

If you have a computer at home check if you can install and use SAS University Edition.

It is free but have its minimum requirements to install, like 64 BIT and others.

When you be able to run your code (on your PC or on some other's) you will be able too

to check your code. Then if you cannot fix it, return to the forum.

gamotte · Posted 07-24-2017 04:35 AM

Hello,

I did not check your code but maybe the fact that you use many intermediary datasets influenced the decision.

Using too many steps to perform one task make the code difficult to read and maintain because it makes it

hard to keep track of the contents of each dataset. Also, naming your datasets test1 and test2 is not a good idea.

Try to use variable and dataset names that reflects their actual contents.

proc sql noprint;
	CREATE TABLE want AS
	SELECT a.CustomerId, 
		   c.ScoreCardId, c.CustomerCreditGrade, 
           COALESCE(a.DomicileCountryCode,b.DomicileCountryCode) AS DomicileCountryCode, 
           COALESCE(a.IncorporationCountryCode,b.IncorporationCountryCode) AS IncorporationCountryCode
	FROM Customer a
	LEFT JOIN Customer b
	ON a.ParentCustomerId=b.CustomerId
	LEFT JOIN ScoreCard (WHERE=(ScoreCardApprovedFlag="Y")) c
	ON strip(c.CustomerId)=strip(COALESCE(a.ParentCustomerId, a.CustomerId))
	GROUP BY a.CustomerId
	HAVING ScoreCardDate=max(ScoreCardDate);
quit;

gamotte · Posted 07-24-2017 04:37 AM

I did not do it in my example but such code should also be commented to explain in a few word its purpose.

andreas_lds · Posted 07-24-2017 07:20 AM

This is, of course, nonsense, especially when posting a proc sql.

@gamotte wrote:

Hello,

I did not check your code but maybe the fact that you use many intermediary datasets influenced the decision.

Using too many steps to perform one task make the code difficult to read and maintain because it makes it

hard to keep track of the contents of each dataset.

RW9 · Posted 07-24-2017 04:37 AM

"The fact that you need to go here to solve that interview question is a clear sign that you are not yet ready for the job offered." - I dont entirely agree with that statement. I think, when in an interview situation (in fact more than day to day) due to the nature of not having the right tools or the ability to iterate a process until you get it working, showing the knowedge of where to search for relevant information can be quite helpful. As we always say RTFM, i.e. I don't know every command in SAS off the top of my head or its syntax, however I do know to the required information by searching manual or sites like this.

Of course, that would change if it is basic concepts in SAS or logic which is the problem.

As for the OP question, seems to me to be a simple:
- sort the scorecard data using nodupkey where =Y to get the latest Y record

- left join this information to the customer data based on (customer_id=customer_id) or (customer_id=parent_customer_id).

SASUser_22 · Posted 03-26-2018 10:46 AM

/*Few cells are missing in the original data, I replaced them with null values*/

data Customer;
infile datalines missover;
input CustomerId$ DomicileCountryCode $ IncorporationCountryCode $ ParentCustomerId $ CustomerType $;
datalines;
A1111 SG . X1111 BANK
B2222 HK . X1111 BANK
X1111 US US . .
X2222 TW TW . .
C3333 CN . X2222 CORP
;
run;

data Scorecard (where=(ScoreCardApprovedFlag='Y'));
input CustomerId $ ScoreCardId CustomerCreditGrade $
ScoreCardDate ScoreCardApprovedFlag $;
informat ScoreCardDate ddmmyy10.;
format ScoreCardDate ddmmyy10.;
cards;
X1111 32104 AA+ 31/12/2016 Y
X1111 32105 AAA 31/03/2017 N
X1111 32103 AA 30/6/2016 Y
X1111 32102 AA- 31/03/2016 Y
X2222 40321 BBB 31/12/2016 N
X2222 40322 BBB+ 31/12/2016 Y
;
run;

proc sort data = Scorecard ;
by CustomerId descending ScoreCardDate ;
run;

proc sort data =Scorecard nodupkey;
by customerid;
run;

/*Manually passing the values of incorporation country code, since they are not given for all the customers*/
data Scorecard;
set Scorecard;
if customerid= 'X1111' then IncorporationCountryCode = 'US';
if customerid= 'X2222' then IncorporationCountryCode = 'TW';
run;

data customer2;
set Customer;
new=COALESCEc(ParentCustomerId,CustomerId);
run;

proc sql;
create table final as
select
a.CustomerId,
b.ScoreCardId,
b.CustomerCreditGrade,
a.DomicileCountryCode,
b.IncorporationCountryCode

from customer2 a
left join Scorecard b
on a.new = b.customerid
order by 1;
quit;

Classroom Training Available!