BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
priyanka14
Fluorite | Level 6

Hello,

I need help on SAS EG Code for my below problem. I am usins SAS EG version 7.1

I have a dataset which includes few columns. The excel file is attached with the mail.

I have mentioned the input and output sheet ( the way i want my output).

My problem statment :

in input sheet, first i want to sort the month column by descending order for each category and then i have to first find out the number less than 3 in "used" column for each variable in "category" column . Then i have to select  all the rows which are below that number ( less than 3 , ex 2) . This i need to do for each variable in "category" column. 

 

 

I got stuck with this almost from 2 days and not even able to proceed further.

Your quick support will highly appreciated.

 

Thanks

Priyanka

 

 

 

1 ACCEPTED SOLUTION

Accepted Solutions
tsap
Pyrite | Level 9

I'm not sure what the issue is with the prior solution. The dataset output in the end matches the exact requirements you just stated in your reply.

 

 

snapshot_dummydata.PNG

 

As you can clearly see, for Category A the output dataset has from Jul18th to Jan18. So based on your criteria provided, the code is performing as you need it to perform.All of the 'non-highlighted rows' in your screenshot are the observations on my screen shot. And I just ran my logic to generate this.

View solution in original post

9 REPLIES 9
tsap
Pyrite | Level 9
/*   DUMMY DATASET   */
DATA WORK.HAVE;
FORMAT 	 Category $1. 	Month MonYY5.	used 8.	new_used 8.	variety $3.;
INFORMAT Category $1. 	Month MonYY5.	used 8.	new_used 8.	variety $3.;
INPUT    Category	    Month	        used	new_used	variety;
INFILE DATALINES DLM='|' DSD;
DATALINES;
A|Jan-18|6|20|MI
B|Jan-18|8|20|MI
C|Jan-18|8|20|AZ
D|Jan-18|8|20|AZ
A|Feb-18|5|20|AP
B|Feb-18|12|20|AP
C|Feb-18|23|20|MI
D|Feb-18|23|20|MI
A|Mar-18|5|20|NOK
B|Mar-18|3|20|NOK
C|Mar-18|3|20|AP
D|Mar-18|3|20|AP
A|Apr-18|10|20|AC
B|Apr-18|18|20|AC
C|Apr-18|18|20|NOK
D|Apr-18|18|20|NOK
A|May-18|9|20|PS
B|May-18|5|20|PS
C|May-18|5|20|AC
D|May-18|5|20|AC
A|Jun-18|8|20|CD
B|Jun-18|6|20|CD
C|Jun-18|6|20|PS
D|Jun-18|6|20|PS
A|Jul-18|3|20|MI
B|Jul-18|7|20|MI
C|Jul-18|7|20|CD
A|Aug-18|2|2|EH
B|Aug-18|2|2|EH
C|Aug-18|2|2|MI
D|Aug-18|2|2|CD
A|Sep-18|12|20|EF
B|Sep-18|12|20|EF
C|Sep-18|12|20|EH
D|Sep-18|7|20|MI
A|Oct-18|12|20|AZ
C|Oct-18|10|20|EF
;


PROC SORT DATA=WORK.HAVE; BY Category DESCENDING Month; RUN;


/*   ID MAX DATE per Category where Used value LT 3  */
PROC SQL;
CREATE TABLE WORK.WANT_A	AS
	SELECT
		  Category
		, MAX(Month)	AS MAXMONTH FORMAT=MONYY5.

	FROM WORK.HAVE
	WHERE Used < 3
	GROUP BY Category
	ORDER BY Category, MAXMONTH;
QUIT;


/*   JOIN MAX DATE TABLE BACK TO FULL TABLE */
PROC SQL;
CREATE TABLE WORK.WANT_Final	AS
	SELECT
		a.*

	FROM		WORK.Have		AS a
	LEFT JOIN 	WORK.WANT_A		AS b	ON a.Category=b.Category
	WHERE a.MONTH < b.MAXMONTH;
QUIT;

priyanka14
Fluorite | Level 6

Thanks for the respone.

Using this code, it is only removing the rows where "Used" column is less than 3 . But my requirement is also to remove all those rows which is above less than 3 Category wise. I have also attached the ouput sheet , the way i want to have my output data.

I am attaching the snap also to make it more clear . The yellow highlighted rows , i want to remove and rest i want as my output. The output should look like the same (ex category A, Month from Jul18-Jan18,used column 3-6 ....like that). Category should not be mixed. out put : first non-highlighted rows of A, then B non highlighted rows like that i want in my output.

Want to remove the highlighted rows alsoWant to remove the highlighted rows alsoI hope this will help you to understand extactly what kind of output i am looking for.I guess may be we need to put some loop .

Thanks in advance. Looking for the code .

Smiley Happy

 

 

 

tsap
Pyrite | Level 9

I'm not sure what the issue is with the prior solution. The dataset output in the end matches the exact requirements you just stated in your reply.

 

 

snapshot_dummydata.PNG

 

As you can clearly see, for Category A the output dataset has from Jul18th to Jan18. So based on your criteria provided, the code is performing as you need it to perform.All of the 'non-highlighted rows' in your screenshot are the observations on my screen shot. And I just ran my logic to generate this.

priyanka14
Fluorite | Level 6

Thanks a lot.

 

It works . I missed a line of a code so I was facing the error.

 

Now I corrected it and it is working as per my expectations.

 

once again Thanks. Really appreciate the efforts.

 

Thanks

Priyanka

tsap
Pyrite | Level 9

You're welcome. Glad I could help.

priyanka14
Fluorite | Level 6

Hello ,

I am in urgent need of a SAS EG code. I need the same thing which I asked you earlier but the only thing is like now I added one more category. I have attached the excel sheet also which has input and the output which I required.

First of all , I need to sort the category, category2,Month in descending order . For each category , there are four category2 level ( 1,2,3,4).

 So My requirement is that for each category and its each category2 level , first I need check used column . and in used column , I have to find out the number greater than or equal to 3 and  need to select only the continuous rows. Suppose with Category 'A' having Category2 as '4', in "used"column the data are 12,12,2,3,8,10 , then I need to select all rows only from used column data 3 ( 3,8,10)

Same thing I have highlighted in" sorted data" that whatever the data is required in output sheet.

 

and if in some day I have one more category after category2 like Category3 and again the I need the same kind of output looking this added category also. then can you suggest me where I need to make changes.

 

In this attached sheet , the input sheet and then after sorting the input sheet by category,category2 and months and then the output sheet ( the way I want my result).

 

Thanks

Reeza
Super User
Show the code you tried.
priyanka14
Fluorite | Level 6
Hello,
With the addition of whatever I asked earlier. Now suppose for each category , there are 4 different sub-groups.like After this category column, I have another column let say group and for category A- there r 4 grps (G1,G2,G3,G4) and same with other Categories also( G1,G2,G3,G4 ) .
Now My requirement is same like earlier one.ie.want to sort the month column by descending order for each category with its group level and then i have to first find out the number less than 3 in "used" column for each variable in "category" along with group level column . Then i have to select all the rows which are below that number ( less than 3 , ex 2) . This i need to do for each variable in "category" and its group level.

I tried to create a new column by concatenate the category and group column and then follow the same code provided by you .
But it's not working in this case. Can you suggest me what else I can do.i tried for macro also but failed
priyanka14
Fluorite | Level 6

Looking for the solution if anyone can help me out