BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
kajal_30
Quartz | Level 8

Hello all 

I have a dataset like below (have) and need an output (want)

Please help me with the code in terms of proc sql or data step

TIA

HAVE WANT
 IDPRICEYEARMONTH IDPRE_PRICEPRE_YEARPOST_PRICEPOST_YEAR
 14554122312 145541223124554052404
 14554232402 245540623124554092410
 14554042403 345540724024554052409
 14554052404      
 24554062312      
 24554072402      
 24554082403      
 24554092404      
 24554052409      
 24554092410      
 34554072402      
 34554082403      
 34554092404      
 34554022405      
 34554032406      
 34554042407      
 34554042408      
 34554052409      
1 ACCEPTED SOLUTION

Accepted Solutions
Tom
Super User Tom
Super User

Use a different selection criteria for when to remember the price.

data want;
  set have;
  by id;
  retain pre_price pre_year ;
  if first.id then call missing(pre_price,pre_year);
  if missing(pre_price) then do;
    pre_price=price;
    pre_year=yearmonth;
  end;
  if last.id;
  rename price=post_price yearmonth=post_year;
run;

View solution in original post

12 REPLIES 12
Tom
Super User Tom
Super User

Why do you need SQL code if the data step code works??

kajal_30
Quartz | Level 8

its a requirement from the client as they need most of the code in sql to be run on other applications with different language than SAS for comparing results.  

Tom
Super User Tom
Super User

@kajal_30 wrote:

its a requirement from the client as they need most of the code in sql to be run on other applications with different language than SAS for comparing results.  


Then look for solutions in that language.

What flavor of SQL are you talking about? Each one has its own added features and quirky implementations.  If you are lucky they are using a version of SQL that is newer than what SAS uses so they can use what are called "windowing" features of SQL.

kajal_30
Quartz | Level 8

I am talking about SAS SQL only we are not prioritizing other language our milestone is to get the SAS code working. So you are saying it is not possible in SAS to use sql to get this output ? 

 

yabwon
Onyx | Level 15

1) Since you are doing this in SAS, why not to use DATA STEP? It is _the_ tool to do such tasks. 

2) SQL does not support firs. and last. by design. You would have to do it differently. 

 

3) I would not recommend this method for a production use, but if you really have to you could try something like this:  

data have;
input ID	PRICE	YEARMONTH;
cards;
1	455412	2312
1	455423	2402
1	455404	2403
1	455405	2404
2	455406	2312
2	455407	2402
2	455408	2403
2	455409	2404
2	455405	2409
2	455409	2410
3	455407	2402
3	455408	2403
3	455409	2404
3	455402	2405
3	455403	2406
3	455404	2407
3	455404	2408
3	455405	2409
;
run;

proc print;
run;

proc sql;
  select y.ID
  ,sum(case when y.n=y.min_n then y.PRICE else . end) as PRE_PRICE
  ,sum(case when y.n=y.min_n then y.YEARMONTH else . end) as PRE_YEAR
  ,sum(case when y.n=y.max_n then y.PRICE else . end) as POST_PRICE
  ,sum(case when y.n=y.max_n then y.YEARMONTH else . end) as POST_YEAR
  from
  (
    select c.* 
    , min(n) as min_n
    , max(n) as max_n
    from
    (
      select x.ID, x.PRICE, x.YEARMONTH, count(distinct ym) as n
      from
      (
        select a.*, b.YEARMONTH as ym 
        from have as a
           , have as b
        where a.ID = b.ID 
          and a.YEARMONTH >= b.YEARMONTH
      ) as x
      group by x.ID, x.PRICE, x.YEARMONTH
    ) as c
    group by c.ID
  ) as y
  group by y.ID
  order by y.ID
  ;
quit;

Bart

 

_______________
Polish SAS Users Group: www.polsug.com and communities.sas.com/polsug

"SAS Packages: the way to share" at SGF2020 Proceedings (the latest version), GitHub Repository, and YouTube Video.
Hands-on-Workshop: "Share your code with SAS Packages"
"My First SAS Package: A How-To" at SGF2021 Proceedings

SAS Ballot Ideas: one: SPF in SAS, two, and three
SAS Documentation



kajal_30
Quartz | Level 8

Thank you can also please share the data step code and as my code is selecting all obs.

data want;
set have ;
by yearmonth ;
if first.yearmonth then pre_price = price and pre_year = yearmonth;
output;
if last.yearmonth then post_price = price and post_year = yearmonth ;
output;
run;

TIA

kajal

Tom
Super User Tom
Super User

I thought you wanted to group by ID, not by your yearmonth variable.

 

Here is how to share data (rather then just posting a listing).

data have;
  input ID PRICE YEARMONTH;
cards;
1 455412 2312
1 455423 2402
1 455404 2403
1 455405 2404
2 455406 2312
2 455407 2402
2 455408 2403
2 455409 2404
2 455405 2409
2 455409 2410
3 455407 2402
3 455408 2403
3 455409 2404
3 455402 2405
3 455403 2406
3 455404 2407
3 455404 2408
3 455405 2409
;

data expect ;
  input ID PRE_PRICE PRE_YEAR POST_PRICE POST_YEAR ;
cards;
1 455412 2312 455405 2404
2 455406 2312 455409 2410
3 455407 2402 455405 2409
;

So assuming the data is already sorted the data step to roll up the first/last month values per ID could look like this:

data want;
  set have;
  by id yearmonth;
  if first.id then do;
    pre_price=price;
    pre_year=yearmonth;
  end;
  retain pre_price pre_year;
  if last.id;
  rename price=post_price yearmonth=post_year;
run;

So let's check if we got what you expected.

proc compare data=want compare=expect;
  id id;
run;

Results

The COMPARE Procedure
Comparison of WORK.WANT with WORK.EXPECT
(Method=EXACT)

Data Set Summary

Dataset               Created          Modified  NVar    NObs

WORK.WANT    07OCT24:13:15:35  07OCT24:13:15:35     5       3
WORK.EXPECT  07OCT24:13:15:35  07OCT24:13:15:35     5       3


Variables Summary

Number of Variables in Common: 5.
Number of ID Variables: 1.


Observation Summary

Observation      Base  Compare  ID

First Obs           1        1  ID=1
Last  Obs           3        3  ID=3

Number of Observations in Common: 3.
Total Number of Observations Read from WORK.WANT: 3.
Total Number of Observations Read from WORK.EXPECT: 3.

Number of Observations with Some Compared Variables Unequal: 0.
Number of Observations with All Compared Variables Equal: 3.

NOTE: No unequal values were found. All values compared are exactly equal.
ballardw
Super User

My take on an SQL solution:

 

proc sql;
   create table work.want as
   select a.id, a.pre_price, a.pre_year, b.post_price, b.post_year
   from (
         select id, price as Pre_price, yearmonth as Pre_year
         from have
         group by id
         having yearmonth = min(yearmonth)
        ) as a
        left join
        (
         select id, price as post_price, yearmonth as post_year
         from have
         group by id
         having yearmonth = max(yearmonth)) as b
         on a.id=b.id
   ;

quit;

@kajal_30 wrote:

Thank you can also please share the data step code and as my code is selecting all obs.

data want;
set have ;
by yearmonth ;
if first.yearmonth then pre_price = price and pre_year = yearmonth;
output;
if last.yearmonth then post_price = price and post_year = yearmonth ;
output;
run;

TIA

kajal


Doesn't consider ID at all which your desired seems to imply the ID is important.

If you want the first and last observations you would have to RETAIN the PRE_ variables until you output with the last per ID.

Your example data as presented actually would throw errors because the data is not sorted by yearmonth. If it were, then you would have scrambled the ID values.

 

Note that eventually bad things come those you think a number like Yearmonth is a good way to handle date information.

 

kajal_30
Quartz | Level 8

Thank you so much multiple solution worked but I got one more challenge with the data as below.

can we please accommodate  the same

data have;
input ID PRICE YEARMONTH;
cards;
1       .         2312
1 455423 2402
1 455404 2403
1 455405 2404
2 455406 2312
2 455407 2402
2 455408 2403
2 455409 2404
2 455405 2409
2         .       2410
3 455407 2402
3 455408 2403
3 455409 2404
3 455402 2405
3 455403 2406
3 455404 2407
3 455404 2408
3 455405 2409
;
RUN;

data expect ;
input ID PRE_PRICE PRE_YEAR POST_PRICE POST_YEAR ;
cards;
1 455423 2402 455405 2404
2 455406  2312 .              2410
3 455407 2402 455405 2409
;
run;

 

while doing the first. we cannot have null value so pre_ value should go to next non null value but post_ value can have null. I am stuck as we can not simply put where price is not null as it will also eliminate null values for post

 

TIA

kajal 

  

ballardw
Super User

Since the requirement has changed then just make sure that the part that gets the minimum yearmonth value only looks at bits where the price has a value:

 

proc sql;
   create table work.want as
   select a.id, a.pre_price, a.pre_year, b.post_price, b.post_year
   from (
         select id, price as Pre_price, yearmonth as Pre_year
         from (select * from have where not missing(price))
         group by id
         having yearmonth = min(yearmonth) 
        ) as a
        left join
        (
         select id, price as post_price, yearmonth as post_year
         from have
         group by id
         having yearmonth = max(yearmonth)) as b
         on a.id=b.id
   ;

quit;

The more things related to order of values you impose the more likely that SQL answers, if possible, will become more complicated, hard to follow, and likely have much poorer performance.

Tom
Super User Tom
Super User

Use a different selection criteria for when to remember the price.

data want;
  set have;
  by id;
  retain pre_price pre_year ;
  if first.id then call missing(pre_price,pre_year);
  if missing(pre_price) then do;
    pre_price=price;
    pre_year=yearmonth;
  end;
  if last.id;
  rename price=post_price yearmonth=post_year;
run;
kajal_30
Quartz | Level 8

thank you so much

 

regards

kajal

SAS Innovate 2025: Call for Content

Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!

Submit your idea!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 12 replies
  • 639 views
  • 2 likes
  • 4 in conversation