About hkgejd

hkgejd · ‎07-21-2022

Hi sbxkoenk

Thanks for your help! 😀 You are right, I don't have the SAS/ETS package. How would you "translate" the following PROC PANEL step to one of the mentioned procedures?

proc panel data=data;
  id id t;
  model lwage = exp exp2 wks ed / ranone;
run;
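A minimal sketch of how the RANONE model above might be expressed with PROC MIXED from SAS/STAT. That PROC MIXED is one of "the mentioned procedures" is an assumption, since the reply being answered is not shown here. A random intercept per `id` gives a one-way random-effects model, so the time variable `t` is not used. The estimates need not match PROC PANEL exactly, because PROC MIXED fits by REML while PROC PANEL uses its own variance-component estimators.

```sas
/* Sketch only: one-way random effects as a random intercept per cross-section.
   Dataset and variable names are taken from the PROC PANEL step above. */
proc mixed data=data method=reml;
  class id;                                  /* cross-section identifier */
  model lwage = exp exp2 wks ed / solution;  /* print fixed-effect estimates */
  random intercept / subject=id;             /* one random intercept per id */
run;
```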

hkgejd · ‎07-20-2022

Hi everyone

I want to do a regression analysis with random effects on a panel dataset. I know it is possible with PROC PANEL, but unfortunately I don't have that package. Is it possible to do this without PROC PANEL?

To those who are familiar with R, I want to do something similar to:

library(plm)
data("Produc", package = "plm")
zz <- plm(log(gsp) ~ log(pcap) + log(pc) + log(emp) + unemp,
          data = Produc, index = c("state", "year"),
          model = "random")  # plm defaults to model = "within" (fixed effects)
summary(zz)

Thanks in advance 😀

hkgejd · ‎08-31-2021

Hi maguiremq

Thanks for your reply! I tried your suggestion, but I could not really get it to work... If you don't mind, could you try it out?

The shapes I'm trying to match spatially are from this site, https://www.efgs.info/data/, under "Denmark". I did have luck using "proc mapimport". I have taken a subset of this:

data WORK.SHAPE;
  infile datalines dsd truncover;
  input X 7. Y 9. SEGMENT 1. GRD_NEWID:$25.;
datalines;
4442000 3519000 1 1kmN3519E4441
4441000 3519000 1 1kmN3519E4441
4441000 3520000 1 1kmN3519E4441
4442000 3520000 1 1kmN3519E4441
4442000 3519000 1 1kmN3519E4441
;;;;
run;

The dataset I want to match it with is the following:

data WORK.EDMT;
  infile datalines dsd truncover;
  input DDKNcelle1km $12. wgs84_bredde 14. wgs84_laengde 12.;
datalines;
1km_6072_684 54.765037537 11.871144295
1km_6072_684 54.765060425 11.871236801
1km_6072_684 54.765117645 11.871257782
1km_6072_684 54.765144348 11.871356964
1km_6072_684 54.765148163 11.870749474
1km_6072_684 54.765159607 11.870833397
1km_6072_684 54.765193939 11.871382713
1km_6072_684 54.765197754 11.871330261
1km_6072_684 54.765220642 11.870867729
;;;;
run;

hkgejd · ‎08-27-2021

Hi everyone

Does anyone know if it is possible to do a spatial match? What I mean is that I have a shapefile with some polygons, and I have another file with a lot of different coordinates. I want to merge the coordinate dataset with the shapefile, so that each coordinate is assigned the polygon it falls within. Does my question make sense?

Thanks in advance 😀
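PROC GINSIDE is one procedure built for this kind of point-in-polygon matching. The sketch below is only an illustration under assumptions: the dataset names `WORK.POINTS`, `WORK.POLYGONS` and `WORK.MATCHED` are hypothetical, both inputs are assumed to hold their coordinates in variables named `X` and `Y` in the same coordinate system, and `GRD_NEWID` stands in for whatever variable identifies a polygon.

```sas
/* Sketch only: assign each point the ID of the polygon that contains it.
   WORK.POINTS   : hypothetical, one row per coordinate (X, Y).
   WORK.POLYGONS : hypothetical, polygon outlines (X, Y, GRD_NEWID).
   Both must use the same coordinate system before matching. */
proc ginside data=work.points map=work.polygons out=work.matched;
  id GRD_NEWID;   /* polygon identifier copied onto each matched point */
run;
```

Points that fall outside every polygon are kept in the output with a missing ID unless the INSIDEONLY option is added.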

hkgejd · ‎03-01-2021

Hi @FreelanceReinh That is a very nice workaround! I will have to test this on my "real" data, but I think this will be way faster than my current method. Thank you so much 😀

hkgejd · ‎02-26-2021

Hi ballardw Thanks for your reply! That makes sense, I will convert the excel-file to a data step code immediately 😀

hkgejd · ‎02-26-2021

Hello everyone

This is not a "real" problem, since I already know another way to do it, but I am looking for a more efficient method.

For context, I am trying to benchmark rental units against each other. The user picks a rental unit in the data, and it is then benchmarked against its closest neighbours in the dataset.

Assume that my dataset is the "Have" sheet in the attached Excel file. I have picked the address called "TEST-address1" from my dataset and calculated the distance from the picked address to the other addresses (neighbours). What I want now is to reduce the dataset so that it only consists of "relevant" neighbours. The closest neighbours in the dataset are obviously the most relevant neighbours, but how are those defined? I have 4 conditions that have to be met:

1) there have to be at least 10 different addresses (neighbours)
2) there have to be at least 3 different owners
3) there have to be at least 5 different properties
4) one owner cannot be too dominant in terms of value: one owner can have a share of at most 90% of the value. E.g. if the total value in the dataset is $100 and owner A has $95, then owner A has a 95% share, which is too dominant.

My current method is a do-loop. Once I have picked an address, the program looks for the closest neighbours one at a time and stops when all of the above conditions have been met. In the "Want" sheet it should stop at row 12. However, there is a rule that if the next addresses are in the same property, it should only stop once all the addresses in that property are included. Hence, it should stop at row 13 instead.

However, I want to test whether it is more efficient to do it the other way around. Instead of evaluating the conditions one neighbour at a time, I want the conditions to already be calculated as variables for each neighbour, like I have done in the "Want" sheet. So e.g. the 13th row states that if the 13th row is included in the final dataset, then there will be 5 owners, 11 addresses and 5 properties in the dataset, and the owner with the largest share of value has a share of 50%.

It is quite easy to calculate the number of owners, neighbours and properties, but the "max owner share" is difficult. Just to illustrate how the "max owner share" should be calculated, I have created the yellow columns. The yellow columns could be an alternative method, but the problem is that I would then have to create several new columns. In my original dataset there are over 400 different owners, meaning that I would have to create over 400 new columns.

Does anyone have any ideas/suggestions on how to make this more efficient?

Thanks! 😀

Edit: Thanks to @ballardw, I have converted the Excel file to data step code. The mentioned yellow columns are all the variables after "Value". I hope it makes sense 😀

data WORK.WANT;
  /* Blank-delimited list input; COMMA5. strips the % sign (73% is read as 73) */
  infile datalines truncover;
  input Owner :$5. address :$15. Distance_to_picked_address
        Picked_owner :$4. Picked_address :$14. Property :$6. Value
        no_of_owners no_of_addresses no_of_properties
        max_owner_share :comma5. acc_value
        TEST1 TEST2 TEST3 TEST4 TEST5 TEST6;
datalines;
TEST TEST-address1 0.0 TEST TEST-address1 PROP 1000 0 0 0 0% 0 0 0 0 0 0 0
TEST1 TEST1-address1 0.1 TEST TEST-address1 PROP1 1025 1 1 1 100% 1025 1025 0 0 0 0 0
TEST1 TEST1-address2 0.1 TEST TEST-address1 PROP1 700 1 2 1 100% 1725 1725 0 0 0 0 0
TEST2 TEST2-address1 0.2 TEST TEST-address1 PROP2 625 2 3 2 73% 2350 1725 625 0 0 0 0
TEST3 TEST3-address1 0.3 TEST TEST-address1 PROP3 375 3 4 3 63% 2725 1725 625 375 0 0 0
TEST3 TEST3-address2 0.3 TEST TEST-address1 PROP3 500 3 5 3 53% 3225 1725 625 875 0 0 0
TEST4 TEST4-address1 0.4 TEST TEST-address1 PROP4 250 4 6 4 50% 3475 1725 625 875 250 0 0
TEST4 TEST4-address2 0.4 TEST TEST-address1 PROP4 250 4 7 4 46% 3725 1725 625 875 500 0 0
TEST5 TEST5-address1 0.5 TEST TEST-address1 PROP5 525 5 8 5 41% 4250 1725 625 875 500 525 0
TEST5 TEST5-address2 0.5 TEST TEST-address1 PROP5 500 5 9 5 36% 4750 1725 625 875 500 1025 0
TEST5 TEST5-address3 0.5 TEST TEST-address1 PROP5 1625 5 10 5 42% 6375 1725 625 875 500 2650 0
TEST5 TEST5-address4 0.5 TEST TEST-address1 PROP5 1075 5 11 5 50% 7450 1725 625 875 500 3725 0
TEST6 TEST6-address1 0.6 TEST TEST-address1 PROP6 875 6 12 6 45% 8325 1725 625 875 500 3725 875
TEST6 TEST6-address2 0.6 TEST TEST-address1 PROP6 1000 6 13 6 40% 9325 1725 625 875 500 3725 1875
TEST6 TEST6-address3 0.6 TEST TEST-address1 PROP7 1400 6 14 7 35% 10725 1725 625 875 500 3725 3275
TEST6 TEST6-address4 0.6 TEST TEST-address1 PROP7 1050 6 15 7 37% 11775 1725 625 875 500 3725 4325
TEST1 TEST1-address3 1.0 TEST TEST-address1 PROP8 1675 6 16 8 32% 13450 3400 625 875 500 3725 4325
TEST1 TEST1-address4 1.0 TEST TEST-address1 PROP8 800 6 17 8 30% 14250 4200 625 875 500 3725 4325
TEST1 TEST1-address5 1.0 TEST TEST-address1 PROP8 625 6 18 8 32% 14875 4825 625 875 500 3725 4325
;;;;
run;

data have;
  set want;
  keep Owner address Distance_to_picked_address Picked_owner Picked_address Property Value;
run;
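One way the cumulative columns might be computed without a column per owner is a single DATA step pass with hash objects. Because an owner's running total never decreases, the largest total seen so far can be kept in one retained variable. This is a sketch under assumptions, not the method from the thread: `have` is assumed to be sorted by `Distance_to_picked_address`, every row is assumed to be a distinct address, the picked address itself is skipped as in the sample data, and `want_sketch` and the underscore-prefixed helper variables are hypothetical names. Here `max_owner_share` is a fraction (0.50), not a percentage.

```sas
data want_sketch;
  set have;                          /* assumed sorted by distance */
  if _n_ = 1 then do;
    declare hash owners();           /* running value total per owner */
    owners.defineKey('Owner');
    owners.defineData('_own_total');
    owners.defineDone();
    declare hash props();            /* distinct properties seen so far */
    props.defineKey('Property');
    props.defineDone();
  end;
  length _own_total 8;
  retain no_of_addresses acc_value _max_total 0;

  if address ne Picked_address then do;
    /* fetch this owner's total so far (0 if the owner is new), then update it */
    if owners.find() ne 0 then _own_total = 0;
    _own_total = _own_total + Value;
    owners.replace();
    props.replace();                 /* adds the property if it is not stored yet */

    no_of_addresses + 1;
    acc_value + Value;
    _max_total = max(_max_total, _own_total);
  end;

  no_of_owners     = owners.num_items;
  no_of_properties = props.num_items;
  if acc_value > 0 then max_owner_share = _max_total / acc_value;
  else max_owner_share = 0;

  drop _own_total _max_total;
run;
```

The four conditions can then be checked row by row against `no_of_addresses`, `no_of_owners`, `no_of_properties` and `max_owner_share`.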

hkgejd · ‎05-20-2020

Worked like a charm! Thank you for your quick response 😁

hkgejd · ‎05-20-2020

Hi

How do you do the following?

data have;
  input group $ id;  /* group is a character variable, hence the $ */
datalines;
TEST 1
TEST 2
TEST 3
TEST .
TEST .
TEST2 1
TEST2 2
TEST2 .
;
run;

data want;
  input group $ id;
datalines;
TEST 1
TEST 2
TEST 3
TEST 4
TEST 5
TEST2 1
TEST2 2
TEST2 3
;
run;

Thanks in advance 😀
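A minimal sketch of one way to get from `have` to `want`: carry the last `id` forward within each group and continue the count wherever `id` is missing. It assumes that `group` is read as a character variable, that the rows of each group are contiguous, and that missing values should simply continue the sequence. `want_sketch` and `_prev` are hypothetical names.

```sas
data want_sketch;
  set have;
  by group notsorted;                   /* groups contiguous, not necessarily sorted */
  retain _prev;                         /* last id seen within the current group */
  if first.group then _prev = 0;
  if missing(id) then id = _prev + 1;   /* continue the numbering */
  _prev = id;
  drop _prev;
run;
```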

hkgejd · ‎09-25-2019

Hi Fredrik Thanks for your reply! It is a great alternative that you suggest, and I will go with that for now. I cross my fingers that someone knows how to do it in "my way".

hkgejd · ‎09-24-2019

Hello I am planning to create a lot of "individual" VA-reports that are identical except for the datasource. However, in the future there will be some updates and maintenance, which will be rather complicated since I will have to do it for every individual report. Hence, I want to know if it is possible to have one main report that I can update, and then it will automatically update all the individual reports? Thanks in advance

Date Last Visited	‎07-28-2022 11:56 AM

Re: Panel data regression with random effects

Panel data regression with random effects

Re: Is it possible to "spatial match"?

Is it possible to "spatial match"?

Re: Finding the maximum share of a value

Re: Finding the maximum share of a value

Finding the maximum share of a value

Re: Filling out missing values

Filling out missing values

Re: One main report that controls individual reports

One main report that controls individual reports