Find Standard Deviation and Mean without a focal firm in Stata

Attaullah Shah

The xf is an abbreviation that I use for “excluding focal”. There are circumstances where we want to exclude the focal observation when calculating the required statistics. asrol can exclude the focal observation in two ways. The first excludes only the current observation. The second excludes the current observation and every other observation in the given window that has the same (duplicate) value of the rangevar. An example will better explain the distinction between the two options. Consider the following data of 5 observations, where X is the variable of interest for which we would like to calculate the arithmetic mean, and year is the rangevar. Our calculations do not use a rolling window, therefore the option window() is dropped.

Example A:

asrol X, stat(mean) xf(focal) gen(xfocal)

Example B:

asrol X, stat(mean) xf(year) gen(xfyear)
        
          +---------------------------------------+
          | year     X        xfocal       xfyear |
          |---------------------------------------|
          | 2001   100          350           350 |
          | 2002   200          325           325 |
          | 2003   300          300     266.66667 |
          | 2003   400          275     266.66667 |
          | 2004   500          250           250 |
          +---------------------------------------+
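
To reproduce the table above, the five observations can be typed in directly. This is a minimal sketch that assumes asrol is already installed (for example from SSC); the two asrol calls are the ones from Example A and Example B.

```stata
* ssc install asrol    (run once if asrol is not installed)
clear
input year X
2001 100
2002 200
2003 300
2003 400
2004 500
end

* Example A: exclude only the current observation
asrol X, stat(mean) xf(focal) gen(xfocal)

* Example B: exclude all observations that share the focal year
asrol X, stat(mean) xf(year) gen(xfyear)

list year X xfocal xfyear, noobs
```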

Explanation:

In Example A, we invoke the option xf() as xf(focal). asrol generates a new variable xfocal that contains the mean of the rest of the observations in the given window, excluding the focal observation. Therefore, in the year 2001, the xfocal variable has a value of 350, which is the average of the values of X in the years 2002, 2003, 2003, and 2004, i.e. (200+300+400+500)/4 = 350. Similarly, the second observation of the xfocal variable is 325, which is (100+300+400+500)/4 = 325. The same logic applies when the statistics are estimated in a rolling window.

Example B differs from Example A in the definition of the focal observation(s). In Example B, we invoke the option xf() as xf(year), where year is an existing numeric variable. With this option, the focal observations are the current observation plus any other observations that share its value of the rangevar. Our data set has one duplicated value in the rangevar, the year 2003, which appears twice. Therefore, the mean values are calculated as shown below:

        +-------------------------------------------------------+       
        |       obs 1: (200 + 300 + 400 + 500)/4 = 350          |
        |       obs 2: (100 + 300 + 400 + 500)/4 = 325          |                       
        |       obs 3: (100 + 200 + 500 )     /3 = 266.66667    |               
        |       obs 4: (100 + 200 + 500 )     /3 = 266.66667    |       
        |       obs 5: (100 + 200 + 300 + 400)/4 = 250          |                       
        +-------------------------------------------------------+      
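
The arithmetic above can also be checked by hand with built-in commands only. The following is a sketch rather than a description of how asrol works internally: it uses the identity that a mean excluding some observations equals (overall total minus the excluded total) divided by (overall count minus the excluded count). The variable names total, yr_total, yr_n, xfocal_check and xfyear_check are made up for illustration.

```stata
clear
input year X
2001 100
2002 200
2003 300
2003 400
2004 500
end

* Overall sum of X, repeated on every row
egen double total = total(X)

* xf(focal) by hand: drop only the current observation
gen double xfocal_check = (total - X) / (_N - 1)

* xf(year) by hand: drop every observation that shares the focal year
bysort year: egen double yr_total = total(X)
by year: gen yr_n = _N
gen double xfyear_check = (total - yr_total) / (_N - yr_n)

list year X xfocal_check xfyear_check, noobs
```

Note that _N means the number of observations in the group when used under by year:, and the number of observations in the whole data set otherwise, which is why the last gen is run without the by prefix.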

Example using a rolling window

* Rolling mean over a 4-year window, excluding the focal observation

webuse grunfeld

bys company: asrol invest, stat(mean) win(year 4) xf(focal)
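
As a rough cross-check of the rolling example, a similar quantity can be built by hand with Stata's lag operators. This is only a sketch: it assumes that win(year 4) covers the current year and the three preceding years, so that dropping the focal observation leaves the mean of the three lags. The variable name xf_check is made up for illustration.

```stata
webuse grunfeld, clear
xtset company year

* Mean of the three preceding years (the focal year is left out).
* Assumption: the 4-year window is the current year plus three lags.
gen double xf_check = (L1.invest + L2.invest + L3.invest) / 3

list company year invest xf_check in 1/6, noobs
```

This hand-built version is missing for the first three years of each company, because not all three lags exist there; asrol may still report a value from the available observations in those years, so the two should only be compared from the fourth year onward.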