Category Archives: Uncategorized

  • 0

Exporting ttest results from Stata to Word using asdoc

Category:asdoc,Blog,Stata Programs,Uncategorized

 

asdoc installation

If you have not already studied the features of asdoc, you can visit this page that lists the table of contents of what asdoc can do. You can also read this one paragraph introduction to asdoc. The following line of code will install asdoc from SSC

ssc install asdoc
help asdoc

 

Reporting t-tests with asdoc

Before we make the t-test results table for our example data, let us breifly explore the options available in asdoc for making a t-test results table.

Whether it is one-sample t-test or two-sample or other forms, asdoc manages to report the results line by line for each test. asdoc also allows accumulating results from different runs of t-tests. For this purpose, the option rowappend of asdoc really comes handy. With the sub-command ttest , we can use the following options of asdoc to control asdoc behavior.

(1) replace / append

(2) save(filename)

(3) title(text)

(4) fs(#)

(5) hide.

(6) stats()

(7) rowappend.

These options are discussed in detail in Section 1 of asdoc help file. Option stats and rowappend are discussed below:

 

Option stat()

Without stat() option, asdoc reports the number of observations (obs), mean, standard error, t-value, and p-value with t-tests. However, we can select all or few statistics using the stat option. The following table lists the keywords and their details for reporting the desired statistics.

keyword details
n Number of observations
mean Arithmetic mean
se Standard error
df degrees of freedom
obs Number of observations
t t-value
p p-value
sd standard deviation
dif difference in means if two-sample t-test

 

Option rowappned

ttest tables can be constructed in steps by adding results of different t-tests to an existing table one by one using option rowappend. There is only one limitation that the t-tests are performed and asdoc command applied without writing any other results to the file in-between.

 

An example

Suppose we have the following data set with variables r0, r1, r2, r3, and y. The data can be downloaded into Stata by

use http://fintechprofessor.com/ttests.dta, clear

The variables ro-r3 are the numeric variables for which we would like to conduct one-sample ttest whereas variable y is a numeric date variable that tracks years. We wish to conduct a ttest for each of the r0-r3 variables and in each year and make one table from all such tests.

 

Without using a loop

 asdoc ttest R0==0 if Y==2009, replace title(One Sample t-test Results)
asdoc ttest R1==0 if Y==2009, rowappend
asdoc ttest R2==0 if Y==2009, rowappend
asdoc ttest R3==0 if Y==2009, rowappend
asdoc ttest R0==0 if Y==2010, rowappend
asdoc ttest R1==0 if Y==2010, rowappend
asdoc ttest R2==0 if Y==2010, rowappend
asdoc ttest R3==0 if Y==2010, rowappend
asdoc ttest R0==0 if Y==2011, rowappend
asdoc ttest R1==0 if Y==2011, rowappend
asdoc ttest R2==0 if Y==2011, rowappend
asdoc ttest R3==0 if Y==2011, rowappend
asdoc ttest R0==0 if Y==2012, rowappend
asdoc ttest R1==0 if Y==2012, rowappend
asdoc ttest R2==0 if Y==2012, rowappend
asdoc ttest R3==0 if Y==2012, rowappend

And appreciate the results

 

Explanation

1.In the first line of code, we wrote asdoc ttest in the beggining of the line. This is how we use asdoc with Stata commands. We just add asdoc to the beggining of any Stata command and that’s all.

2. We used two options of asdoc in the first line of code: the replace and title(). Replace will replace any existing file with the name Myfile.doc and title will add the specific test as a title to the output file.

3. In the second line of code, we added option rowappend() that will append the results to the existing table in the file Myfile.doc

4. And the process continues untill all ttests are estimated.

 



  • 7

asdoc: Cutomizing the regression output | MS Word from Stata | Confidence Interval, adding stars, etc.

Category:asdoc,Blog,Uncategorized Tags : 

 

Version 2.3 of asdoc adds the following features for reporting detailed regression tables.

1. Reporting confidence interval

2. Suppressing confidence intervals

3. Suppressing the stars which are used to show significance level

4. Customization of significance level for stars

These features are discussed in details below. If you have not already studied the features of asdoc, you can visit this page that lists the table of contents of what asdoc can do. You can also read this one paragraph introduction to asdoc. The following line of code will install this beta version of asdoc from our website

net install asdoc, from(http://fintechprofessor.com) replace
help asdoc

   

Details of the new features

The new features related to creating detailed regression tables with asdoc are discussed below with details. 

 

1. Confidence interval

I received several emails and comments on blog posts suggesting the addition of confidence intervals (CI) to the detailed regression tables created by asdoc. In version 2.3 onwards, confidence intervals are shown by default. This means that we do not have to add an additional option to report CI. See the following example. 

sysuse auto, clear
asdoc reg price mpg rep78 headroom trunk weight length turn , replace

 

 

2. Suppressing the confidence interval

If confidence intervals are not needed, we can use option noci. For example

asdoc reg price mpg rep78 headroom trunk weight length turn , replace noci

 

3. Suppressing stars

Similarly, if we are not interested in reporting significance stars, we can use option nostars. For example, 

 

4. Setting custom significance level

The default significance levels for reporting stars are set at : *** for p-values <=0.01; ** for p-values <=0 .05, and * for p-values <=0.1. However, now we can set our own levels for statistical significance using option setstars. An example of setstars option looks like:

setstars(***@.01, **@.05, *@.1)

As we can see from the above line, setstars separates each argument by a comma. Each argument has three components. The first component is the symbol (in our case it is *) which will be reported for the given significance elve. The second component is the @ sign that connects the significance level with the symbol. And the third component is the value at which the significance level is set. So if we want to report stars such that

* for p-value .001 or less
** for p-value .01 or less
*** for p-value .05 or less

We shall write the option setstars as

setstars(*@.001, **@.01, ***@.05)

Continuing with our example, let us use the above option to report our defined level of stars.

asdoc reg price mpg rep78 headroom trunk weight length turn , replace setstars(*@.001, **@.01, ***@.05)

 

 


  • 0

Merging datasets in Stata on long strings and less precise matching criterion

Category:Uncategorized

The Problem

Consider the following two datasets where company names are not exactly the same in both the datasets, still we want to merge them using the name variable as the merging criterion. How that can be done in Stata.

name in dataset 1 name dataset 2 symbol data set2
The Pakistan General Insurance Co. Ltd. The Pakistan General Insurance PKGI
Security Leasing Corporation Ltd. Security Leasing Corporation L SLCL
NATIONAL BANK OF PAKISTAN NATIONAL BANK OF PAKISTAN NBP
THE BANK OF KHYBER THE BANK OF KHYBER BOK
THE BANK OF PUNJAB Ltd THE BANK OF PUNJAB BOP
SILKBANK LIMITED . SILKBANK LIMITED SILK
Awwal Modaraba Comay Awwal Modaraba AWWAL
B.F. Modaraba. B.F. Modaraba BFMOD

The above table shows that the company names not only differ in terms of different number of characters but also in terms of capitalization. However, there is a general patterns of similarity in the first few characters, starting from left to right. Given that, we can split the problem into two parts.

 

1.Extract similar characters

Extract the first few characters that are similar in both the dataset and merge the data using those similar characters. For example, in the second row of the above table, “The Pakistan General Insurance” part is similar in both the tables. If we count these characters, they are 30. So how exactly are we going to find the matching number of characters in each case? We shall not do that. Instead, we shall take the iterative path where:


1. we start with extracting the first n-number of characters from both the key variables in the two datasets and merge using the extracted (truncated) variables. If the merge succeeds, we shall save the merged data separately. Also, we shall delete, from the initial file, those records which were successfully merged, and further process those that did not merge. In this first step, we shall normally start with extracting a large number of characters, for example, up to 30 characters in the case of row 2 of the above table.

Please note that the relevant Stata function is substr() for extracting a given number of characters from a variable. We shall discuss it further as we proceed in this article.

2. In the next iteration, we shall further process those records which did not merge. This time we shall extract one character less than what we used in the preceeding step. The idea is that if two variables did not have the first 30 characters in common, they might have the first 29 characters in common. But this time, we need to be careful as we reduce the extraction of the initial number of characters, the chances of matching incorrect records increases. For example, consider the following two records where the first 12 characters are exactly the same in both the records i.e. “THE BANK OF “, therefore, if we merge the two data sets using only the first 12 characters, we shall incorrectly merge THE BANK OF KHYBER into THE BANK OF PUNJAB

THE BANK OF KHYBERTHE BANK OF PUNJAB

So using the first 29 characters of the two variables, we shall proceed to merge. Again, we shall retain the successfully merged records, append it to the already saved successful merged data, and delete it from the initial dataset, just we did in the first step above.

3. The iterative process will continue in the same fashion, and each time we need to pay more attention to the merged data to identify any incorrect merges as explained in step 2.

4. We shall stop at a point where the data seems to have a reasonable number of similar observations based on the initial characters.

 

2. Use lower case

The second part of the problem is the dissimilarity in the capitalization of the names. This part of the problem is easy to handle as we can use the function lower() to convert all the names to lower cases and then use it in the merging process.

 

An Example

Let’s use a simple example to implement what we have read so far. In this example, we are going to use the same data as given in the above table. From that data, we shall create two datasets. One of the dataset will remain in the Stata memory, we shall call it data_memory. The other data set will be saved to a file, we shall call it data_file. The data_file has two variables, name and symbol. We shall merge the data_memory into data_file using variable name as the merging criterion. To create the two dataset, we can copy and paste the following code to Stata do editor and run it.

* Copy the following code and run from Stata do editor

*--------------------------------------------------------------------------
* Create the data_memory
*--------------------------------------------------------------------------
clear
input str30 name str5 symbol
"The Pakistan General Insurance" "PKGI"
"Security Leasing Corporation L" "SLCL"
"NATIONAL BANK OF PAKISTAN" "NBP"
"THE BANK OF KHYBER" "BOK"
"THE BANK OF PUNJAB" "BOP"
"SILKBANK LIMITED" "SILK"
"Awwal Modaraba" "AWWAL"
"B.F. Modaraba" "BFMOD"
end
save data_memory, replace

*--------------------------------------------------------------------------
* Now to create the data_file
*--------------------------------------------------------------------------
clear
input str39 name
"The Pakistan General Insurance Co. Ltd."
"Security Leasing Corporation Ltd."
"NATIONAL BANK OF PAKISTAN"
"THE BANK OF KHYBER"
"THE BANK OF PUNJAB Ltd"
"SILKBANK LIMITED ."
"Awwal Modaraba Comay"
"B.F. Modaraba."
end

save data_file, replace

Now that we have created the two datasets, let’s start the process step by step. In the following code box, the code perform the first step as discussed above. I shall then explain what actually the code does.

use data_file, clear
gen name2=lower(substr(name,1,30))
clonevar name_old = name
save data_file, replace
use data_memory
gen name2=lower(substr(name,1,30))
merge 1:1 name2 using data_file
save temporary, replace
keep if _merge == 3
save matched, replace
list name name2 symbol
+----------------------------------------------------------------------+
name name2 |
---------------------------------------------------------------------- |
NATIONAL BANK OF PAKISTAN national bank of pakistan |
Security Leasing Corporation Ltd. security leasing corporation |
THE BANK OF KHYBER the bank of khyber |
The Pakistan General Insurance Co. Ltd. the pakistan general insurance
-----------------------------------------------------------------------


use temporary, clear
keep if _merge == 2
keep name
save remaining, replace
list

 +--------------------+
| name |
+--------------------+
| Awwal Modaraba |
| B.F. Modaraba |
| SILKBANK LIMITED |
| THE BANK OF PUNJAB |
+--------------------+

 

Explanation

In the above code block, we loaded the first dataset i.e. data_memory and created a truncated variable (name2) from the original variable using the following line of code. The first function in this line is lower() that converts all cases to lower cases. The second function is substr() that extracts bytes 1 to 30 from the string variable name. So this line will convert “Security Leasing Corporation Ltd.” to “security leasing corporation

gen name2=lower(substr(name,1,30))

After creating the name2 variable, we saved and replaced the data_file. We then loaded the other dataset, that is, data_memory and created a similar truncated variable as above, and then merged the two datasets using this truncated variable. The merged dataset has all useful information and we saved it in a temporary.dta file for further processing.

The first step in this processing is to isolate successful merges, that is flagged by the _merge == 3 code, therefore, we kept all such observations and saved them to matched.dta file. The list command shows that we have 4 successful merges in this go. We then reload the temporary.dta file to isolate those records that did not merge. Therefore, those records are saved to another file, which we named as remaining.dta.

In our next iteration, we shall take all of the above steps using the remaining.dta file, instead of data_file.dta file.

 
*-------------------------------------------------------------------------
* Iteration 2 - using first 29 characters
*-------------------------------------------------------------------------
use data_file, clear
drop name2
gen name2=lower(substr(name,1,29))
save data_file, replace
use remaining, clear
gen name2=lower(substr(name,1,29))
merge 1:1 name2 using data_file
save temporary, replace
keep if _merge == 3
save matched, replace
list name name2 symbol

 +--------------------------------------------------+
 | name name2 symbol |
 |--------------------------------------------------|
| Awwal Modaraba awwal modaraba AWWAL |
| B.F. Modaraba b.f. modaraba BFMOD |
| SILKBANK LIMITED silkbank limited SILK |
| THE BANK OF PUNJAB the bank of punjab BOP |
+--------------------------------------------------+

save remaining, replace
 

Is there a simple way to do this?

 

smerge Program

The above tutorial is best for a learning the merging process, however, it is too lengthy and is not optimal for quick applications. I have managed to get the code in a standard Stata ado program. I call this program as smerge, that is merge based on sub-string.

 

Installation

The program can be installed by typing the following line in Stata command window

net install smerge, from(http://fintechprofessor.com) replace

The syntax of the program is given below:

 smerge varname, use(file_name) [lower() higher()]

smerge is the program name, followed by the varname, that is the variable name which should exist in both the files, i.e. the file currently loaded file, i.e. the file in the memory of Stata and the file on disk. After the comma, the program options are listed. Option use(file_name) is a required option. In the brackets, we have to provide the file name with which we want to merge the data. It can be a full directory path with the file name or simply the file name if the file is in the current directory.

options enclosed in brackets [] show optional options in Stata. There are two optional options in smerge program. If these options are not specified, then the default values are used, which are 30 for the higher and 10 for the lower. The higher() option sets the initial value of the number of string bites to be extracted from the varname to make a truncated variable that will be used for the first iteration in the merge process. The initial value is decremented by one in the next iteration until it reaches the limit set by the option lower().

 

An Example using smerge

Let use the same dataset which we created above. We shall first load the file data_memory and merge it with the file data_file.dta. Both the datasets have the variable name.

use data_memory.dta, clear
smerge name, use(data_file) lower(10) higher(30)
list

. list
+---------------------------------------------------------+
| name symbol nstr |
|---------------------------------------------------------|
| The Pakistan General Insurance Co. Ltd. PKGI 30 |
| Security Leasing Corporation Ltd. SLCL 30 |
| NATIONAL BANK OF PAKISTAN NBP 30 |
| THE BANK OF KHYBER BOK 30 |
| THE BANK OF PUNJAB Ltd BOP 18 |
|---------------------------------------------------------|
| SILKBANK LIMITED . SILK 16 |
| Awwal Modaraba Comay AWWAL 14 |
| B.F. Modaraba. BFMOD 13 |
+---------------------------------------------------------+
 

Explanations

Read the second line in the above code block. We typed smerge and then the variable name. The name variable is the key variable that we are using as the merging criterion. Option use(data_file) tells smerge program to merge the current data in memeory with the file data_file. And option higher(30) and lower(10) sets the limits for sub-string extraction for creating the truncated variables that will be used for merging. And that’s it. We just need one line of code to do all the steps that we went through in the earlier tutorial.

The list command shows three variables, one of them is nstr. This variable records the number of strings from left to right of the varname. This number is recorded when the merge is successful. So the list command shows that the four records were merged in the first iteration i.e when we extracted the first 30 characters. The 5th record was merged when we extracted 18 characters i.e. The full entry is THE BANK OF PUNJAB Ltd, and the first 18 characters are THE BANK OF PUNJAB


  • 3

asdoc: Export Stata dta file to MS Word

Category:Uncategorized

Creating tables in Stata using asdoc is super-easy. In this short post, I’ll show how to use asdoc to export Stata data to MS word files. If you have not already installed asdoc, it can be installed by

ssc install asdoc

 

For exporting values from the data files, we can use the sub-command list of asdoc. We can also make the command conditional using the in and if qualifiers. In the following example, let us use the auto data set from the system folders

sysuse auto, clear
asdoc list price trunk mpg turn in 1/10 , replace

 

 

Explanation

In the above line of code, we wrote asdoc and the sub-command list. After that, we specified the names of the variables that we wanted to export to the MS word document. These variables included price trunk mpg turn. After that, we used the phrase in 1/10, that is the in qualifier to report observation 1 to 10. The option replace will replace any existing output file with the name Myfile.doc

 


  • 0

ASROL Version update: calculation of geometric mean and products in a rolling window and over groups in Stata

Category:Uncategorized

 

Description

asrol calculates descriptive statistics in a user’s defined rolling-window or over a grouping variable. asrol can efficiently handle all types of data structures such as data declared as time series or panel data, undeclared data, or data with duplicate values, missing values or data having time series gaps. asrol can be used for the calculation of a variety of statistics [see table of contents].

 

Installation

ssc install asrol

After installation, you can read the help file by typing:

help asrol

 

Options and defaults

Version 4.5.1 of asrol significantly improves the calculation of the product and the geometric mean.  Since both the statistics involve the multiplication of values in a given window, the presence of missing values and zeros present a challenge to getting the desired results. Following are the defaults in asrol to deal with missing values and zeros.

a. Missing values are ignored when calculating the product or the geometric mean of values.

b. Handling zeros in geometric mean: To be consistent with Stata’s default for geometric mean calculations, (see ameans), the default in asrol is to ignore zeros and negative numbers. So the geometric mean of 0,2,4,6 is 3.6342412, that is [2 * 4 * 6]^(1/3). And the geometric mean of 0,-2,4,6 is 4.8989795, that is [4 *6]^(1/2)

c. Handling zeros in products: Zeros are considered when calculating the product of values. So the product of 0,2,4,6 is 0

d. Option ignorezero: This option can be used to ignore zeros when calculating the product of values. Therefore, when the zero is ignored, the
product of 0,2,4,6 is 48

e. Option add(#) : This option adds a constant # to each values in the range before calculating the product or the geometric mean. Once the
required statistic is calculated, then the constant is substracted back. So using option add(1), the product of 0,.2,.4,.6 is 1.6880001 that is
[1+0 * 1+.2 * 1+.4 * 1+.6] – 1 and the geometric mean is .280434 is [(1+0 * 1+.2 * 1+.4 * 1+.6)^(1/4)] – 1.

 

Examples

Let us start with simple examples of calculating the geometric mean and products.  Our example data has stock prices, company identifiers (symbols) and time identifier (date) 

use http://fintechprofessor.com/stocks.dta, clear

* Generae numeric identifier for each firm
encode symbol, gen(id)

* Declear the data as panel data
tsset id date

* Create stock returns
gen returns = d.close/l.close

* Note the above formula for stock returns is analogous to
gen returns2 = (close - L.close) / L.close

 

 Geometric mean

 Now find geometric mean for stock returns, adding 1 before calculation and subtracting the same after calculation.  The calculations are made for each firm in a rolling window of 20 observations

bys id: asrol returns, stat(gmean) window(date 20) add(1)

 

Products – the case of cumulative returns

Since we find products of (1+returns) for finding cumulative returns over n-periods, we can use the product function of asrol [read this blog entry for more more details on simple and log returns

Cumulative n-period simple returns =(1+simple_r1) * (1+simple_r2) 
*(1+simple_r3)  ... (1+simple_rn)  - 1     --- (Eq. 1)

 

The asrol command for the 20-periods rolling window cumulative returns would be:

bys id: asrol returns, stat(product) window(date 20) add(1)

 

 

Option to ignore zeros

Option ignorezero or ig can be useful when we want to exclude zeros from the calculation of products. So let’s say we have the variable x that has values of 1, 2, 3, 0, and 5. Finding product of this variable will result in zeros. If there were circumstannce where we wish to find product of only non-zero values, the asrol command would be

asrol x, stat(product) ig
list

+--------------+
| x produc~x |
|--------------|
| 1 30 |
| 2 30 |
| 3 30 |
| 0 30 |
| 5 30 |
+--------------+

Without using the option ig, the product would be zero

asrol x, stat(product) gen(pro_withoutig)
. list
+-------------------------+
| x produc~x pro_wi~g |
|-------------------------|
| 1 30 0 |
| 2 30 0 |
| 3 30 0 |
| 0 30 0 |
| 5 30 0 |
+-------------------------+

 

A note on the methods used

Previous versions of asrol used log transformation of values for finding products and geometric mean. Version 4.5.1 onwards, the log transformation method is discontinued in the calculation of products and geometric means. The calculations now consider actual multiplications of the values in a given range. So the geometric mean is calculated as the nth root of the products of n numbers. 

or a set of numbers x1, x2, …, xn, the geometric mean is defined as

{\displaystyle \left(\prod _{i=1}^{n}x_{i}\right)^{\frac {1}{n}}={\sqrt[{n}]{x_{1}x_{2}\cdots x_{n}}}}

where the capital pi notation shows a series of multiplications.

Similarly,  the products are calculated as :

Product = x1 * x2  ... xn

 


  • 0

asreg: Get standard errors of the first stage regression of the Fama and MacBeth (1973) Procedure in Stata

Category:Uncategorized Tags : 

In the following example, we shall use asreg that can be installed from SSC by typing the following line in Stata command window

ssc install asreg

 

The problem

Let’s say that we wish to report different regression statistics from Fama and MacBeth (1973) regression such the standard errors of variables. Using the fmb option, asreg can efficiently estimate FMB regression. Further, it reports the regression coefficients of the first stage regression when option first is used with the option fmb.  However, it does not report other regression statistics. 

 

The solution

The good news is that we can still find different regression components using asreg. Since the first stage regression of the FMB procedure is the cross-sectional regression, we can use the bysort period prefix with asreg.

 

An example

Let us use the grunfeld data and estimate the FMB regression in the usual manner.

webuse grunfeld, clear
asreg invest mvalue kstock, fmb first

First stage Fama-McBeth regression results

  +------------------------------------------------------------+
  | _TimeVar   _obs       _R2   _b_mva~e   _b_kst~k      _Cons |
  |------------------------------------------------------------|
  |     1935     10   .865262    .102498   -.001995    .356033 |
  |     1936     10   .696394    .083707   -.053641    15.2189 |
  |     1937     10   .663763    .076514    .217722   -3.38647 |
  |     1938     10   .705577    .068018    .269115   -17.5819 |
  |     1939     10   .826602    .065522    .198665   -21.1542 |
  |     1940     10   .839255    .095399    .202291   -27.0471 |
  |     1941     10   .856215    .114764    .177465   -16.5195 |
  |     1942     10   .857307    .142825    .071024   -17.6183 |
  |     1943     10   .842064     .11861    .105412   -22.7638 |
  |     1944     10   .875515    .118164    .072207   -15.8281 |
  |     1945     10   .906797    .108471    .050221   -10.5197 |
  |     1946     10   .894752    .137948    .005413   -5.99066 |
  |     1947     10   .891239    .163927   -.003707   -3.73249 |
  |     1948     10   .788823    .178667   -.042556    8.53881 |
  |     1949     10   .863257    .161596   -.036965    5.17829 |
  |     1950     10   .857714    .176217   -.022096   -12.1747 |
  |     1951     10   .873773    .183141   -.112057    26.1382 |
  |     1952     10   .846122    .198921   -.067495    7.29284 |
  |     1953     10   .889261    .182674    .098753   -50.1525 |
  |     1954     10    .89845    .134512    .331375   -133.393 |
  |---------------------------------------------------------
Mean | 1944.5    10    .836907   .130605    .072958    -14.757 |
  +------------------------------------------------------------+
Fama-MacBeth (1973) Two-Step procedure           Number of obs     =       200
                                                 Num. time periods =        20
                                                 F(  2,    19)     =    195.04
                                                 Prob > F          =    0.0000
                                                 avg. R-squared    =    0.8369
------------------------------------------------------------------------------
             |            Fama-MacBeth
      invest |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      mvalue |   .1306047   .0093422    13.98   0.000     .1110512    .1501581
      kstock |   .0729575   .0277398     2.63   0.016     .0148975    .1310176
       _cons |  -14.75697   7.287669    -2.02   0.057    -30.01024     .496295
------------------------------------------------------------------------------

 

An alternate way to first-stage

bys year: asreg invest mvalue kstock, se
bys year: keep if _n == _N
list _*
+-------------------------------------------------------------------------------------------------------------+
| year _Nobs _R2 _adjR2 _b_mvalue _b_kstock _b_cons _se_mv~e _se_ks~k _se_cons |
|-------------------------------------------------------------------------------------------------------------|
| 1935 10 .86526202 .82676546 .10249786 -.00199479 .35603339 .0157931 .2148591 23.82794 |
| 1936 10 .69639369 .60964903 .08370736 -.05364126 15.218946 .0211982 .4125528 49.72796 |
| 1937 10 .6637627 .56769491 .0765138 .21772236 -3.3864706 .0218952 .4745161 62.14382 |
| 1938 10 .70557727 .62145649 .06801777 .26911462 -17.581903 .0220019 .2076121 33.62243 |
| 1939 10 .82660153 .77705911 .06552194 .19866456 -21.154227 .0131751 .1563955 29.10151 |
|-------------------------------------------------------------------------------------------------------------|
| 1940 10 .83925512 .79332801 .095399 .20229056 -27.047068 .0171077 .2206074 42.49812 |
| 1941 10 .85621485 .81513338 .11476375 .17746501 -16.519486 .0197202 .2338307 47.43406 |
| 1942 10 .85730699 .81653756 .14282513 .07102405 -17.618283 .0246973 .1966943 43.85369 |
| 1943 10 .84206394 .79693935 .11860951 .10541193 -22.763795 .0207092 .1887016 46.8604 |
| 1944 10 .87551498 .83994783 .11816422 .07220719 -15.828145 .0169881 .1537212 41.84578 |
|-------------------------------------------------------------------------------------------------------------|
| 1945 10 .90679731 .88016797 .1084709 .05022083 -10.519677 .0133214 .1254533 35.10524 |
| 1946 10 .89475165 .8646807 .13794817 .00541339 -5.9906571 .018637 .1600683 45.73243 |
| 1947 10 .89123943 .86016498 .16392696 -.00370721 -3.7324894 .0280743 .1285463 37.80575 |
| 1948 10 .7888235 .72848735 .1786673 -.04255555 8.5388099 .0463983 .1661775 52.39133 |
| 1949 10 .86325678 .82418728 .16159617 -.03696511 5.1782863 .0346516 .1268614 41.07802 |
|-------------------------------------------------------------------------------------------------------------|
| 1950 10 .85771384 .81706065 .17621675 -.02209565 -12.17468 .0393216 .1361792 46.6222 |
| 1951 10 .87377295 .83770808 .18314051 -.11205694 26.138157 .0358898 .1486738 53.00348 |
| 1952 10 .84612242 .80215739 .19892081 -.06749499 7.2928402 .052286 .1906835 67.84544 |
| 1953 10 .88926056 .85762072 .18267385 .09875335 -50.152546 .058579 .2164437 77.91569 |
| 1954 10 .89845005 .86943578 .13451162 .33137459 -133.39308 .0704524 .1932826 76.18067 |
+-------------------------------------------------------------------------------------------------------------+

 

Explanation

In the above lines of code, we estimated a yearly cross-sectional regression with the option se to report the standard errors. Then we retained just one observation per year and deleted duplicates. The results are the same as reported by the option first in the fmb regression, with the only difference that we have now additional regression statistics.



Quick Table for Converting Different Dates to Stata Format

Category:Uncategorized

Daily Dates

Copying data from the internet, CSV files, or other sources into Stata usually records the date as a string variable, shown in red in the Data Editor. Before we can use Stata's time-series or panel-data capabilities, we need to convert the string date to a Stata date. In the following table, the first column shows different formats in which the date may already be recorded in Stata. To convert each of them to a Stata date, example code is shown in the second column. Once the date is converted into Stata's internal format, we need to format it so that its visual display is human-readable. We can do that with the %td format, for example: format mydate %td

text        Code                                    Output
15/1/2015   gen mydate=date(text, "DMY")            15jan2015
2015/1/15   gen mydate=date(text, "YMD")            15jan2015
201502      gen mydate=dofm(monthly(text, "YM"))    01feb2015
1/15/08     gen mydate=date(text, "MDY", 1999)      15jan1908
1/15/08     gen mydate=date(text, "MDY", 2019)      15jan2008
1/15/51     gen mydate=date(text, "MDY", 2000)      15jan1951
1/15/01     gen mydate=date(text, "MDY", 2050)      15jan2001
1/15/00     gen mydate=date(text, "MDY", 2050)      15jan2000
20060125    gen mydate=date(text, "YMD")            25jan2006
060125      gen mydate=date(text, "20YMD")          25jan2006


Example using some data

* Enter example data
clear
input str9 text
"15/1/2015"
end

* Now convert the variable text to Stata date
gen mydate=date(text, "DMY")

* Change the display format
format mydate %td

 

From daily to other frequencies

From daily to   Code
Weekly          gen weekly_date = wofd(daily_date)
Monthly         gen monthly_date = mofd(daily_date)
Quarterly       gen quarterly_date = qofd(daily_date)
Yearly          gen year = year(daily_date)

 

 

Example using some data

* Enter example data
clear
input str9 text
"15/1/2015"
end

* Now convert the variable text to Stata date
gen daily_date=date(text, "DMY")
format daily_date %td


* Create a weekly date
gen weekly_date = wofd(daily_date)
format weekly_date %tw

* Create a monthly date
gen monthly_date = mofd(daily_date)
format monthly_date %tm

* Create a quarterly date
gen quarterly_date = qofd(daily_date)
format quarterly_date %tq

* Create a yearly date
gen year = year(daily_date)

 

 

From other frequencies to daily

If we already have dates in weekly, monthly, or quarterly frequency, we can convert them back to daily dates. The second column in the following table gives an example of the format in which the date is already recorded, and the third column presents the code that creates a daily date. To see the code in action, download this do file and execute it. The file extension should be changed from doc to do after download.

From        given_date   Code
Weekly      2018w46      gen daily_date = dofw(given_date)
Monthly     2018m11      gen daily_date = dofm(given_date)
Quarterly   2018q4       gen daily_date = dofq(given_date)
Yearly      2018         gen daily_date = dofy(given_date)
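As an example, the sketch below converts a monthly string date back to a daily date, which Stata sets to the first day of that month; the variable names are illustrative:

```stata
* Enter an example monthly date as a string
clear
input str7 text
"2018-11"
end

* Convert the string to a Stata monthly date
gen given_date = monthly(text, "YM")
format given_date %tm

* Convert the monthly date back to a daily date
gen daily_date = dofm(given_date)
format daily_date %td
```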

 

Complex Conversions

If we already have dates in weekly, monthly, or quarterly frequency, we can convert them to other frequencies by first converting them to daily dates. The second column in the following table gives an example of the format in which the date is already recorded, and the third column presents the code that converts the date to the desired frequency.

From                   given_date   Code
Weekly to monthly      2018w46      gen monthly_date = mofd(dofw(given_date))
Monthly to weekly      2018m11      gen weekly_date = wofd(dofm(given_date))
Quarterly to monthly   2018q4       gen monthly_date = mofd(dofq(given_date))
Monthly to quarterly   2018m11      gen quarterly_date = qofd(dofm(given_date))
Weekly to quarterly    2018w46      gen quarterly_date = qofd(dofw(given_date))
Quarterly to weekly    2018q4       gen weekly_date = wofd(dofq(given_date))
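As a quick check of the first row, the following sketch builds a weekly date with the yw() function and converts it to a monthly date; the variable names are illustrative:

```stata
* Create an example weekly date: week 46 of 2018
clear
set obs 1
gen given_date = yw(2018, 46)
format given_date %tw

* Weekly -> daily -> monthly
gen monthly_date = mofd(dofw(given_date))
format monthly_date %tm
```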

 

 


asdoc version 2: Summary of new features | export Stata output to MS Word

Category:Uncategorized

Version 2.0 of asdoc is here. This version brings several improvements, adds new features, and fixes minor bugs from the earlier version. The following is a summary of the new features and updates.

 

Brief Introduction of asdoc

asdoc sends Stata output to Word / RTF format. It creates high-quality, publication-ready tables from various Stata commands such as summarize, correlate, pwcorr, tab1, tab2, tabulate, tabstat, ttest, regress, table, ameans, proportion, mean, and many more. Using asdoc is easy: we just add asdoc as a prefix to Stata commands. asdoc also has several built-in routines for dedicated calculations and nicely formatted tables.

 

How to update

The program can be updated by using the following command from Stata command window

ssc install asdoc, replace

 

New Features in Version 2.0

1.  Wide regression tables

This is a new format for reporting regression tables. Variables are shown in columns and one regression is reported per row, which makes this type of regression table ideal for portfolios, industries, years, etc. Here is one example of a wide regression table. asdoc allows a significant amount of customization for wide tables, including asterisks for showing significance, reporting t-statistics and standard errors either below the regression coefficients or sideways, controlling decimal points, reporting additional regression statistics such as adjusted R2, RMSE, and RSS, adding multiple tables to the same file, and several other features. Read this post to know more about the wide table format.

 

2. Allowing by-group regressions

Version 2.0 of asdoc provides the convenience of estimating regressions over groups and summarizing the regression estimates in nicely formatted tables. This feature follows the standard Stata bysort prefix and works with all three types of asdoc regression tables: detailed tables, nested tables, and wide tables. In this blog post, I show some examples of by-group regressions.
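A minimal sketch of a by-group regression table; the data set and variable names are assumptions for illustration:

```stata
* One regression per company, collected into a single Word table
webuse grunfeld, clear
bys company : asdoc regress invest mvalue kstock, replace
```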

 

3. Allowing by-group descriptive statistics

Using the bysort prefix with asdoc, we can now find default, detailed, and customized summary statistics over groups. Details related to this feature will be added later on in a blog post.

 

4. Option label with tabulate and regress commands

Option label can now be used with regression and tabulation commands. With this option, asdoc reports variable labels instead of variable names. If a variable's label is empty, its name is reported instead.
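For example, the following sketch reports variable labels in a regression table; the auto data set is used only for illustration:

```stata
* Labels from the auto data appear in place of variable names
sysuse auto, clear
asdoc regress price mpg weight, label replace
```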

 

5. Developing tables row by row using option row

Option row is a new feature in version 2 that allows building a table row by row from text and statistics. Each run of asdoc with option row adds one row to the output table. This is useful when statistics are collected from different Stata commands to build customized tables. To know more about this option, read this blog post.

 

6.  Accumulate text or numbers with option accum

Option accum allows accumulating text or numbers in a global macro. Once accumulated, the contents of the macro can be written to an output file using option row.

 

7. Saving files in different folders

One additional feature of version 2.0 is the ability to write new files or append to existing files in different folders.

 



Plotting cumulative average abnormal returns (CAAR) on a graph in Stata

Category:Uncategorized

In Stata, we can use a two-way graph for plotting abnormal returns or cumulative average abnormal returns (CAAR) against the days of the event window. Suppose that we have an event window of 7 days and the following data:

 

 
days caar1 caar2 caar3 caar4
-3 0 0 0 0
-2 -.0043456 -.0050911 .0000683 .0000504
-1 -.0034961 -.0023533 .0037439 .0042783
0 -.0034278 .0019828 .0090661 .0106628
1 .0016178 .0067894 .0131572 .0156011
2 .0039689 .0104367 .0190594 .0221428
3 .0040022 .0129478 .0218878 .0267722

To plot the first series, caar1, we type:

graph twoway line caar1 days, xline(0)

 

To plot all the caar series at once, we can use a wildcard:

graph twoway line caar* days, xline(0)

 



Rolling regressions, beta, t-statistics, and SE in Stata

Category:Uncategorized

asreg can easily estimate rolling regressions, betas, t-statistics, and standard errors in Stata. To understand the syntax and basic use of asreg, you can watch this YouTube video. In this post, I show how to use asreg for reporting standard errors, fitted values, and t-statistics in a rolling window.

To install asreg, type the following in the Stata command window:

ssc install asreg

 

Report standard errors and fitted values 

We shall use the grunfeld data set for our examples. Using a rolling window of 15 observations, let us fit a regression model where the dependent variable is invest and the independent variables are mvalue and kstock. We shall estimate the rolling regression separately for each company; therefore, we use the prefix bys company :

Note that options se and fit are used for reporting standard errors and fitted values, respectively.

webuse grunfeld, clear

bys company: asreg invest mvalue kstock, wind(year 15) fit se

 

Find t-statistics in the rolling window

Once we have the coefficients and standard errors, we can generate t-statistics by dividing each coefficient by its standard error. To find the t-values for the variables mvalue and kstock, we generate new variables:

gen t_mvalue = _b_mvalue / _se_mvalue

gen t_kstock = _b_kstock / _se_kstock