asdoc: Creating high-quality tables of summary statistics in Stata

Ever wanted to create high-quality tables of summary statistics in Stata with a single command? asdoc creates tables of summary statistics such as the mean, standard deviation, minimum, and maximum, and sends them to a Word document. asdoc offers four different methods of creating tables of summary statistics. These are discussed below with examples and relevant options. To learn about installing the program and its other features, you can visit this blog post.

Simple tables of summary statistics:
To create a simple table of summary statistics, we normally use the summarize (or sum) command in Stata. To send the output of sum to a Word document, we add asdoc in front of the command, as in the following. A picture of the output file is also shown below.

sysuse auto
asdoc sum

Summary / descriptive statistics for selected variables 

asdoc sum price mpg rep78 headroom trunk

 

Summary / descriptive statistics with [if] [in] conditions

asdoc sum price mpg rep78 headroom trunk if price>4000

 

Reporting customized decimal points

asdoc sum, dec(2)
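
The pieces shown above (a variable list, an if condition, and dec()) can be combined in one call. The following is a minimal sketch, assuming asdoc is installed from SSC. The file name summary.doc and the title text are placeholders chosen for illustration; save(), title(), and replace are the options that appear in the comments below.

```stata
* Load the example dataset shipped with Stata
sysuse auto, clear

* Summary statistics for selected variables, restricted by an if condition,
* reported to two decimal places and written to a named Word file.
* summary.doc and the title text are placeholders, not required names.
asdoc sum price mpg rep78 headroom trunk if price>4000, dec(2) ///
    title(Descriptive statistics) save(summary.doc) replace
```

The /// line continuation works in a do-file; when typing in the Command window, put the whole command on one line.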

26 Comments

Nicola Deghaye MSc (Health Econ)

November 20, 2018 at 12:34 pm

Hi
I am using Stata 14 and am battling with “asdoc” with tabstat.
Here is my code:

tabstat TotalLWD_2013, by (quintile) stat(N mean median sem) format

which produces:

quintile |         N      mean       p50  se(mean)
---------+----------------------------------------
       1 |   8018.00      1.44      0.00      0.13
       2 |   6467.00      1.85      0.00      0.16
       3 |   5324.00      3.45      0.00      0.25
       4 |   1807.00      6.08      0.00      0.65
       5 |   2103.00     11.61      0.00      0.71
---------+----------------------------------------
   Total |  23719.00      3.26      0.00      0.12
--------------------------------------------------

asdoc tabstat TotalLWD_2013, by (quintile) stat(N mean median semean) ///
save(prevalence by category raw data 2013.doc) replace  
 
or

asdoc tabstat TotalLWD_2013, by (quintile) stat(N mean median semean) format ///
save(prevalence by category raw data 2013.doc) replace  

produce:

Summary statistics 
  	 N	 Mean	 Median	 se(Mean)
TotalLW~2013	25180	3.203	0	.111

In other words, it is not picking up my “by” option – it is giving me a one-way tab instead of a table of statistics of disability prevalence by school category (quintile)
I have tried various combinations of spacing with the by(quintile), but it isn’t solving the problem.

Please help.

Best regards,

    Attaullah Shah

    November 20, 2018 at 12:40 pm

    It seems that asdoc is working as intended with both commands. However, in your command there is a space between by and (quintile), so asdoc does not recognize the by() option.

    So, you can correct it by removing the space:

    asdoc tabstat TotalLWD_2013, by(quintile) stat(N mean median semean) ///
    save(prevalence by category raw data 2013.doc) replace  

Faisal Khan

December 15, 2018 at 1:57 pm

Dear Sir,
I appreciate your effort and supportive attitude towards the community. I am interested in finding descriptive statistics for mutual funds, category-wise.
When I use the asdoc command for multiple categories and variables, it works, but the format of the table is not consistent. The format of the table for the 1st category is fine but not for the 2nd; similarly, the 3rd category is fine but not the 4th, and so on.
The command is
asdoc by category , sort : summarize CH LNTNA EXP FL DL b1 TURNs TURNp RU12 FFV FF1 FF6 FF12 NoFMs DY Beta LnRed, replace

category: I have encoded the fund category

    Attaullah Shah

    December 15, 2018 at 2:01 pm

    Faisal Khan:
    Due to the specific command structure of asdoc, it accepts the by option in two flavors.
    The first is to use bys as a prefix. So the following command should work for you

    bys category :  asdoc  summarize CH LNTNA EXP FL DL b1 TURNs TURNp RU12 FFV FF1 FF6 FF12 NoFMs DY Beta LnRed, replace

    The second method is to use by as an option after using the comma. So the following should also work.

    asdoc  summarize CH LNTNA EXP FL DL b1 TURNs TURNp RU12 FFV FF1 FF6 FF12 NoFMs DY Beta LnRed, replace by(category)

    Let me know if there is any problem with any of these commands.
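
    For readers who want to try both flavors without the fund data, here is a minimal sketch on Stata's built-in auto dataset. The choice of foreign as the grouping variable and of price and mpg as the summarized variables is an assumption made purely for illustration.

    ```stata
    * Load the example dataset shipped with Stata
    sysuse auto, clear

    * Flavor 1: bys as a prefix, placed before asdoc
    bys foreign: asdoc summarize price mpg, replace

    * Flavor 2: by() as an option after the comma
    asdoc summarize price mpg, replace by(foreign)
    ```

    Both commands use replace without save(), so they write to the same default output file and the second run overwrites the first; run them one at a time to compare the two tables.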

Anum Ellahi Mazhar

January 10, 2019 at 10:55 am

Hi, I want to make a specialised table that shows frequencies and percentages with value labels. For example, for gender I want to show the frequency and percentage of females and males.

    Attaullah Shah

    January 10, 2019 at 4:35 pm

    Anum Ellahi Mazhar
    Can you please email me the format in which you are asking for the results?

Arantza Ugidos

February 9, 2019 at 8:24 am

Hello, my name is Arantza Ugidos and I have just seen your video about the use of asdoc. I have a question. I want to make a table of summary statistics using weights. I have realized that when I use “asdoc sum”, the reported summary statistics do not take the weights into account. I get the same results when I use the weight option as when I do not use it. I write:

asdoc sum $xlist [weight=pesoper],  replace label dec(2) ///
save(TablaSumStat.doc) title (Descriptive Statistics) 

Have you had the same problem, or am I doing something wrong?

Thank you.

Best wishes,

    Attaullah Shah

    February 9, 2019 at 8:28 am

    Arantza Ugidos
    Thanks for your email. asdoc does not support weights at the moment, but adding weight support is on my list for the next update.

Farah Zamir

February 20, 2019 at 1:55 pm

Dear Dr Shah

I appreciate your efforts regarding asdoc. I am also trying to use this feature. However, when I use
asdoc tab N_Comb_code COUNTRY

to tabulate SIC codes against 9 countries, the table runs outside the margins, even in Word. I tried changing the font to Courier New at size 9, but nothing works. Can you please help me with this? Additionally, how can I add column headings?

    Attaullah Shah

    February 22, 2019 at 8:09 am

    Farah Zamir
    When there are many categories of the given variables, the output will naturally be larger. Normally, it is a good practice to put the variable with many categories in rows. Therefore, if the output goes out of margins, you can swap the places of the two variables. So instead of

    asdoc tab N_Comb_code COUNTRY

    Try the following:

    asdoc tab COUNTRY N_Comb_code

    Further, I am going to release the new version of asdoc in the coming week, so do download the new version from ssc by typing

    ssc install asdoc, replace

    This new version creates elegant tables from the tab and table commands.

Andrew (Public Health)

March 4, 2019 at 7:19 pm

Dear Dr Shah,

Thank you very much for asdoc. It works great for me. Tables for conditional (fixed-effects) logistic regression look fantastic and are ready to be published. However, the table for mixed logit models is slightly different (I use the user-written mixlogit.ado). It creates coeff, se, z, p, and conf intervals just fine but doesn’t use labels instead of varnames (option label doesn’t seem to work for mixed logit models) and summary statistics such as AIC or BIC. Is there any workaround to get full statistics for mixed logit models?

Kind regards,
Andrew

Andrew (Public Health)

March 4, 2019 at 7:25 pm

In addition to my former message:

This works great with all statistics:
asdoc clogit choice v1-v12, gr(csets) save(clogit.doc) replace label

Here I get coefficients for mean and SD, but labels and fit criteria such as AIC, BIC are missing:

asdoc mixlogit choice, rand(v1-v12) gr(csets) id(individ)  save(mix.doc) replace label

    Attaullah Shah

    March 4, 2019 at 7:37 pm

    Andrew
    asdoc tries to make the regression table from the matrix r(table), which is left behind by standard Stata regressions. The help file of the user-written package you have referred to, "mixlogit.ado", does not mention that it stores results in r() or e() macros. Given that, asdoc falls back on its generic routine to make the table, so the standard asdoc options available with regression commands do not work with that package.

Andrew

March 4, 2019 at 7:45 pm

Dr. Shah,

thank you very much for the fast reply. I appreciate it. Then I will just add the missing statistics in this specific model manually. I can live with that. Anyway, asdoc is a great help in preparing my papers.

Kind regards,
Andrew

    Attaullah Shah

    March 5, 2019 at 7:13 am

    Andrew
    Please do cite asdoc in your research.
    In-text citation
    Tables were created using asdoc, a Stata program written by Shah (2018).

    Bibliography
    Shah, A. (2018). ASDOC: Stata module to create high-quality tables in MS Word from Stata output. Statistical Software Components S458466, Boston College Department of Economics.

Frankline Onchiri

April 11, 2019 at 5:11 am

Hello Prof. Shah,
Is it possible to include variable and value labels in the detailed Regression Table? This will improve on readability of the outputs.
Thank you and so you know, we really appreciate your responses and collegiality.

    Attaullah Shah

    April 11, 2019 at 6:53 am

    Frankline Onchiri
    asdoc can report variable labels. For example,

    sysuse auto, clear
    asdoc reg price mpg trunk weight, label replace

    If you are referring to something different, then please guide me to an example and I shall try to explore the possibility of adding it to asdoc.

Frankline.

April 21, 2019 at 7:27 am

Thank you so much for your response. I have been running conditional logistic regression model but the variable and value labels don’t print on the word document that asdoc creates.
Again, I appreciate your time.

Attaullah Shah

April 21, 2019 at 8:56 am

Frankline
Can you please send me your example data and the Stata code that you are using at the following email address: attaullah.shah@imsciences.edu.pk

Asim Jahangir

May 23, 2019 at 9:06 pm

Dear Dr. Shah,

Appreciate your efforts for the research community and making asdoc available for us.
I need to create summary stats between groups and report t-test of significance if the values differ. Currently, I am using the following codes:

sysuse auto
orth_out price mpg, by(foreign) pcompare stars

which gives the following output in an Excel file

   Domestic:      Foreign:  (1) vs. (~e:
                               _             _             _
        Price:mean      6072.423      6384.682         0.680
Mileage (mpg):mean        19.827        24.773         0.001

The command is great but it doesn’t give the flexibility to format the table or to introduce * (stars) of significance.

Another possibility is to use estpost and estout combination, like this:

sysuse auto
estpost ttest price mpg, by(foreign)
esttab ., wide

with the following output:
-----------------------------------------
                      (1)                
                                         
-----------------------------------------
price              -312.3         (-0.41)
mpg                -4.946***      (-3.63)
-----------------------------------------
N                      74                
-----------------------------------------

This command gives the output in a text or excel file, with stars. But, I am finding it hard to introduce the mean values of each variable by foreign (category) and format the headers of the table.

The question is, is there a way to use asdoc to make tables with means, mean difference and stars of significance?

Thanks and appreciate your support.

    Attaullah Shah

    May 23, 2019 at 10:40 pm

    Asim Jahangir
    A similar question was asked on the Statalist. And Liu Qiang presented an excellent example using the asdoc’s row option. This option is used for making highly customized tables. See the Statalist discussion here.

Asim

May 23, 2019 at 9:24 pm

Thanks

Asim Jahangir

May 24, 2019 at 6:44 am

Thanks Dr. Shah, this helps a great deal.

I have to produce multiple sets of tables for different categories (using the same dependent variable), where some grouping variables are binary and others have multiple categories. How would one go about running multiple t-tests like this and appending the results in a single table or in collated tables?

Consider the following code for instance, where one variable has multiple categories (it gives an error):

gen transmission = round(runiform())
label var transmission "Transmission: 0=manual 1=auto"
gen type = floor(runiform()*4)
label var type "Type: 0=sedan 1=mini-van 2=SUV 3=sports"
foreach var in foreign transmission type {
	foreach i in price mpg {	
		ttest `i', by (`var')
	}
}

Attaullah Shah

May 24, 2019 at 10:20 pm

The error message you are getting comes from Stata’s own limitation: ttest allows only two groups in the by() option. asdoc cannot go beyond Stata’s capabilities. Specifically, the code chokes when it runs ttest with by(type), because type has more than two categories.

I am not sure what hypotheses you are testing, but you can estimate the t-tests using an if qualifier on the type variable, adding the relevant pair of categories to the command in turn. So for the first two categories of type, the command would be:

ttest price if inlist(type, 0,1), by(type )

and for the next two, it will be:

ttest price if inlist(type, 0,2), by(type )

and then:

ttest price if inlist(type, 1,2), by(type )
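
The pairwise comparisons above can also be generated in a loop rather than typed one by one. This is a minimal sketch, assuming type is created as in the question, so that it takes the values 0 to 3; the seed value and the local macro names are arbitrary choices made for illustration. It runs plain ttest only, without asdoc.

```stata
* Recreate the example variable from the question
sysuse auto, clear
set seed 12345
gen type = floor(runiform()*4)

* Loop over every unordered pair (a, b) of categories with a < b,
* so that each ttest sees exactly two groups in by(type)
forvalues a = 0/2 {
    local first = `a' + 1
    forvalues b = `first'/3 {
        display as text "type `a' vs. type `b'"
        ttest price if inlist(type, `a', `b'), by(type)
    }
}
```

With four categories the loop runs six tests, so the p-values may need a multiple-comparison adjustment depending on the hypotheses being tested.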

Maria

May 28, 2019 at 12:46 am

Dear Prof. Shah,

thanks for your helpful asdoc command.

I have one question: I would like to do the following: asdoc tab2 state urban, cell

While I do have the percentages in the Stata output, they are not displayed in the Word file.
I saw the “nokey column/row replace” option, which works, but I would still prefer to have the equivalent of the “cell” option, meaning that only the bottom-right corner of the table shows 100%, to which the total columns and total rows sum up. Is this possible with asdoc?

Also, I was wondering whether options like putting the percentages in brackets are possible.

Thank you for any advice or comment!

Best
Maria

    Attaullah Shah

    May 28, 2019 at 2:39 am

    Maria:
    It would be easier for me to reply if you provide an example dataset with the codes that you have used. You can send the data and codes to

    attaullah.shah@imsciences.edu.pk
