Copying a table from Stata's output

Sergiy Radyakin

In the following Statalist discussion, http://www.stata.com/statalist/archive/2013-11/msg00308.html, Elizabeth asks why copying a table does not work in Stata.

The problem that Elizabeth describes has been in Stata for ages, and it is not a problem or a bug at all once it becomes clear what Stata is actually doing. Stata does not have tables in its output. The output is just a text buffer filled with characters. There are no connections between them; each carries only some attributes (color, plus a few others). This does not contradict the fact that the table you see on the screen is produced from a neatly organized matrix: once it is written to the output, it is serialized into a sequence of letters and lines and becomes completely disconnected from the original data structure.

However, Stata provides an option to "Copy table as HTML" from the context menu. This is similar to the functionality of, e.g., SPSS, with one big difference: SPSS genuinely stores tables in its output, and Stata does not. An example is the output of PSPP, a free alternative to SPSS, although PSPP does not allow you to do anything special with the table, while SPSS itself does.

The only choice that Stata has is to parse the text and try to reconstruct the table from it. The developers have to make several assumptions in order to achieve this. These assumptions are not documented (or I didn't come across them), but we can try to reconstruct them from Stata's observed behavior.

Consider the following example:


display "12345 67890{break}1 2"

With the output:

12345 67890
1 2

If one "copies the table" now, then, contrary to expectation, the numbers 1 and 2 will end up in the same cell. The reason is that the value 12345 above them is wide enough to make Stata think that the first column is at least 5 characters wide. So even though there is a space separating 1 and 2, they will end up as one value in one cell.
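The behavior above is consistent with one simple guess about the parser: a character position separates two columns only if it is blank on every line. The Mata sketch below tests that guess on the two lines from the example. The function name blankcols() is hypothetical, and the rule itself is my assumption, not Stata's documented (or actual) algorithm.

```stata
mata:
// ASSUMPTION: a reconstruction for illustration, not Stata's real parser.
// Returns a row vector with 1 at every character position that is
// blank (or past the end of the line) on all lines, 0 elsewhere.
real rowvector blankcols(string colvector lines)
{
    real scalar    w, i, j
    real rowvector blank

    w     = max(strlen(lines))       // width of the longest line
    blank = J(1, w, 1)               // start by assuming every position is blank
    for (i = 1; i <= rows(lines); i++) {
        for (j = 1; j <= strlen(lines[i]); j++) {
            if (substr(lines[i], j, 1) != " ") blank[j] = 0
        }
    }
    return(blank)
}

// The jagged example: only position 6 is blank on both lines,
// so "1 2" falls entirely inside the first column.
blankcols(("12345 67890" \ "1 2"))
end
```

Under this guess the space between 1 and 2 sits at position 2, where the first line has a digit, so it cannot act as a column separator.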

So what do we do? We format the output to match Stata's expectations:


mata printf("%6.0f %6.0f{break}%6.0f %6.0f", 12345, 67890, 1, 2)

with the output:

12345 67890
    1     2

which can now be copied easily to Excel.

Obviously we don't have to use Mata: any method is fine as long as the values end up organized into aligned columns. Here, Stata's display command produces equally well-aligned output, with the same consequences.


display %6.0f 12345 %6.0f 67890 _n %6.0f 1 %6.0f 2
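The same idea extends from two hard-coded rows to any matrix. The following is a minimal sketch, assuming the values sit in a Stata matrix (the name A and the %8.0f format are arbitrary choices for illustration): every cell is printed with one fixed-width format, so all rows share the same column boundaries.

```stata
* Illustrative matrix holding the values from the example above
matrix A = (12345, 67890 \ 1, 2)

* Print every cell with the same fixed-width format
forvalues i = 1/`=rowsof(A)' {
    forvalues j = 1/`=colsof(A)' {
        * _continue keeps the cursor on the current line
        display %8.0f A[`i', `j'] _continue
    }
    * end the current row
    display ""
}
```

The width should be at least one character more than the widest value, so that a blank always remains between neighboring columns.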

Stata itself now (since 2007, approx. v12) uses an [undocumented] table formatting engine (classes), which guarantees that each column has a fixed width. This takes almost all of the problem away. However, user-written programs may produce jagged or non-conforming output, which may not be acceptable to Stata's table parsing engine. If the program is still supported, ask the author to make sure that any tables it produces in the output can later be correctly recognized as such by Stata. Or, if the program is your own, make sure you format the output properly in your code.
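For your own programs, a minimal sketch of such formatting might look as follows. The program name showmeans and the chosen column widths are hypothetical; the point is only that the header and every row use the same fixed-width formats. Whether the table parser accepts a particular layout is still something to check by hand with "Copy table".

```stata
* Hypothetical example program: fixed-width table of means and SDs
capture program drop showmeans
program define showmeans
    version 12
    syntax varlist(numeric)

    * Header: 12-character name column, a separator, two 10-character columns
    display as text %-12s "Variable" " {c |}" %10s "Mean" %10s "SD"
    display as text "{hline 13}{c +}{hline 20}"

    foreach v of local varlist {
        quietly summarize `v'
        * Same widths as the header, so the columns line up on every row
        display as text %-12s abbrev("`v'", 12) " {c |}" ///
            as result %10.3f r(mean) %10.3f r(sd)
    }
end

* Usage with a dataset shipped with Stata
sysuse auto, clear
showmeans price mpg weight
```

Using abbrev() keeps long variable names from overflowing the first column, which would otherwise shift the remaining cells and make the rows jagged again.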

Looking at the impressive amount of work done recently to revise the output of tables in Stata, I would not be surprised if Stata 14 comes out equipped with an extremely powerful table output facility. Or is it only my wish?