Dear James and Bill, 
Thank you very much for your assistance and the detailed troubleshooting guide.

Here are some specifics about the data format that do not violate confidentiality (a small parsing sketch follows the list):

00: 114 (dataset specification)
01: 02  (LOHI)
02: 01  (always 1)
03: 00  (always 0) [the manual used to say it is ignored, but it is not: it must be zero]
04-05:  V (number of variables) corresponds exactly to what I see in Stata 12 and to the number
        of %fff formats defined in the header.
06-09:  N (number of observations) corresponds exactly to what I see in Stata 12 when I type
        -describe- and when I type "describe using datafile.dta".
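
For illustration, here is a minimal Python sketch reading these header fields; the offsets
00-09 are as listed above, and the 81-byte data label and 18-byte timestamp that follow them
come from the published dta-114 specification (the file name is hypothetical):

    import struct

    with open("datafile.dta", "rb") as f:              # hypothetical file name
        header = f.read(109)                           # format-114 header is 109 bytes

    release   = header[0]                              # 114 (dataset specification)
    byteorder = "<" if header[1] == 2 else ">"         # 02 = LOHI (little-endian)
    filetype  = header[2]                              # always 1
    assert header[3] == 0                              # must be zero (see above)
    V, = struct.unpack(byteorder + "H", header[4:6])   # number of variables
    N, = struct.unpack(byteorder + "I", header[6:10])  # number of observations
    label     = header[10:91].rstrip(b"\x00")          # 81-byte dataset label
    timestamp = header[91:109].rstrip(b"\x00")         # 18 bytes; all zeroes here

    print(release, V, N, timestamp)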

Having typed -creturn list- I further look at the description of the current dataset. As I 
already mentioned, there is no timestamp recorded (in the file the corresponding bytes are 
filled with zeroes). In creturn the data width W is reported alongside the values V and N. 
Multiplying W by N I get a value which is roughly 10MB, about 2.5 times larger than the file 
size on disk. Since the file on disk must also contain the header and the variable and value 
labels (and hence be necessarily larger than the data alone), I think it is possible to 
conclude that the file is indeed truncated, and that Stata 13.1 is actually reacting to an 
error that previous Statas used to pass, which is a good thing!!
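
Since the data area alone should occupy W*N bytes, and the file must hold the header and
labels on top of that, a file smaller than W*N has to be truncated. A minimal Python sketch
of this check, with placeholder numbers and a hypothetical file name (it also anticipates the
count of surviving observations that comes up with ADePT below):

    import os

    # placeholder values, NOT the real (confidential) ones
    W, N = 100, 100000                     # bytes per observation, declared N
    size_on_disk = os.path.getsize("datafile.dta")   # hypothetical file name

    expected_data = W * N                  # bytes the data area alone should occupy
    print(expected_data / size_on_disk)    # roughly 2.5 in this case

    # rough upper bound on the observations that actually survived
    HEADER_MIN = 109                       # the 109-byte header alone; the real
                                           # data area starts later (descriptors)
    surviving = (size_on_disk - HEADER_MIN) // W
    print(surviving)                       # an ADePT-style estimate of L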

The next check is to see why my SaveTo9 worked. It worked because it converts the header 
information only: the body of the data is identical in the 9 and 10 formats, so it is copied 
as-is, without any modification. Hence, if it was broken in the original file, it remains 
broken in the converted file. Again, Stata 9-12 can open this converted file; Stata 13.1 
refuses with an error message.

Next, I check what happens to the data in the tail of the dataset that appears to be 
missing. To do that I open the dataset in ADePT, which uses totally independent code for 
parsing Stata datasets. It opens the data, but only L observations, where L is roughly 9/25 
of N (this corresponds to the ratio of the actual data size to the total expected size based 
on W and N). Hence it shows only the surviving observations (though it did not report that 
the file is smaller than declared, which is illegal for Stata files, though valid for SPSS 
files, where N may be, e.g., -1).

Next, I look into Stata. The dataset appears to be totally valid in all respects. There is 
no garbage. None. There are no misspelled strings, no invalid (non-printable, non-ASCII) 
characters, etc. There are no wild numbers like 1.72634e209, nor even a single negative 
value. The data looks completely 'plausible'. There are, however, two clues: 

1) the dataset is expected to have a unique ID in one of the variables; this variable is 
   not unique.
2) value labels are completely missing from the data (-label list-), although they are 
   declared (reported by -describe-). Both clues can be checked outside Stata as well, as 
   sketched below.
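
For what it's worth, a sketch of such a check using pandas (the file and variable names are
hypothetical, and pandas would of course need a file it can read):

    import pandas as pd
    from pandas.io.stata import StataReader

    df = pd.read_stata("datafile.dta")     # hypothetical file name

    # clue 1: the supposedly unique ID variable has duplicates
    print(df["id"].is_unique)              # False here ("id" is a hypothetical name)

    # clue 2: value labels are declared but empty
    with StataReader("datafile.dta") as rdr:
        rdr.read()                         # parse the data (and value labels)
        print(rdr.value_labels())          # {} although -describe- declares them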

This reminded me of an earlier bug report to StataCorp from around Aug 02, 2011, which 
described a similar but more obvious situation, where a dataset cut to fit onto 1.44MB 
floppies was read by the user piece by piece, with obviously wrong results.

I then investigate how come the data in the trailing observations look 'plausible'. I run 
-duplicates report-. Besides the unique observations, I also get:


. duplicates report

Duplicates in terms of all variables

--------------------------------------
   copies | observations       surplus
----------+---------------------------
        1 |          NNN             0
       50 |           N1            N2
       51 |           N3            N4

(This is probably excessive anonymisation, but I believe the numbers themselves are not 
that important here; if I get the OK to reveal them, I will replace them later.)


From this I infer that Stata started reusing the last few (perhaps 50+51=101?) observations 
to fill the missing area of the dataset. This is what remained unknown to me in 2011 (or 
perhaps Stata behaved differently in 2011? That is possible).


Here are some answers to Bill's questions:

The questions are shortened below; see the original message for the full text.

============

How many observations does the old Stata report?  It needs to match or the dataset is corrupted.
The reported number of observations matches exactly. It seems that whatever is declared in 
the file is somehow "set up" at load time, and later everything reports this set-up value.

============

Now, look at the last observation.  Type,

        . list in l

In theory, it makes no difference whether Sergiy does this with an OLD
Stata or Stata 13.  If I were Sergiy, I'd do it both ways just for my
own peace of mind.

I can only do this in Stata 12, since Stata 13 refuses to open the data. In Stata 12 I saw 
a seemingly normal observation. Nothing suspicious about it.

============

...last observation...
last observation looks good (plausible) in older Stata

============

...resaving from Stata 12...
works; creates a file 2.5 times larger than the original file (consistent with the W*N estimate above)

============

...last byte of file...
the file ends with a 00 (zero byte)
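
For the record, I checked this along the following lines in Python (the file name is
hypothetical):

    with open("datafile.dta", "rb") as f:
        f.seek(-1, 2)                      # whence=2: one byte before end of file
        print(f.read(1))                   # prints b'\x00' for this file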

============

Possible scenario (guess)

Stata uses pointers to observations to improve sorting performance. Somehow these pointers 
get confused during data load when the data runs out, causing Stata to reuse previous 
observations. This probably occurs only at load time, since once the data is already in 
memory, -replace x=1 in l- causes only one change, and it occurs only in the last observation.
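
To make the guess concrete, here is a toy model of this hypothesized load behaviour in
Python. This is pure speculation, not actual Stata internals: when the file runs short, the
remaining observation slots wrap back cyclically onto the last few loaded records.

    from collections import Counter

    def load_with_reuse(records_on_disk, declared_n, window=101):
        """Toy model (speculative): fill the missing observations by
        cycling over the last `window` successfully loaded records."""
        table = list(records_on_disk)
        while len(table) < declared_n:
            table.append(table[-window])   # reuse an earlier observation
        return table

    # 9 surviving observations out of 25 declared, matching the 9/25 ratio above
    data = load_with_reuse([f"obs{i}" for i in range(9)], 25, window=3)
    print(Counter(data))                   # the last 3 records appear 6-7 times
                                           # each, like the 50/51 copies above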

Overall I think Stata 13.1 is doing a better job than earlier versions of Stata by properly 
reacting to a corrupted dataset.
 
At this time I am still puzzled by the absence of the timestamp. If the dataset was simply 
truncated length-wise, the timestamp should survive, since it sits in the header, near the 
very beginning of the file. Only the owner of the dataset may know more about its 
adventures. If I get this information, I will post back to the list.

I also don't understand why, as Bill writes, "These are old files and so Stata 13 is more limited 
on the kinds of problems it can detect...". Stata 13 could have inherited the code from earlier 
versions, but perhaps I am overlooking something. In any case, I am totally OK with it being more 
demanding now: it can actually prevent serious errors, as in this case where the data looked plausible.