manu@... wrote:
> Following up this:
> http://www.h-online.com/open/Ext4-data-loss-explanations-and-workarounds--/news/112892
>
> It seems ext4fs people do not assume that closing a file will make the
> data flushed to disk. It is claimed fsync(2) should be called first.
That's not new. FreeBSD's "soft updates" feature behaves
similarly: Files created within about 30 seconds before a
crash (e.g. power failure) can be lost, unless you forcibly
sync. That's because the meta data updates are re-ordered
and delayed. This is a feature, not a bug.
> The standards say "Any unwritten buffered data for the stream shall be
> written to the file" on a fclose(3) call:
> http://www.opengroup.org/onlinepubs/000095399/functions/fclose.html
Careful: Buffering occurs on many different layers. The
stdio library has buffers, the file system has buffers,
the operating system usually has a VM buffer cache, and
finally the disk controllers and disk drives have caches.
The "unwritten buffered data" in the fclose(3) description
refers to stdio buffers only. It does *not* guarantee that
any data hits the physical disks. In theory, stdio is
agnostic to the actual file system.
So, the behaviour of ext4fs (and many other file systems
that do the same) is not a standard violation, as far as I
can tell. I haven't looked at the actual code, though;
there might be other bugs in ext4fs. :-)
> Request for comments: do we need to modify our Fclose() macro so that it
> calls fsync() before fclose() on Linux? Is it useful? Can it harm?
The correct sequence would be to use fflush(3) first, then
fsync(2), then fclose(3). If you don't use stdio functions,
then fsync(2) before close(2) is sufficient.
It does no harm, except that it might have a small impact
on performance, because it prevents the operating system
from optimizing the time and order of syncing dirty FS
buffers to the physical disks. But I think this effect
is negligible in the case of milter-greylist.
Of course, if you want to be safe, you must check the return
value from fsync(2) (as with all other I/O functions).
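
To make the sequence concrete, here is a minimal sketch of what
such a wrapper could look like. The name fclose_synced() is made
up for illustration; it is not taken from milter-greylist, and
the real Fclose() macro may need different error handling.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Hypothetical helper (name invented for this example):
 * flush, sync and close a stdio stream.
 * Returns 0 on success, -1 if any of the three steps failed.
 */
int fclose_synced(FILE *fp)
{
    int failed = 0;

    assert(fp != NULL);

    /* 1. Move data from the stdio buffer into the kernel. */
    if (fflush(fp) != 0)
        failed = 1;

    /* 2. Ask the kernel to write the file's dirty buffers out. */
    if (!failed && fsync(fileno(fp)) != 0)
        failed = 1;

    /* 3. Always close, so the descriptor is not leaked on error. */
    if (fclose(fp) != 0)
        failed = 1;

    return failed ? -1 : 0;
}
```

Without stdio the same idea reduces to fsync(fd) followed by
close(fd), with both return values checked.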
Note: If you create a temporary file and move it over
the previous file using rename(2), then you must also call
fsync(2) on the containing directory after the rename, in
order to make sure that the directory meta data has been
updated on the disk. Otherwise you might still have the
old file after a crash.
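
A rough sketch of that temporary-file approach follows. The
function name, its parameters and the file mode are assumptions
made for the example. It also assumes that fsync(2) on a
directory descriptor is supported, which holds on Linux and
FreeBSD but is not guaranteed everywhere.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: replace final_path with a file containing
 * `data`. `dir` must be the directory that holds both paths.
 * Returns 0 on success, -1 on failure.
 */
int replace_file_durably(const char *dir, const char *tmp_path,
                         const char *final_path, const char *data)
{
    size_t len, off = 0;
    int fd, dfd, rc;

    assert(dir && tmp_path && final_path && data);
    len = strlen(data);

    fd = open(tmp_path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;

    /* Write everything; write(2) may write less than requested. */
    while (off < len) {
        ssize_t n = write(fd, data + off, len - off);
        if (n <= 0) {
            close(fd);
            unlink(tmp_path);
            return -1;
        }
        off += (size_t)n;
    }

    /* Sync the file contents before the new name becomes visible. */
    if (fsync(fd) != 0) {
        close(fd);
        unlink(tmp_path);
        return -1;
    }
    if (close(fd) != 0) {
        unlink(tmp_path);
        return -1;
    }

    if (rename(tmp_path, final_path) != 0) {
        unlink(tmp_path);
        return -1;
    }

    /* Sync the directory so the rename itself reaches the disk. */
    dfd = open(dir, O_RDONLY | O_DIRECTORY);
    if (dfd < 0)
        return -1;
    rc = fsync(dfd);
    if (close(dfd) != 0)
        rc = -1;

    return rc == 0 ? 0 : -1;
}
```

A call might look like
replace_file_durably("/var/db", "/var/db/list.tmp",
"/var/db/list", buf), where the paths are only placeholders.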
Of course, the question is whether milter-greylist really
has to do all of that. It is not necessarily critical if it
comes up with a tuple list that is not completely up-to-date
after a crash. How often do your mail servers crash anyway?
Some people might have servers that are so busy that the
extra performance impact from a forced sync might be too
much.
Maybe it would be best to make it an option, so everybody
is happy. Note that most MTAs have such options, too, for
example sendmail calls it "SuperSafe" (by default it's on).
Best regards
Oliver
--
Oliver Fromme, secnetix GmbH & Co. KG, Marktplatz 29, 85567 Grafing b. M.
Handelsregister: Registergericht Muenchen, HRA 74606, Geschäftsfuehrung:
secnetix Verwaltungsgesellsch. mbH, Handelsregister: Registergericht Mün-
chen, HRB 125758, Geschäftsführer: Maik Bachmann, Olaf Erb, Ralf Gebhart
FreeBSD-Dienstleistungen, -Produkte und mehr: http://www.secnetix.de/bsd
In my experience the term "transparent proxy" is an oxymoron (like jumbo
shrimp). "Transparent" proxies seem to vary from the distortions of a
funhouse mirror to barely translucent. I really, really dislike them
when trying to figure out the corrective lenses needed with each of them.
-- R. Kevin Oberman, Network Engineer