What's the proper way to 'checkpoint' an open file so as to ensure
that the file's control structures are consistent on disk? I would
have thought that calling fflush() after every file write would be
sufficient, but a recent trap proved that calling fflush() after file
writes was no protection against CHKDSK truncating the file well
before the last write. I suppose I could close and reopen the file
after every update, but I was hoping to find a more elegant solution.
Any ideas?
Don't use the C runtime! Use DosOpen(....OPEN_FLAGS_WRITE_THROUGH....) and other Dos... APIs instead.
So, what you'd need to do is get an OS/2 handle to the file, and use DosResetBuffer() directly. If you can't get a handle, you can also
use that API to flush all open files for the current process.
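For what it's worth, here is a rough sketch of that approach. The
function name, sharing mode, and action flags below are my own
choices rather than anything from the original post, and error
handling is abbreviated:

#define INCL_DOSFILEMGR
#define INCL_DOSERRORS
#include <os2.h>

/* Append a record with write-through semantics, then force an
   explicit commit of the file's buffers and metadata. */
APIRET commit_record(PSZ pszPath, PVOID pData, ULONG cbData)
{
    HFILE  hf;
    ULONG  ulAction, ulPos, cbWritten;
    APIRET rc;

    rc = DosOpen(pszPath, &hf, &ulAction, 0L, FILE_NORMAL,
                 OPEN_ACTION_OPEN_IF_EXISTS | OPEN_ACTION_CREATE_IF_NEW,
                 OPEN_ACCESS_READWRITE | OPEN_SHARE_DENYWRITE |
                     OPEN_FLAGS_WRITE_THROUGH,   /* bypass lazy writing */
                 NULL);
    if (rc != NO_ERROR)
        return rc;

    rc = DosSetFilePtr(hf, 0L, FILE_END, &ulPos);   /* seek to end of file */
    if (rc == NO_ERROR)
        rc = DosWrite(hf, pData, cbData, &cbWritten);
    if (rc == NO_ERROR)
        rc = DosResetBuffer(hf);   /* flush this handle's buffers to disk */

    DosClose(hf);
    return rc;
}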
How low do you want to go?
Firstly, you should not be doing buffered I/O if your updates must be
committed immediately, so you should not use fopen() and fwrite()
without a setbuf() call to suppress buffer allocation. Better yet,
you should consider using open() and write() instead, and use the
UNIX-like unbuffered I/O routines.
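To illustrate the two alternatives (file name and record contents are
placeholders, and this is only a sketch; on *NIX compilers use
<unistd.h> instead of <io.h>):

#include <stdio.h>
#include <string.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <io.h>      /* open()/write()/close() on OS/2 compilers */

int main(void)
{
    const char rec[] = "sample record\n";
    FILE *fp;
    int fd;

    /* Alternative 1: keep stdio but suppress its buffer, so every
       fwrite() goes straight to the kernel. */
    fp = fopen("articles.db", "ab");
    if (fp != NULL) {
        setvbuf(fp, NULL, _IONBF, 0);   /* same effect as setbuf(fp, NULL) */
        fwrite(rec, 1, strlen(rec), fp);
        fclose(fp);
    }

    /* Alternative 2: UNIX-like unbuffered I/O with open()/write(). */
    fd = open("articles.db", O_WRONLY | O_APPEND | O_CREAT,
              S_IREAD | S_IWRITE);
    if (fd != -1) {
        write(fd, rec, strlen(rec));
        close(fd);
    }
    return 0;
}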
Moreover, if your data resources are critically important then you
should be handling any traps that occur in your program and cleaning
up the critical data resources in an orderly manner. This is far and
away the most professional approach to the situation. About the only
things you can't handle are kernel level traps and power outages.
In your situation, I would have used the second facility before considering any intermediate commits.
I'm building an open-source databasing offline Usenet news system, basically along the lines of standard Fidonet message tossers and
readers, except designed from the ground up for Usenet news. As I
intend the system to be portable, I'd like to keep the number of platform-specific API calls to an absolute minimum.
Firstly, you should not be doing buffered I/O if your updates must be
committed immediately, so you should not use fopen() and fwrite()
without a setbuf() call to suppress buffer allocation. Better yet,
you should consider using open() and write() instead, and use the
UNIX-like unbuffered I/O routines.
I'm concerned that disabling buffering entirely is going to hurt
performance very badly, as my application does lots of short (4 to
256 byte) I/O calls. Relying on the disk cache to handle this kind of
load seems a bit wasteful.
Moreover, if your data resources are critically important then you
should be handling any traps that occur in your program and cleaning
up the critical data resources in an orderly manner. This is far and
away the most professional approach to the situation. About the only
things you can't handle are kernel level traps and power outages.
The problem I ran into was that the kernel trapped (for reasons
unrelated to this project) a few hours after I wrote an article into
the article database. Since the database was still open (I leave the
article reader running 24x7), the file system structures were
inconsistent enough that CHKDSK truncated the database well before
its proper end point. As you say, catching exceptions wouldn't help
much here.
My database format and engine implementations are robust enough to
cope with applications dying unexpectedly without finishing write operations; they're not robust enough to handle boot-up CHKDSK
removing 80Kb of data from the end of a 100Kb file.
In your situation, I would have used the second facility before
considering any intermediate commits.
It's not intermediate commits I need: what I need is some way to flush
out write operations made to files which might be open for days or
weeks at a time.
The only reliable way I know of is _NOT_ to keep the files open but
to open and close them as needed. It is the _only_ way I know which
is guaranteed to update the directory information (Inode under *NIX)
so that a chkdsk won't cause you that sort of grief. In a similar
situation I ended up opening and closing the file during normal
operation to ensure the on-disk information and structures were
updated. Originally I opened the file on start-up and kept it open.
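A bare-bones sketch of that pattern (names here are placeholders):
every update opens the file, appends, and closes it again, so the
directory entry is brought up to date each time.

#include <stdio.h>

/* Open, append one record, and close again, so the file's directory
   entry (size and allocation) is updated after every write. */
int append_and_close(const char *path, const void *rec, size_t len)
{
    FILE *fp = fopen(path, "ab");
    size_t written;

    if (fp == NULL)
        return -1;
    written = fwrite(rec, 1, len, fp);
    if (fclose(fp) != 0 || written != len)
        return -1;
    return 0;
}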
Replying to a message of George White to Coridon Henshaw:
This is true when one is keeping things simple, such as using
sequential file structures.
A genuine database [and that is what Coridon claims he is coding]
does not restrict itself to simple file structures. The usual
approach is to allocate and pre-format a suitably large area of
disk, known as a tablespace in DB2, and then maintain
database-specific structural data within that. The pre-format
operation finishes by closing the physical file, thus ensuring the
underlying file system has recorded the number and size of all disk
extents allocated to the file. The DBMS is then free to
"suballocate" the disk space as and how it sees fit. It also takes
on the responsibility to ensure the consistency of the database's
content.
We will see how Coridon implements such a database system.
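A rough sketch of the pre-format step described above (the page size,
file size, and function name are my own assumptions):

#include <stdio.h>

#define PAGE_SIZE   4096      /* assumed page size */
#define PAGE_COUNT  2560      /* 10 MB of pre-allocated space */

/* Pre-format a fixed-size "tablespace" file and close it, so the file
   system records every extent before any records are stored in it.
   Later writes only change data inside already-allocated space, so a
   crash-time CHKDSK cannot shrink the file. */
int preformat_tablespace(const char *path)
{
    static unsigned char page[PAGE_SIZE];   /* zero-filled (static storage) */
    FILE *fp = fopen(path, "wb");
    long i;

    if (fp == NULL)
        return -1;
    for (i = 0; i < PAGE_COUNT; i++) {
        if (fwrite(page, 1, sizeof page, fp) != sizeof page) {
            fclose(fp);
            return -1;
        }
    }
    return fclose(fp) == 0 ? 0 : -1;   /* close commits the extent map */
}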
Since 4-to-256 bytes does not constitute a typical Usenet article, those would not be your logical syncpoints. You should be physically writing the data to disk at your syncpoints and only at your syncpoints.
My database format and engine implementations are robust enough to
cope with applications dying unexpectedly without finishing write
operations; they're not robust enough to handle boot-up CHKDSK
removing 80Kb of data from the end of a 100Kb file.
So you do have a syncpoint architecture, then?
This seems to me to be the type of activity you really want to
perform. One of your problems is that your input stream is not
persistent, as it would be a socket connected to an NNTP server [if I
read your design correctly, and assume you are coding from the ground
up]. This means that you need to be able to restart a failed instance
of the application, resuming from its most recent successful
syncpoint. The usual method to deal with this is to use a log or
journal file that keeps track of "in flight" transactions; the
journal is where your I/O remains unbuffered. If your NNTP server
allows you to re-fetch articles -- and most do -- you can keep your
journal in RAM or on a RAMDISK; this prevents performance hits for
doing short I/O's.
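A rough sketch of such a journal (the BEGIN/COMMIT record format, the
file names, and the function names are all invented for illustration):

#include <stdio.h>
#include <string.h>

/* Record an article as "in flight" before it is written to the
   (buffered) database, so a restart knows what may need replaying. */
int journal_begin(FILE *jf, long article_no, const char *text)
{
    return fprintf(jf, "BEGIN %ld %lu\n%s\n", article_no,
                   (unsigned long)strlen(text), text) < 0 ? -1 : 0;
}

/* Written only after the database update has completed, so a restart
   knows this article needs no replay. */
int journal_commit(FILE *jf, long article_no)
{
    return fprintf(jf, "COMMIT %ld\n", article_no) < 0 ? -1 : 0;
}

int main(void)
{
    FILE *jf = fopen("news.jrl", "a");   /* e.g. on a RAMDISK */

    if (jf == NULL)
        return 1;
    setvbuf(jf, NULL, _IONBF, 0);        /* journal I/O stays unbuffered */
    journal_begin(jf, 12345L, "Subject: test");
    /* ... apply the article to the buffered database here ... */
    journal_commit(jf, 12345L);
    fclose(jf);
    return 0;
}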