The problem is how do you know what data to add, update or delete?
Diff-files tell you which records to add, which to replace, which
to delete and which to leave alone.
The diffs don't say "add 2:204/255..."; they say "copy (ignore) 17
lines", "delete 3 lines", "add the following 4 lines" and so on, so
you still need to have the original nodelist to be able to resolve
the diff into something useful. :-(
You are expected to work on complete records that have been
arranged in a given order -- what are you planning to do to the data
so that you can't locate your records anymore?
Do you want to scatter your data all over the place without
creating an index file or what?
The problem is how do you know what data to add, update or delete?
Diff-files tell you which records to add, which to replace, which
to delete and which to leave alone.
Well, yes, but the "records" are lines (nodes, hosts, regions,
comments, errors etc.) in the text file nodelist, and the diff is
useless if you don't have the original file. You don't know whether
the record to delete is a node (or, if it is, its full node number).
The diffs don't say "add 2:204/255..."; they say "copy (ignore) 17
lines", "delete 3 lines", "add the following 4 lines" and so on, so
you still need to have the original nodelist to be able to resolve
the diff into something useful. :-(
You are expected to work on complete records that have been
arranged in a given order -- what are you planning to do to the data
so that you can't locate your records anymore?
I'm importing the data into a db table and using the Fidonet address
as the primary key (but it's also indexed on e.g. sysop name &
location for faster searches). I'm perfectly able to locate the
data, just not via the line number the node had in the original
nodelist (which would be a pita to work with since it's likely to
change every week).
I'm sure it would be possible to make the table an exact copy of
the nodelist file (with line numbers and everything) but it would
be a very complex thing to work with and most db designers would
probably start crying if they saw it. It's simpler to truncate the
table and reload the entire nodelist from the updated file.
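For anyone following along, a minimal sketch of that truncate-and-reload approach, assuming SQLite and the usual comma-separated nodelist layout (keyword, number, system name, location, sysop, ...). The table name `nodes`, the column names and the helper functions are made up for illustration, not taken from any real setup. SQLite has no TRUNCATE, so a plain DELETE inside the same transaction stands in for it.

```python
import sqlite3


def create_schema(conn):
    """Create the hypothetical 'nodes' table keyed on the Fidonet address."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS nodes ("
        "address TEXT PRIMARY KEY, sysop TEXT, location TEXT)"
    )
    # Secondary indexes for the sysop-name and location searches.
    conn.execute("CREATE INDEX IF NOT EXISTS idx_sysop ON nodes (sysop)")
    conn.execute("CREATE INDEX IF NOT EXISTS idx_location ON nodes (location)")


def parse_nodelist(lines):
    """Yield (address, sysop, location), tracking the current zone and net."""
    zone = net = None
    for line in lines:
        if not line.strip() or line.startswith(";"):
            continue  # blank lines and comments carry no node data
        fields = line.rstrip("\r\n").split(",")
        if len(fields) < 5:
            continue  # not a node entry (e.g. a stray EOF marker)
        keyword, number = fields[0], int(fields[1])
        if keyword == "Zone":
            zone = net = number
            node = 0
        elif keyword in ("Region", "Host"):
            net = number
            node = 0
        else:
            node = number  # Hub, Pvt, Hold, Down or an ordinary node
        address = f"{zone}:{net}/{node}"
        yield address, fields[4].replace("_", " "), fields[3].replace("_", " ")


def reload_nodelist(conn, lines):
    """Empty the table and reload it in one transaction; return the row count."""
    rows = list(parse_nodelist(lines))
    with conn:  # commits on success, rolls back on any exception
        conn.execute("DELETE FROM nodes")
        conn.executemany(
            "INSERT INTO nodes (address, sysop, location) VALUES (?, ?, ?)", rows
        )
    return len(rows)
```

Because the delete and the inserts share one transaction, readers on other connections should see either the old list or the new one, never an empty table.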
Do you want to scatter your data all over the place without
creating an index file or what?
Of course not.
If you know of a magic way to convert the nodediffs into SQL
insert/update/delete commands please tell me how, but if you would
try this yourself I'm sure you'd see the problem.
You get a new diff each week to update last week's nodelist. As
long as you have kept your records in order (and why shouldn't you?),
you move them to an auxiliary base record by record, inserting the new
nodes as per the diff telling you Axx, copying where it says Cxx and
skipping where it says Dxx. Then kill the old file and rename the new
one to the old one's name.
Have you ever done serious work on nodelists?
The nodelist entries have no line numbers either. Their line
number is their offset in lines from the start of the list, prolog included.
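The record-by-record pass described above can be sketched roughly like this. It assumes the commonly described nodediff layout: the first diff line repeats the first line of the old nodelist as a sanity check, and the rest is `An` (the next n diff lines are new), `Cn` (copy n old lines) and `Dn` (skip n old lines). The function name is made up, and CRC checking is left out.

```python
def apply_nodediff(old_lines, diff_lines):
    """Return the new nodelist lines produced by applying a diff to the old list."""
    if not old_lines or not diff_lines:
        raise ValueError("need both an old nodelist and a diff")
    # Assumed convention: the diff starts with a copy of the old list's first line.
    if diff_lines[0] != old_lines[0]:
        raise ValueError("diff does not belong to this nodelist")

    new_lines = []
    old_pos = 0  # offset in lines from the start of the old list
    i = 1        # position in the diff, just past the check line
    while i < len(diff_lines):
        command = diff_lines[i].strip()
        i += 1
        if not command:
            continue
        op, count = command[0].upper(), int(command[1:])
        if op == "C":    # copy lines unchanged from the old list
            new_lines.extend(old_lines[old_pos:old_pos + count])
            old_pos += count
        elif op == "D":  # skip (delete) lines of the old list
            old_pos += count
        elif op == "A":  # the next 'count' diff lines are new data
            new_lines.extend(diff_lines[i:i + count])
            i += count
        else:
            raise ValueError(f"unknown diff command: {command!r}")
    return new_lines
```

Note that nothing in the diff names a node: every command is a line count relative to the old file, which is why the old list has to be kept in its original order.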
You get a new diff each week to update last week's nodelist. As
long as you have kept your records in order (and why shouldn't you?),
you move them to an auxiliary base record by record, inserting the new
nodes as per the diff telling you Axx, copying where it says Cxx and
skipping where it says Dxx. Then kill the old file and rename the new
one to the old one's name.
Easy, ain't it? And that was invented only eighteen years
ago. It's just come of age...
If you know of a magic way to convert the nodediffs into SQL
insert/update/delete commands please tell me how, but if you would
try this yourself I'm sure you'd see the problem.
I'm not going to look backwards, Jesper. All I can add here is to
advise you to do some serious low-level programming. It's refreshing.
Do you know there is even elegance in writing assembler?
You get a new diff each week to update last week's nodelist. As
long as you have kept your records in order (and why shouldn't you?),
you move them to an auxiliary base record by record, inserting the new
nodes as per the diff telling you Axx, copying where it says Cxx and
skipping where it says Dxx. Then kill the old file and rename the new
one to the old one's name.
That's extra complication that wouldn't be necessary if updates
specified the node. Line-based updates also mean taking the table
offline (totally unacceptable) to do a complete copy/rename, using
transactions (may not be available), or a post-diff comparison of
the old and new tables to determine what's changed and update the
live table accordingly (*shudder*).
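As a rough illustration of that last option, here is a sketch of a post-diff comparison keyed on the Fidonet address. It assumes both lists have already been parsed into dicts mapping address to a (sysop, location) tuple, and it reuses a hypothetical `nodes` table with `address`, `sysop` and `location` columns. All of the names are invented for the example.

```python
import sqlite3


def plan_changes(old, new):
    """Compare two address-keyed dicts; return (inserts, updates, deletes)."""
    inserts = sorted(new.keys() - old.keys())
    deletes = sorted(old.keys() - new.keys())
    updates = sorted(a for a in old.keys() & new.keys() if old[a] != new[a])
    return inserts, updates, deletes


def apply_changes(conn: sqlite3.Connection, old, new):
    """Bring the live table from 'old' to 'new', touching only changed rows."""
    inserts, updates, deletes = plan_changes(old, new)
    with conn:  # one transaction: all changes land together or not at all
        conn.executemany(
            "DELETE FROM nodes WHERE address = ?", [(a,) for a in deletes]
        )
        conn.executemany(
            "UPDATE nodes SET sysop = ?, location = ? WHERE address = ?",
            [(new[a][0], new[a][1], a) for a in updates],
        )
        conn.executemany(
            "INSERT INTO nodes (address, sysop, location) VALUES (?, ?, ?)",
            [(a, new[a][0], new[a][1]) for a in inserts],
        )
    return len(inserts), len(updates), len(deletes)
```

This still needs the complete old and new lists in memory. The diff itself never supplies the addresses, so the comparison has to work them out.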
I've never said you should do that online. You can prepare all
your files in a separate operation and then swap them. It should take
no more than a few seconds on the kind of system you apparently use...
Do I pass?
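For the plain-file case, the prepare-then-swap idea might look like the sketch below. The function name, the temporary-file naming and the latin-1 encoding are assumptions for illustration. `os.replace` does the rename in a single step, and on POSIX systems that rename is atomic when both files sit on the same filesystem.

```python
import os
import tempfile


def swap_in_new_nodelist(path, new_lines):
    """Write the new list to a temp file beside 'path', then rename it into place."""
    directory = os.path.dirname(os.path.abspath(path))
    # Same directory as the target, so the final rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".nodelist-", suffix=".tmp")
    try:
        # Assumption: single-byte encoding and CR/LF line endings.
        with os.fdopen(fd, "w", encoding="latin-1", newline="") as tmp:
            tmp.writelines(line + "\r\n" for line in new_lines)
            tmp.flush()
            os.fsync(tmp.fileno())  # make sure the data is on disk before the swap
        os.replace(tmp_path, path)  # old file is replaced in one step
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)  # do not leave a half-written temp file behind
        raise
```

Readers that already have the old file open keep reading the old contents, and anything that opens the path afterwards gets the new list.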
It would be interesting to see you actually implementing something
like that with SQL. Do that before you tell me how easy it is.
Easy, ain't it? And that was invented only eighteen years
ago. It's just come of age...
I'm not saying that it's difficult if you're only working with
files, but I'm not.
I've never said you should do that online. You can prepare all
your files in a separate operation and then swap them. It should take
no more than a few seconds on the kind of system you apparently use...
Downtime of any kind, even a few seconds, is totally unacceptable.
That's just a complete cop-out and poor design.