Forum: Vertrauen

Nodediffs & db:s

From Jesper Sörensen@2:204/255 to Jan Vermeulen on Mon Jan 6 22:34:13 2003

The problem is how do you know what data to add, update or delete?

Diff-files tell you which records to add, which to replace, which
to delete and which to leave alone.

Well, yes, but the "records" are lines (nodes, hosts, regions, comments, errors
etc.) in the text file nodelist, and the diff is useless if you don't have the original file. You don't know if the record to delete is a node (and its full node number if so).
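To illustrate why a bare line is not enough: a node's full address depends on the Zone and Host/Region lines that precede it in the list. The following is a minimal Python sketch, assuming the usual comma-separated nodelist layout (keyword in the first field, number in the second). The function name is made up for illustration.

```python
def nodelist_addresses(lines):
    """Yield (address, line) for every non-comment nodelist line.

    Assumption: field 1 is a keyword (Zone, Region, Host, Hub, Pvt,
    Hold, Down or empty) and field 2 is the zone/net/node number.
    """
    zone = net = 0
    for line in lines:
        line = line.rstrip("\r\n")
        if not line or line.startswith(";") or line.startswith("\x1a"):
            continue  # skip blank lines, comments and the EOF marker
        keyword, number = line.split(",", 2)[:2]
        number = int(number)
        if keyword == "Zone":
            zone = net = number   # a zone entry is addressed as z:z/0
            node = 0
        elif keyword in ("Region", "Host"):
            net = number          # starts a new net within the zone
            node = 0
        else:
            node = number         # ordinary node, hub, Pvt, Hold, Down
        yield f"{zone}:{net}/{node}", line
```

A line such as ",255,..." only becomes 2:204/255 because a "Zone,2,..." and a "Host,204,..." line came before it, which is exactly the context a diff by itself does not carry.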

The diffs don't say "add 2:204/255..."; they say "copy (ignore) 17
lines", "delete 3 lines", "add the following 4 lines" and so on, so
you still need to have the original nodelist to be able to resolve
the diff into something useful. :-(

You are expected to work on complete records that have been
arranged in a given order -- what are you planning to do to the data
that you can't locate your records anymore?

I'm importing the data into a db table and use the Fidonet address as the primary key (but it's also indexed on e.g. sysop name & location for faster searches). I'm perfectly able to locate the data, just not via the line number the node had in the original nodelist (which would be a pita to work with since
it's likely to change every week).

I'm sure it would be possible to make the table an exact copy of the nodelist file (with line numbers and everything) but it would be a very complex thing to
work with and most db designers would probably start crying if they saw it. It's simpler to truncate the table and reload the entire nodelist from the updated file.
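A rough sketch of that approach in Python with the standard sqlite3 module. The table layout, column names and function name are assumptions for illustration, not the schema actually in use; the only things taken from the description above are the address as primary key and the extra indexes on sysop name and location.

```python
import sqlite3

# Hypothetical schema: address is the primary key, with extra indexes
# on sysop and location for faster searches.
SCHEMA = """
CREATE TABLE IF NOT EXISTS nodelist (
    address  TEXT PRIMARY KEY,
    sysop    TEXT,
    location TEXT,
    raw_line TEXT
);
CREATE INDEX IF NOT EXISTS idx_nodelist_sysop ON nodelist (sysop);
CREATE INDEX IF NOT EXISTS idx_nodelist_location ON nodelist (location);
"""


def reload_nodelist(conn, rows):
    """Empty the table and load rows of (address, sysop, location, raw_line)."""
    with conn:  # one transaction: commit on success, roll back on error
        conn.execute("DELETE FROM nodelist")
        conn.executemany(
            "INSERT INTO nodelist (address, sysop, location, raw_line) "
            "VALUES (?, ?, ?, ?)",
            rows,
        )
```

Usage would be something like `conn = sqlite3.connect("nodelist.db")`, `conn.executescript(SCHEMA)`, then `reload_nodelist(conn, rows)` once a week with the rows parsed from the updated nodelist.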

Do you want to scatter your data all over the place without
creating an index file or what?

Of course not.

If you know of a magic way to convert the nodediffs into SQL insert/update/delete commands please tell me how, but if you would try this yourself I'm sure you'd see the problem.

Jesper,
yeppe@enjoy.cc
---
* Origin: Singularity/2 - Swedish Internet Backbone (2:204/255)

From Jan Vermeulen@2:280/100 to Jesper Sörensen on Tue Jan 7 01:37:34 2003

Quoting Jesper Sörensen on Mon 6 Jan 2003 22:34 to Jan Vermeulen:

The problem is how do you know what data to add, update or delete?

Diff-files tell you which records to add, which to replace, which
to delete and which to leave alone.

Well, yes, but the "records" are lines (nodes, hosts, regions,
comments, errors etc.) in the text file nodelist, and the diff is
useless if you don't have the original file. You don't know if the
record to delete is a node (and its full node number if so).

Have you ever done serious work on nodelists?

If your diff says D5 you just delete those 5 lines. It is not necessary to go in savvy mode and salvage a record.

And yes, of course, you will need to place each node between brackets, as in
<record> this is a text line </record>.

The diffs don't say "add 2:204/255..."; they say "copy (ignore) 17
lines", "delete 3 lines", "add the following 4 lines" and so on, so
you still need to have the original nodelist to be able to resolve
the diff into something useful. :-(

You are expected to work on complete records that have been
arranged in a given order -- what are you planning to do to the data
that you can't locate your records anymore?

I'm importing the data into a db table and use the Fidonet address
as the primary key (but it's also indexed on e.g. sysop name &
location for faster searches). I'm perfectly able to locate the
data, just not via the line number the node had in the original
nodelist (which would be a pita to work with since it's likely to
change every week).

The nodelist entries have no line numbers either. Their line number is their offset in lines from the start of the list, prolog included.

You get a new diff each week to update last week's nodelist. As long as you
have kept your records in order (and why shouldn't you?), you move them to an auxiliary base record by record, inserting the new nodes as per the diff telling you Axx, copying where it says Cxx and skipping where it says Dxx. Then
kill the old file and rename the new one to the old one's name.
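The file-based procedure described here can be sketched in a few lines of Python. This is only an illustration under assumptions: the function name is invented, and the diff is assumed to contain nothing but the Axx/Cxx/Dxx commands and their data lines (a real nodediff begins with a copy of the old list's first line, which the caller is assumed to have checked and stripped).

```python
def apply_nodediff(old_lines, diff_lines):
    """Build the new nodelist from last week's lines and a diff.

    Axx: take the next xx lines from the diff.
    Cxx: copy the next xx lines from the old list.
    Dxx: skip the next xx lines of the old list.
    """
    new_lines = []
    pos = 0  # current line offset into the old list
    i = 0    # current line offset into the diff
    while i < len(diff_lines):
        command = diff_lines[i]
        i += 1
        op, count = command[0], int(command[1:])
        if op == "A":
            new_lines.extend(diff_lines[i:i + count])
            i += count
        elif op == "C":
            new_lines.extend(old_lines[pos:pos + count])
            pos += count
        elif op == "D":
            pos += count
        else:
            raise ValueError(f"unknown diff command: {command!r}")
    return new_lines
```

Note that the result is again a list of lines in nodelist order; nothing in the commands themselves names a node address.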

Easy, ain't it? And that was invented only eighteen years ago. It's just come of age...

I'm sure it would be possible to make the table an exact copy of
the nodelist file (with line numbers and everything) but it would
be a very complex thing to work with and most db designers would
probably start crying if they saw it. It's simpler to truncate the
table and reload the entire nodelist from the updated file.

You could, but you don't have to.

Do you want to scatter your data all over the place without
creating an index file or what?

Of course not.

I thought so ;-)

If you know of a magic way to convert the nodediffs into SQL insert/update/delete commands please tell me how, but if you would
try this yourself I'm sure you'd see the problem.

I'm not going to look backwards, Jesper. All I can add here is to advise you to do some serious low-level programming. It's refreshing.

Do you know there is even elegance in writing assembler?

-=<[ JV ]>=-

* Origin: The Poor Man's Workstation -- Wormerveer NL (2:280/100)

From Scott Little@3:712/848 to Jan Vermeulen on Tue Jan 7 19:53:50 2003

[ 07 Jan 03 01:37, Jan Vermeulen wrote to Jesper Sörensen ]

You get a new diff each week to update last week's nodelist. As
long as you have kept your records in order (and why shouldn't you?),
you move them to an auxiliary base record by record, inserting the new nodes as per the diff telling you Axx, copying where it says Cxx and skipping where it says Dxx. Then kill the old file and rename the new
one to the old one's name.

That's extra complication that wouldn't be necessary if updates specified the node. Line-based updates also mean taking the table offline (totally unacceptable) to do a complete copy/rename, using transactions (may not be available), or a post-diff comparison of the old and new tables to determine what's changed and update the live table accordingly (*shudder*).
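For illustration, the post-diff comparison could look roughly like this in Python, assuming both the old and the new list have already been resolved into mappings from Fidonet address to record. The function name and the shape of the records are assumptions.

```python
def keyed_changes(old, new):
    """Compare two {address: record} mappings.

    Returns (inserts, updates, deletes):
      inserts - {address: record} present only in new
      updates - {address: record} present in both but changed
      deletes - sorted list of addresses present only in old
    """
    inserts = {a: r for a, r in new.items() if a not in old}
    updates = {a: r for a, r in new.items() if a in old and old[a] != r}
    deletes = sorted(a for a in old if a not in new)
    return inserts, updates, deletes
```

Each of the three results maps directly onto an SQL INSERT, UPDATE or DELETE keyed on the address, but only after the line-based diff has been applied to a full copy of the old list.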

-- Scott Little [fidonet#3:712/848 / sysgod@sysgod.org]

--- FMail/Win32 1.60+
* Origin: Cyberia: All your msgbase are belong to us! (3:712/848)

From Jesper Sörensen@2:204/255 to Jan Vermeulen on Tue Jan 7 12:44:34 2003

Have you ever done serious work on nodelists?

That depends on what you mean by serious work. I've written several nodelist processing tools, including a flag checker, a CC utility (to "mail bomb" all downlinks of a *C) and some half working MakeNL clones in both C++ and Java (maybe I have some Pascal code left too?). The most recent work I did was to write some simple scripts in Awk and Perl to convert the nodelist into SQL and XML so I think I know what I need to know about nodelists. Do I pass?

The nodelist entries have no line numbers either. Their line
number is their offset in lines from the start of the list, prolog included.

That's what I meant.

You get a new diff each week to update last week's nodelist. As
long as you have kept your records in order (and why shouldn't you?),
you move them to an auxiliary base record by record, inserting the new nodes as per the diff telling you Axx, copying where it says Cxx and skipping where it says Dxx. Then kill the old file and rename the new
one to the old one's name.

It would be interesting to see you actually implementing something like that with SQL. Do that before you tell me how easy it is.

Easy, ain't it? And that was invented only eighteen years
ago. It's just come of age...

I'm not saying that it's difficult if you're only working with files, but I'm not.

If you know of a magic way to convert the nodediffs into SQL
insert/update/delete commands please tell me how, but if you would
try this yourself I'm sure you'd see the problem.

I'm not going to look backwards, Jesper. All I can add here is to advise you to do some serious low-level programming. It's refreshing.

Low level processing of diffs is super simple but that's not what I want to do.
I want to update the nodelist in my db which I'm using from my Fidonet client right now.

Do you know there is even elegance in writing assembler?

Almost all kinds of programming have their elegance (well, maybe not VB ;-). I've written some assembler (for Motorola 68k and Intel 8088 processors) but that was 10-15 years ago. It has its charm but it's not very suitable for anything "bigger" if you ask me.

The fact that I nowadays mainly use Java doesn't mean I don't know anything about lower level languages. I use Java because I like it and because it's very
suitable for the kind of software I'm currently developing, not because it's the only language I know.

Jesper,
yeppe@enjoy.cc
---
* Origin: Singularity/2 - Swedish Internet Backbone (2:204/255)

From Jan Vermeulen@2:280/100 to Scott Little on Tue Jan 7 12:48:40 2003

Quoting Scott Little on Tue 7 Jan 2003 19:53 to Jan Vermeulen:

You get a new diff each week to update last week's nodelist. As
long as you have kept your records in order (and why shouldn't you?),
you move them to an auxiliary base record by record, inserting the new
nodes as per the diff telling you Axx, copying where it says Cxx and
skipping where it says Dxx. Then kill the old file and rename the new
one to the old one's name.

That's extra complication that wouldn't be necessary if updates
specified the node. Line-based updates also mean taking the table
offline (totally unacceptable) to do a complete copy/rename, using transactions (may not be available), or a post-diff comparison of
the old and new tables to determine what's changed and update the
live table accordingly (*shudder*).

I've never said you should do that online. You can prepare all your files in a separate operation and then swap them. That should take no more than a few seconds on the kind of system you apparently use...
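For plain files, the prepare-then-swap step might be sketched like this in Python. The function name, the UTF-8 encoding and the newline handling are assumptions for illustration; `os.replace` renames over the old file in a single step, which is atomic on POSIX systems.

```python
import os
import tempfile


def replace_file_atomically(path, lines):
    """Write lines to a temporary file, then swap it into place."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file lives in the same directory so the rename stays
    # on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as tmp:
            tmp.writelines(line + "\n" for line in lines)
        os.replace(tmp_path, path)  # readers see the old or the new file
    except BaseException:
        os.unlink(tmp_path)  # clean up the half-written temp file
        raise
```

This covers the file case only; whether a database table can be swapped the same way depends on the database in use.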

-=<[ JV ]>=-

* Origin: The Poor Man's Workstation -- Wormerveer NL (2:280/100)

From Scott Little@3:712/848 to Jan Vermeulen on Wed Jan 8 04:31:38 2003

[ 07 Jan 03 12:48, Jan Vermeulen wrote to Scott Little ]

I've never said you should do that online. You can prepare all
your files in a separate operation and then swap them. That should take no more than a few seconds on the kind of system you apparently use...

Downtime of any kind, even a few seconds, is totally unacceptable. That's just
a complete cop-out and poor design.

-- Scott Little [fidonet#3:712/848 / sysgod@sysgod.org]

--- FMail/Win32 1.60+
* Origin: Cyberia: All your msgbase are belong to us! (3:712/848)

From Jan Vermeulen@2:280/100 to Jesper Sörensen on Tue Jan 7 21:47:16 2003

Quoting Jesper Sörensen on Tue 7 Jan 2003 12:44 to Jan Vermeulen:

Do I pass?

You're not entirely hopeless ;-)

It would be interesting to see you actually implementing something
like that with SQL. Do that before you tell me how easy it is.

You are the one wanting to unsimplify the problem, not me.

Easy, ain't it? And that was invented only eighteen years
ago. It's just come of age...

I'm not saying that it's difficult if you're only working with
files, but I'm not.

I am, and so far I'm satisfied with what I get.

I'm putting an end to this thread now, Jesper. What you and the others want
to do is far too early. First of all we'll need to eliminate a small number of bugs in the current nodelist operation, and I intend to work on that from now on.

It's been fun so far and I have no doubt that we'll meet again later.

-=<[ JV ]>=-

* Origin: The Poor Man's Workstation -- Wormerveer NL (2:280/100)

From Jan Vermeulen@2:280/100 to Scott Little on Tue Jan 7 21:55:00 2003

Quoting Scott Little on Wed 8 Jan 2003 4:31 to Jan Vermeulen:

I've never said you should do that online. You can prepare all
your files in a separate operation and then swap them. That should take no
more than a few seconds on the kind of system you apparently use...

Downtime of any kind, even a few seconds, is totally unacceptable.
That's just a complete cop-out and poor design.

I'm going back to the immediate problems. I've already spent too much time in this thread.

TTUL.

-=<[ JV ]>=-

* Origin: The Poor Man's Workstation -- Wormerveer NL (2:280/100)
