• Squish __ftsc_date bug / JAM bug

    From andrew clarke@3:633/267 to Oli on Mon Feb 15 13:33:48 2021
    14 Feb 21 16:47, you wrote to me:

    But maybe the correct code should be:

    if (*msg->__ftsc_date)
    {
    ftsdate = msg->__ftsc_date;
    }
    else
    {
    scombo = (SCOMBO *)(&(msg->date_written));
    scombo = TmDate_to_DosDate(s_time, scombo);
    ftsdate = (unsigned char *)sc_time(scombo, (char
    *)(msg->__ftsc_date)); }

    Mhh, a JAM message base doesn't have an __ftsc_date field, I don't understand this piece of the code.

    Yeah, disregard that, I wrote it in a hurry.

    I've also never used JAM much so don't recall much about the file format, but see below.

    dosdate->msg_st.time.mm = tmdate->tm_min;
    dosdate->msg_st.time.ss = tmdate->tm_sec >> 1;
    ^^^^^^^^^^^^
    }

    return dosdate;
    }

    AFAIK the JAM format doesn't use DOS Time at all. It seems to be an unnecessary lossy conversion.

    Scott Dudley's MSGAPI uses lossy timestamps internally for XMSG->date_written and XMSG->date_arrived and that's folded into the SMAPI fork because nobody's ever thought to change it. See _stamp in huskylib/cvtdate.h.

    (The _stamp struct is in Huskylib now, but it was originally self-contained in SMAPI, which predates the Husky project.)

    Scott's idea of using his lossy timestamps to recreate XMSG->__ftsc_date is nonsense and as you know only has a 50/50 chance of working accurately.

    Now, the Squish format stupidly uses those lossy timestamps but there's no reason the whole API has to. Conceivably that could be fixed with a bit of work so as not to corrupt Squish bases.

    In the mean time JAM's msgh->Hdr.DateWritten is a 32-bit unsigned integer. (Fortunately being unsigned it will wrap around in 2106, not in 2038.)

    The JAM spec says:

    "An ulong representing the number of seconds since midnight, January 1, 1970."

    Presumably that's 1970-01-01 00:00 UTC, not local time.

    When writing messages, JamWriteMsg() calls ConvertXmsgToJamHdr() (which is not used by JamReadMsg), which does this:

    jamhdr->DateWritten = mktime(ptm) + gettz();

    One problem here is that ptm is assigned the return value of DosDate_to_TmDate(), which I suspect is lossy. I don't know why the code does that.

    That value is then fed into mktime() to return the number of seconds since 1970-01-01, then a timezone offset is added with gettz().

    But why is gettz() called every time JamWriteMsg() is called? Wouldn't that mean HPT adds the local timezone to the DateWritten field every time it tosses a message to a JAM base? That's a bug, isn't it? Unless I'm reading the code wrong.

    I mean, JamReadMsg() could convert msgh->Hdr.DateWritten to FTSC date format then write that to msg->__ftsc_date, but will the __ftsc_date of rescanned messages have the same timestamp they originated with, or will they be offset by local time?

    If the latter, it's not just a matter of reversing things by subtracting the offset of local time, because the local time zone can change due to daylight savings.

    Though I just noticed gettz() seems to unexpectedly disable DST when it calculates the time zone offset! So maybe you /could/ subtract the offset and the rescanned timestamps would be accurate... unless you moved into a different time zone.

    api_jam.c also hints at supporting the TZUTC kludge, but it does nothing to calculate dates with it. It just stores it as a JAM subfield because the spec says it can.

    The ftsdate variable in JamReadMsg() doesn't actually seem to be used. Abandoned code, I guess.

    --- GoldED+/BSD 1.1.5-b20180707
    * Origin: Blizzard of Ozz, Melbourne, Victoria, Australia (3:633/267)
  • From andrew clarke@3:633/267 to Michael Dukelsky on Mon Feb 15 15:10:56 2021
    14 Feb 21 22:50, you wrote to me:

    "ftsdate" variable is assigned but it is never used, so I've already commented it out but not in master branch yet.

    I see that now, thanks.

    Instead of commenting-out code I like to use #if 0, and explain why:

    #if 0
    /* this code is deliberately broken, never run it */

    p = 0;
    strcpy(p, "AAA");
    #endif

    --- GoldED+/BSD 1.1.5-b20180707
    * Origin: Blizzard of Ozz, Melbourne, Victoria, Australia (3:633/267)
  • From Oli@2:280/464.47 to andrew clarke on Mon Feb 15 13:31:23 2021
    andrew wrote (2021-02-15):

    Scott's idea of using his lossy timestamps to recreate XMSG->__ftsc_date is nonsense and as you know only has a 50/50 chance of working accurately.

    I didn't think he expected it to work accurately, more like it is not relevant to preserve the exact time and 2 second resolution is good enough. The dupe checker in Squish also uses a 2 second granularity.

    As a human I really don't care about the seconds, but for machine processing I see no benefits in modifying the time.

    Now, the Squish format stupidly uses those lossy timestamps but there's
    no reason the whole API has to. Conceivably that could be fixed with a
    bit of work so as not to corrupt Squish bases.

    Not sure if I understand. What kind of potential corruption do you have in mind?

    In the mean time JAM's msgh->Hdr.DateWritten is a 32-bit unsigned
    integer. (Fortunately being unsigned it will wrap around in 2106, not in 2038.)

    The JAM spec says:

    "An ulong representing the number of seconds since midnight, January 1, 1970."

    Presumably that's 1970-01-01 00:00 UTC, not local time.

    I don't think it's meant to be UTC. From the JAM spec:

    (1) All timestamps created locally (i.e. those not imported from
    other systems) are stored in local time.

    The DateWritten field is also used for imported messages. Why should the time ever be converted to UTC? Wouldn't that make everything more complicated? UNIX time in JAM is not a real Unix timestamp, more UNIX-style (UNIXish).

    That value is then fed into mktime() to return the number of seconds
    since 1970-01-01, then a timezone offset is added with gettz().

    which seems to be correct behavior.

    But why is gettz() called every time JamWriteMsg() is called? Wouldn't that mean HPT adds the local timezone to the DateWritten field every time it tosses a message to a JAM base? That's a bug, isn't it? Unless I'm reading the code wrong.
    [...]

    I haven't looked at the code, but wouldn't this already be discovered, if it were a problem?

    api_jam.c also hints at supporting the TZUTC kludge, but it does nothing to calculate dates with it. It just stores it as a JAM subfield because the spec says it can.

    It'd the JAM way. It stores a few important kludges in special fields (like MSGID, REPLY, TZUTC, Via, PATH, SEENBY).

    ---
    * Origin: . (2:280/464.47)
  • From andrew clarke@3:633/267 to Oli on Tue Feb 16 00:03:06 2021
    15 Feb 21 13:31, you wrote to me:

    Scott's idea of using his lossy timestamps to recreate
    XMSG->__ftsc_date is nonsense and as you know only has a 50/50
    chance of working accurately.

    I didn't think he expected it to work accurately, more like it is not relevant to preserve the exact time and 2 second resolution is good enough. The dupe checker in Squish also uses a 2 second granularity.

    But it's necessary to work accurately if you're doing a rescan. That's the whole point of this bug and is probably why Scott had to store the ftsc_date field.

    It's been so long since I used the original SquishMail (ie. squish.exe) that I don't recall if rescans were a supported feature.

    ("squish.exe rescan 3:633/267 fidosoft.husky" maybe?)

    Now, the Squish format stupidly uses those lossy timestamps but
    there's no reason the whole API has to. Conceivably that could be
    fixed with a bit of work so as not to corrupt Squish bases.

    Not sure if I understand. What kind of potential corruption do you have
    in mind?

    The API just needs to use 2 second resolution for I/O on Squish bases and 1 second resolution everywhere else.

    But currently I suspect if you change _stamp in the API to allow 1 second resolution it will break the Squish code.

    It will probably also break the ABI if someone is using a DLL version of SMAPI and tries to mix old & new versions, but that situation isn't unique to HPT/SMAPI.

    All of that can be ironed out though.

    In the mean time JAM's msgh->Hdr.DateWritten is a 32-bit unsigned
    integer. (Fortunately being unsigned it will wrap around in 2106,
    not in 2038.)

    The JAM spec says:

    "An ulong representing the number of seconds since midnight,
    January 1, 1970."

    Presumably that's 1970-01-01 00:00 UTC, not local time.

    I don't think it's meant to be UTC. From the JAM spec:

    (1) All timestamps created locally (i.e. those not imported from
    other systems) are stored in local time.

    The DateWritten field is also used for imported messages. Why should the time ever be converted to UTC? Wouldn't that make everything more complicated? UNIX time in JAM is not a real Unix timestamp, more UNIX-style (UNIXish).

    You could be right, but at the very least it's ambiguous since AFAIK the convention is for time_t on Linux/BSD be stored as UTC.

    But at this point it's really just a matter of throwing mud at the code to see what sticks...

    --- GoldED+/BSD 1.1.5-b20180707
    * Origin: Blizzard of Ozz, Melbourne, Victoria, Australia (3:633/267)
  • From Oli@2:280/464.47 to andrew clarke on Mon Feb 15 15:23:21 2021
    andrew wrote (2021-02-16):

    It's been so long since I used the original SquishMail (ie. squish.exe) that I don't recall if rescans were a supported feature.

    ("squish.exe rescan 3:633/267 fidosoft.husky" maybe?)

    SQUISH RESCAN <area_tag> <node>

    The JAM spec says:

    "An ulong representing the number of seconds since midnight,
    January 1, 1970."

    Presumably that's 1970-01-01 00:00 UTC, not local time.

    I don't think it's meant to be UTC. From the JAM spec:
    [...]
    You could be right, but at the very least it's ambiguous since AFAIK the convention is for time_t on Linux/BSD be stored as UTC.

    especially if it's called "UNIX date" in the spec. If it's not a real UNIX time stamp (which is always in UTC), what is it then? And what is the exact meaning of "the number of seconds since midnight, January 1, 1970" when UNIX time is "the number of seconds that have elapsed since the Unix epoch, minus leap seconds"? ;-P

    ---
    * Origin: . (2:280/464.47)
  • From Kai Richter@2:240/77 to andrew clarke on Mon Feb 15 21:39:22 2021
    Hello andrew!

    16 Feb 21, andrew clarke wrote to Oli:

    But it's necessary to work accurately if you're doing a rescan.

    It's necessary to do the same workaround again. ;-)

    That's the whole point of this bug

    The bug is that the normal scan and the rescan created different DateTime.

    I don't understand what you are doing but it looks like you do some re-invention. Isn't it easier to look how the scan.c does the export and use the same code for the rescan?

    Regards

    Kai

    --- GoldED+/LNX 1.1.4.7
    * Origin: Monobox (2:240/77)
  • From andrew clarke@3:633/267 to Kai Richter on Tue Feb 16 15:30:28 2021
    15 Feb 21 21:39, you wrote to me:

    The bug is that the normal scan and the rescan created different
    DateTime.

    I don't understand what you are doing but it looks like you do some re-invention. Isn't it easier to look how the scan.c does the export and use the same code for the rescan?

    To summarise:

    "toss" shouldn't need the message base, instead just memcpy()ing the ftscdate field between .PKTs. This is how PassThru areas work but should also be true for non-PassThru.

    "scan" only exports local messages which in the case of a Squish base, only the DateWritten field is supposed to be written to by the mail reader, though Oli recently said that GoldED apparently writes the ftscdate field as well (which then means you can have a one second discrepancy between fields for the same locally written message...).

    "rescan" exports both local and non-local messages, which can have a combination of DateWritten field ONLY or DateWritten-and-ftscdate, and HPT incorrectly uses the DateWritten field for non-local messages, which only has a two second resolution in Squish.

    I can't say any more about JAM other than what I've already written.

    The *.MSG format stores the date solely in ftscdate format, so HPT should really have no choice but to memcpy() the ftscdate field into the outbound PKT on a rescan.

    So obviously the moral of the story is for everyone to use *.MSG. ;)

    --- GoldED+/BSD 1.1.5-b20180707
    * Origin: Blizzard of Ozz, Melbourne, Victoria, Australia (3:633/267)
  • From Kai Richter@2:240/77 to andrew clarke on Tue Feb 16 16:29:02 2021
    Hello andrew!

    16 Feb 21, andrew clarke wrote to Kai Richter:

    Isn't it easier to look how the scan.c does the export and use the
    same code for the rescan?

    To summarise:

    "toss" shouldn't need the message base, instead just memcpy()ing the ftscdate field between .PKTs. This is how PassThru areas work but
    should also be true for non-PassThru.

    Thanks, i hadn't that in my mind. I thought htp toss would toss into the messagebase only. To be honest i never paid attention at which run the downlink packages were build. Oh wait, are you talking about toss.c the code but not the function hpt toss?

    Regards

    Kai

    --- GoldED+/LNX 1.1.4.7
    * Origin: Monobox (2:240/77)
  • From andrew clarke@3:633/267 to Kai Richter on Wed Feb 17 19:40:02 2021
    On Tue 2021-02-16 16:29:02, Kai Richter (2:240/77) wrote to andrew clarke:

    Isn't it easier to look how the scan.c does the export and use the
    same code for the rescan?

    To summarise:

    "toss" shouldn't need the message base, instead just memcpy()ing
    the ftscdate field between .PKTs. This is how PassThru areas work
    but should also be true for non-PassThru.

    Thanks, i hadn't that in my mind. I thought htp toss would toss into the messagebase only. To be honest i never paid attention at which run the downlink packages were build. Oh wait, are you talking about toss.c the code but not the function hpt toss?

    Both I suppose. hpt toss is implemented by toss.c so they're one and the same.

    It's possible to have a completely passthru HPT configuration without a messagebase, with the exception of NetmailArea/DupeArea/BadArea.
    Though if you wanted to go full Marie Kondo you could point those to /dev/null!

    --- GoldED+/BSD 1.1.5-b20180707
    * Origin: Blizzard of Ozz, Melbourne, Victoria, Australia (3:633/267)