Issues with MQM3 demo solved.

Post by mcleod_ideafix » 02 Aug 2020, 21:58

This is a follow-up to a Facebook post about the MQM 3 demo (see here ... 9136822183 ). I'm glad to announce that the mystery has been solved :)

The symptoms were as follows: the second part of MQM 3 - Total Brainstorm, by the MQM-Team, failed to "decrunch" and caused a reset on my machine (ZX-UNO).

I first tried to contact any of the MQM-Team members, so I posted a request. Meanwhile, I thought it would be handy to have some clues ready for whoever might show up as a member of the so-called MQM-Team. So I started debugging the code.

First I tried to narrow down the moment at which the bug manifested. Luckily, I can save and load snapshots on the ZX-UNO thanks to DivMMC and ESXDOS. I also used Spectaculator and SpecEmu. Using the integrated debuggers of these emulators, I made some snapshots at different points of execution. All of them worked on the ZX-UNO.

Then I moved earlier in the code each time until I found a point at which the snapshot failed to execute on the ZX-UNO. A snapshot taken some instructions later worked fine. That led me to the following code fragment:

Code: Select all

:61df ed 7e           im 2           ;start of 1st snapshot: fails under ZXUNO
:61e1 21 ff ff        ld hl,0xffff
:61e4 36 c9           ld (hl),0xc9
:61e6 f9              ld sp,hl
:61e7 3e 3b           ld a,0x3b
:61e9 ed 47           ld i,a
:61eb 3e 38           ld a,0x38
:61ed 06 08           ld b,0x08
:61ef d9              exx
:61f0 21 00 58        ld hl,0x5800
:61f3 11 01 58        ld de,0x5801
:61f6 01 ff 02        ld bc,0x02ff
:61f9 77              ld (hl),a
:61fa ed b0           ldir
:61fc d6 08           sub 0x08
:61fe d9              exx
:61ff fb              ei
:6200 76              halt
:6201 fb              ei
:6202 76              halt
:6203 fb              ei
:6204 76              halt
:6205 10 e8           djnz 0x61ef
:6207 cd 52 00        call 0x0052  ;start of the second snapshot: success under ZXUNO
:620a 3b              dec sp
:620b 3b              dec sp
:620c e1              pop hl
So something is wrong in this tiny fragment of code that causes, much later (tens of millions of T-states), the second part of the demo to fail decrunching and executing.

The code starts with interrupts disabled. Interrupt mode 2 is selected, and a quick interrupt handler is installed by setting I to $3B, so the address of the interrupt handler is read from address $3BFF. This address belongs to the ROM, in an area filled with $FF bytes, so the final execution address for the interrupt handler is $FFFF. The code writes $C9 to this location. $C9 is the opcode for RET, so the interrupt handler is merely a RET instruction. When an interrupt is accepted, the CPU disables further interrupts, so after RETurning, interrupts are still disabled. This is why there is an EI instruction just before each HALT: to make sure that HALT is executed with interrupts enabled.
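
The address arithmetic behind that handler setup can be sketched in a few lines of Python. This is only an illustration, not code from the demo: the function name and the tiny memory model are made up, and it assumes the data bus reads $FF during interrupt acknowledge, which is what makes the vector pointer land on $3BFF.

```python
# Sketch of how the Z80 forms the IM 2 handler address.
# Assumptions (not from the demo itself): the data bus reads 0xFF during
# interrupt acknowledge, and the ROM bytes at 0x3BFF and 0x3C00 are both 0xFF.

def im2_handler_address(i_reg, bus_byte, read_byte):
    """Return the address the CPU jumps to when an IM 2 interrupt is accepted."""
    # The pointer to the vector is I (high byte) and the bus value (low byte).
    pointer = ((i_reg & 0xFF) << 8) | (bus_byte & 0xFF)
    # The handler address is the little-endian word stored at that pointer.
    low = read_byte(pointer)
    high = read_byte((pointer + 1) & 0xFFFF)
    return (high << 8) | low

# Hypothetical memory model holding only the two ROM bytes that matter here.
rom = {0x3BFF: 0xFF, 0x3C00: 0xFF}

handler = im2_handler_address(0x3B, 0xFF, lambda addr: rom[addr])
print(hex(handler))  # 0xffff
```

With I = $3B the handler ends up at $FFFF, which is why a single $C9 written there is enough to service every interrupt.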
The code from 61EB to 6205 is just an attribute fadeout routine that cycles from white through yellow, cyan, etc., to black, changing the paper value of all attribute cells (with LDIR). The sequence of three HALTs synchronizes the visual effect with the vertical blanking period, so there's no flickering. Note that we exit the fadeout loop with interrupts disabled.
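
To see which attribute values the fadeout writes, here is a small Python sketch that mirrors the loop (A starts at $38, SUB $08 on each pass, B = 8 passes). The function and the colour table are illustrative names, not part of the demo.

```python
# Sketch of the attribute values produced by the fadeout loop at 61EB-6205.
# Mirrors: ld a,0x38 / ld b,0x08 / (fill attributes with A) / sub 0x08 / djnz.

PAPER_NAMES = ["black", "blue", "red", "magenta", "green", "cyan", "yellow", "white"]

def fade_steps(start=0x38, step=0x08, count=8):
    """Return the attribute byte written on each pass of the loop."""
    values = []
    a = start
    for _ in range(count):
        values.append(a)
        a = (a - step) & 0xFF  # SUB 0x08 on an 8-bit register
    return values

for attr in fade_steps():
    paper = (attr >> 3) & 0x07  # bits 5-3 of an attribute byte hold the paper colour
    print(f"{attr:#04x} paper={PAPER_NAMES[paper]}")
```

Each value stays on screen for three frames because of the three EI HALT pairs, so the whole fade takes 24 frames.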

A set of changes was applied to this snapshot to see what happened with each one (each applied to the original snapshot, not cumulatively):
1. Replaced LDIR with NOPs. Snapshot fails.
2. Patched the snapshot so that just after B is loaded with 8 (address 61ED), an immediate jump is performed to address 6207. Snapshot works.
3. Replaced the three EI HALT sequences with NOPs. Snapshot works.
4. Replaced two of the three EI HALT sequences with NOPs. Snapshot fails.
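
Patches like these can be applied directly to a snapshot file. The following Python sketch assumes a 48K .SNA file (27-byte header followed by a RAM dump starting at $4000); the helper name and the in-memory demo snapshot are hypothetical, and this is not necessarily how the patches above were actually made.

```python
# Sketch: patch bytes inside a 48K .SNA snapshot image.
# Assumption: 48K SNA layout, i.e. a 27-byte header followed by RAM 0x4000-0xFFFF.

SNA_HEADER_SIZE = 27
RAM_START = 0x4000

def patch_sna(snapshot, address, new_bytes, expected=None):
    """Overwrite bytes at a Z80 address inside a mutable SNA image (bytearray)."""
    offset = SNA_HEADER_SIZE + (address - RAM_START)
    if address < RAM_START or offset + len(new_bytes) > len(snapshot):
        raise ValueError("address is outside the RAM dump")
    if expected is not None:
        found = bytes(snapshot[offset:offset + len(expected)])
        if found != bytes(expected):
            raise ValueError(f"unexpected bytes at {address:#06x}: {found.hex()}")
    snapshot[offset:offset + len(new_bytes)] = new_bytes

# Demo on an empty, made-up snapshot with only the LDIR opcode planted in it.
snapshot = bytearray(SNA_HEADER_SIZE + 0xC000)
patch_sna(snapshot, 0x61FA, b"\xED\xB0")                       # plant LDIR
patch_sna(snapshot, 0x61FA, b"\x00\x00", expected=b"\xED\xB0")  # change 1: LDIR -> NOP NOP
```

Passing the expected original bytes guards against patching the wrong address, which is easy to do when each experiment has to start again from the unmodified snapshot.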

Suspecting that the bug might be related to the length of the INT pulse (being too long and therefore retriggering the interrupt handler when it's not supposed to), I changed the $C9 value written to $FFFF to $18, the opcode for JR. The displacement for this instruction is taken from address $0000 in ROM, which is $F3, so this effectively jumps backwards into the code a dozen bytes or so. I put some NOPs and a RET there. Now the interrupt handler spends a few more T-states and won't be retriggered. Guess what? It didn't work.
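
The landing address of that JR can be worked out with a short Python sketch. The function name is made up; the only inputs are the ones given above ($18 at $FFFF, displacement byte $F3 read from $0000).

```python
# Sketch: where does a JR placed at 0xFFFF land when its displacement byte is 0xF3?

def jr_target(opcode_addr, displacement_byte):
    """Return the destination of a 2-byte relative jump (JR or DJNZ) on the Z80."""
    # The displacement is a signed 8-bit value.
    disp = displacement_byte - 0x100 if displacement_byte & 0x80 else displacement_byte
    # It is added to the address of the instruction that follows the jump.
    next_pc = (opcode_addr + 2) & 0xFFFF
    return (next_pc + disp) & 0xFFFF

print(hex(jr_target(0xFFFF, 0xF3)))  # 0xfff4
print(hex(jr_target(0x6205, 0xE8)))  # 0x61ef, the DJNZ in the listing
```

The opcode sits at $FFFF and its operand wraps around to $0000, so the next instruction would be at $0001; adding -13 gives $FFF4, about a dozen bytes below the top of RAM.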

The EI HALT issue was baffling. It failed, but I still had no idea why. I wanted to know what changed in memory on the ZXUNO versus the emulator just after the first part of this code has been executed (that is, when we are about to execute the instruction at address $6207). To get that, I needed to make a snapshot at the right moment. This is not a problem in an emulator, where you can just set a breakpoint, wait for it to trigger and then save the snapshot. For the ZXUNO, I patched the code beginning at $6203 with NOP, CALL $0066. That erases the third EI HALT sequence and the DJNZ instruction, so after returning from the CALL, we are at $6203.

Calling $0066 is like pressing the NMI button on the DivMMC: ESXDOS shows up and I can save a snapshot of the currently running program. I know that some values, like register R, won't be the same. The value of the B register won't be the same either, but I wasn't interested in register values, only in memory contents.

Comparing the snapshot taken with the emulator at $6203 with the one taken by ESXDOS, also at $6203, revealed a few things:
- Memory was not exactly the same. Leaving aside the attribute area, which of course was different, there were a couple of changed bytes here and there, and some others at the end of RAM.
- I remember having swapped the RAM from one snapshot to the other and getting no conclusive results (sorry, this part of the analysis was not written down). The thing is that I ruled out RAM contents as the cause of the failure. Then I focused on the register contents. The 27-byte header of a SNA file holds the contents of all CPU registers, interrupt state (disabled/enabled), interrupt mode, border color, etc. I knew some registers would have different values, but then I saw it...

... the snapshot from the emulator (the "good" one) stored "IM 2" as the current interrupt mode, but the snapshot from ESXDOS stored "IM 1" !!!!!!
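
For anyone who wants to repeat this comparison, here is a minimal Python sketch that pulls the interrupt-related fields out of a SNA header. The offsets follow the commonly documented 48K SNA layout (I at byte 0, IFF2 in bit 2 of byte 19, R at byte 20, SP at bytes 23-24, interrupt mode at byte 25, border at byte 26). The function names are made up and the two headers below are synthetic, not the real snapshots.

```python
# Sketch: compare the interrupt-related fields of two SNA headers.
# Offsets follow the commonly documented 27-byte SNA header layout.
import struct

SNA_HEADER_SIZE = 27

def read_sna_state(header):
    """Extract a few fields from the 27-byte header of a .SNA snapshot."""
    if len(header) < SNA_HEADER_SIZE:
        raise ValueError("SNA header must be at least 27 bytes")
    return {
        "i": header[0],
        "iff2": bool(header[19] & 0x04),   # interrupts enabled/disabled
        "r": header[20],
        "sp": struct.unpack_from("<H", header, 23)[0],
        "im": header[25],                  # interrupt mode: 0, 1 or 2
        "border": header[26] & 0x07,
    }

def differing_fields(header_a, header_b):
    """Return {field: (value_a, value_b)} for every field that differs."""
    a, b = read_sna_state(header_a), read_sna_state(header_b)
    return {name: (a[name], b[name]) for name in a if a[name] != b[name]}

# Synthetic headers for illustration only (NOT the real snapshot contents).
emulator = bytearray(SNA_HEADER_SIZE)
emulator[0] = 0x3B    # I register
emulator[25] = 2      # IM 2
esxdos = bytearray(emulator)
esxdos[25] = 1        # IM 1

print(differing_fields(emulator, esxdos))  # {'im': (2, 1)}
```

With real files, the first 27 bytes of each .sna would be read and passed in place of the synthetic headers.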

How is that possible? The very first instruction executed is precisely IM 2.

Then I noticed the opcode used for IM 2: ED 7E. I know some instructions, especially those that need an ED prefix, can be encoded with more than one opcode. Is ED 7E the official opcode for IM 2???


It's ED 5E

According to , IM 2 can be encoded as either ED 5E or ED 7E. I quickly dug into the very poorly documented (if at all) VHDL code of the T80 core (T80_MCode.vhd) and found this, for the decoding of the IM 2 instruction:

Code: Select all

			when "01011110"|"01110111" =>
				-- IM 2
				IMode <= "10";
Even if you are not used to hardware description languages, I think it's rather easy to see what's happening here: if the opcode (after ED) is 01011110 (5E) or 01110111 (77 !!), then set the interrupt mode to 10 (2).
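
A quick way to double-check bit patterns like these is to convert them to hex. This Python sketch uses the two patterns from the VHDL above plus the pattern for 7E; the helper names are illustrative and have nothing to do with the T80 sources.

```python
# Sketch: check which ED-prefixed opcodes a list of VHDL bit patterns matches.

def pattern_values(patterns):
    """Convert VHDL-style bit strings to two-digit hex strings."""
    return [f"{int(bits, 2):02X}" for bits in patterns]

def matches(patterns, opcode):
    """True if the opcode equals any of the bit patterns."""
    return any(int(bits, 2) == opcode for bits in patterns)

AS_FOUND = ["01011110", "01110111"]  # patterns in the T80 code shown above
WITH_7E = ["01011110", "01111110"]   # same list with the bit pattern for 0x7E

print(pattern_values(AS_FOUND))   # ['5E', '77']
print(pattern_values(WITH_7E))    # ['5E', '7E']
print(matches(AS_FOUND, 0x7E))    # False: ED 7E is not decoded as IM 2
print(matches(WITH_7E, 0x7E))     # True
```

So with the patterns as found, the demo's ED 7E never reaches the IM 2 branch, which agrees with the ESXDOS snapshot still reporting IM 1.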

I immediately changed that into:

Code: Select all

			when "01011110"|"01111110" =>
				-- IM 2
				IMode <= "10";
And suddenly, it all worked again :)
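
For reference, the whole family of IM encodings can be described with one small decoder. The sketch below is written in Python rather than VHDL and is not taken from the T80 sources. It relies on the ED-prefixed IM opcodes having the form 01xxx110, with bits 4-3 selecting the mode and bit 5 being ignored. Treating ED 4E and ED 6E as IM 0 is an assumption: those two are usually listed as "IM 0/1" and their behaviour is not officially defined.

```python
# Sketch: decode the interrupt mode from the second byte of an ED-prefixed opcode.
# ASSUMPTION: ED 4E / ED 6E are treated as IM 0 (often documented as "IM 0/1").

def decode_im(opcode):
    """Return 0, 1 or 2 for an IM opcode (after the ED prefix), else None."""
    if opcode & 0xC7 != 0x46:       # must look like 01xxx110
        return None
    selector = (opcode >> 3) & 0x03  # bits 4-3; bit 5 is ignored
    return {0: 0, 1: 0, 2: 1, 3: 2}[selector]

for op in (0x46, 0x4E, 0x56, 0x5E, 0x66, 0x6E, 0x76, 0x7E, 0x77):
    print(f"ED {op:02X} -> {decode_im(op)}")
```

Under this scheme ED 46/66 select IM 0, ED 56/76 select IM 1 and ED 5E/7E select IM 2, while ED 77 is not an IM instruction at all.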

After this, I tracked down all versions of the ZXUNO Spectrum core and saw that nearly all of them use a T80 core with the same bug. Only one of them used a slightly different version, in which the IM 2 decoding was bug-free. In fact, I remember having run this MQM3 demo with no problems in the past.
BTW: the T80 used in the TBBlue core has the same bug. Luckily, the opcode $7E has not been taken for one of the new instructions, so it's very easy to fix :)

Post by aowen » 06 Aug 2020, 12:04

Are the alternate instructions for IM0 and IM1 also supported?

Post by desUBIKado » 09 Aug 2020, 12:21

When will new versions of the Spectrum core come out with this bug fixed?