Old News - 2008

Old News (9/7)

New PAL reader

This week I finished assembling and testing the third version of the PAL reader hardware. Both 20-pin (16L8) and 24-pin (20L10) devices can be dumped, though the analysis software needs considerable changes to support 24-pin chips. I'm moving towards a more device-independent method of analysis that could support other programmable logic devices with varying input and output pin counts, though I have no plans at present to make a PAL reader capable of handling them.

The power switch (ST890C) works very nicely. It has an output that is asserted when there is a short circuit or an overheating condition, and it will cycle its power output once it has cooled down sufficiently to continue. This caught a short circuit on the PCB, as I noticed the "fault" LED was always lit. I tested a few registered PALs and the power-up reset works as expected. The only quirk is that the clock input needs to be low for the reset to happen; if it's held high, not all of the macrocells are reset to '1'. The software-controlled reset will allow all possible transitions from the reset state to the next state to be mapped, which the previous versions of the PAL reader couldn't do without manually cycling the power. Not something you'd want to do 256 times.

What didn't work was the DS1233 reset generator, which should keep the /RESET output low for a short while after the ST890C power output has reached a valid level. Instead, it directly follows the power output state: low when off and high when on. Without that delay I can't accurately determine when the 'settling' time for the PAL has elapsed, at which point its power-on reset sequence is complete. A workaround was to insert a short delay on the PC side, but I would much rather find a better solution. There is an expansion connector intended for a PIC12F675 with all the power and related control signals available, so it should be possible to find a more suitable part or circuit and connect it there, or just use a PIC to monitor the output. And if the software were informed about the kind of PAL being dumped, it could poll the PAL outputs until they were reset, though that would increase the time it takes to process a registered PAL.
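The polling idea could look something like the sketch below. It is written in Python purely for illustration; `read_outputs` is a hypothetical callable standing in for whatever routine reads the PAL pins, and the timeout and interval values are arbitrary assumptions rather than measured settling times.

```python
import time

def wait_for_reset(read_outputs, reset_value, timeout=0.5, interval=0.001):
    """Poll the PAL outputs until they show the expected reset pattern.

    read_outputs -- hypothetical callable returning the current output pins
                    packed into an integer
    reset_value  -- the pattern expected once power-on reset has completed
    Returns True if the pattern was seen before the timeout, else False.
    """
    deadline = time.monotonic() + timeout
    while True:
        if read_outputs() == reset_value:
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

The timeout matters because a combinatorial PAL, or a registered one with tristated outputs, might never show the expected pattern.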

Overall I am quite pleased with the results. The design is becoming more refined and the capabilities are increasing. Once it's ready for a release I will update the PAL reader package with the new files, PCB layout, and software.

Old News (8/6)

PAL feedback

I had received reports of PALs that couldn't be replaced because they used more input terms than a PAL supports. While the results were functionally correct and would compile for a virtual device, the results wouldn't fit in a replacement GAL16V8. A shortcoming in the analysis process was revealed where feedback wasn't properly taken into consideration.

PALs have feedback taken from the point between the output buffer and bonding pad, which can be used as an input term by other outputs. Because the feedback is tapped at this particular point, it can be driven by the PAL's output, or when the pin is tristated it can be driven by an external circuit. This allows for bidirectionality instead of forcing a pin to be input or output only.

Here is an example of how feedback can be used to feed the result of one combinatorial output into another as an input:

    B0 =  I0 &  I1 &  I2;           // B0 uses three input terms
    B1 =  I0 &  I1 &  I2 & !I7;     // B1 uses four input terms
 or:
    B1 =  B0 & !I7;                 // B1 only uses two input terms (B0 feedback and I7) but corresponds to four

I had assumed WinCUPL would be able to minimize equations by taking the feedback of other outputs into account. But it doesn't, and for good reason: this is an unsafe optimization. If the pin in question were ever tristated, the feedback wouldn't represent the result of the logic associated with it; instead it would assume whatever state external circuitry on the pin set it to. Another less likely situation is that the output buffer of the PAL could be overridden if the pin was strongly driven high or low, say during a bus conflict, which would again make the feedback incorrect.
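The hazard can be checked with a small truth-table model. The sketch below is Python rather than CUPL, and the function names and the `pin_level` parameter are made up for illustration; it only models the B0 pin as either driven by its own logic or held at a level by an outside circuit.

```python
from itertools import product

def b0_logic(i0, i1, i2):
    """Logic for B0 from the example above."""
    return i0 & i1 & i2

def b1_direct(i0, i1, i2, i7):
    """B1 written with four input terms and no feedback."""
    return i0 & i1 & i2 & (1 - i7)

def b1_feedback(i0, i1, i2, i7, b0_enabled=True, pin_level=0):
    """B1 written as (B0 feedback) & !I7.

    Feedback is read from the pin, so it equals the B0 logic only while
    the B0 output buffer is enabled. Otherwise it is whatever level the
    external circuit (pin_level, an assumed parameter) puts on the pin.
    """
    feedback = b0_logic(i0, i1, i2) if b0_enabled else pin_level
    return feedback & (1 - i7)

def mismatches(b0_enabled, pin_level=0):
    """Input combinations (I0, I1, I2, I7) where the two forms disagree."""
    return [bits for bits in product((0, 1), repeat=4)
            if b1_direct(*bits) != b1_feedback(*bits, b0_enabled=b0_enabled,
                                               pin_level=pin_level)]
```

With the B0 buffer enabled, `mismatches(True)` is empty and the two forms are interchangeable. With the pin tristated and pulled high, `mismatches(False, pin_level=1)` lists every case where I7 is low but I0, I1, and I2 are not all high.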

From a design standpoint this isn't really an issue as the designer would always know if they wanted to use feedback or not. In retrospect it's probably pointless for the design software to try to work feedback into the optimization process because it would be explicitly specified. In terms of reverse engineering, this adds extra complexity to the analysis involved. I have started updating the software to take this into account, and it will likely take a while to work out all the associated issues.

On the flip side it's nice to know that out of the many PALs people have dumped this was the only significant issue and that the results weren't really even incorrect, just not optimal. Special thanks to Nicola Salmoria for making sense out of the equations, and Chris Hardy and Corrado Tomaselli for providing PAL dumps, tracing PCB connections, and sharing information.

Old News (7/26)

Registered PALs

I have been working with Chris Hardy to determine how the Bagman security chip works. It is a PAL16R6 clocked independently of the CPU at about 30 Hz. A new 6-bit random number is generated on each clock and can be read by the Z80 as needed. It generates 55 unique values before the sequence repeats.

There are many different ways the internal storage (state data) of a PAL can be linked together to form elements like shift registers, latches, and counters. Rather than try to determine these very specific configurations, it's easier to look at the PAL from a generic point of view. The Bagman PAL has 6 registered macrocells and 8 inputs, so that's 64 possible states and 256 possible input configurations per state.

This can be represented by a 'transition table', which says which state will be selected when the device is clocked, based on the current state and input settings. A problem arises as you fill out the table by feeding data to the PAL and recording the results: eventually you may reach a point where a very specific sequence of states has to be stepped through in order to reach the 'target' state you are mapping the inputs for. The rate at which the table is filled drops dramatically when this happens.

To get around this I added code that checks if the current state is completely mapped out, and if so, finds the path of state transitions necessary to reach a target state that needs further mapping. If no such path exists from the current state, a different state is selected and the path check is made again. This allowed the input lists to be completely filled out for every state that was reachable.
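The path check can be sketched as a breadth-first search over the transitions recorded so far. This is a Python illustration and not the actual analysis code; the table layout (a dictionary of dictionaries) and the names `find_path` and `needs_mapping` are assumptions made for the example.

```python
from collections import deque

def needs_mapping(table, state, num_inputs=256):
    """True if some input value has not yet been tried in this state."""
    return len(table.get(state, {})) < num_inputs

def find_path(table, start, is_target):
    """Breadth-first search over the transitions recorded so far.

    table maps state -> {input_value: next_state} for every input already
    tried in that state. Returns the list of input values to apply from
    `start` to reach the nearest state accepted by `is_target`, or None
    if no such state is reachable through known transitions.
    """
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        state, path = queue.popleft()
        if is_target(state):
            return path
        for inp, nxt in table.get(state, {}).items():
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, path + [inp]))
    return None
```

A call such as `find_path(table, current, lambda s: needs_mapping(table, s))` returns the inputs to clock in before mapping can resume. A result of `None` corresponds to the case where a different starting state has to be selected.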

When analyzing the inputs, it was surprising to see that none of them were used by the registered logic. The address bus is available to the PAL, but does not affect the random number generator. The bit from the timing circuit also wasn't used by it, and is simply passed through an independent inverter. This greatly simplifies how the PAL functions, as none of the inputs need to be taken into consideration when examining state changes. It's also highly unusual in terms of your typical registered PAL implementation.
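With no inputs involved, each state has exactly one successor, so the generator's sequence can be read straight out of the transition table by following it until a state repeats. A minimal Python sketch, assuming the table has been reduced to a hypothetical `next_state` mapping of state to successor:

```python
def sequence_from(next_state, start):
    """Follow an input-independent next-state table from `start`.

    Returns (lead_in, cycle): the states visited before the sequence
    starts repeating, and the states that make up the repeating cycle.
    """
    order = []       # states in the order they were visited
    position = {}    # state -> index in `order`
    state = start
    while state not in position:
        position[state] = len(order)
        order.append(state)
        state = next_state[state]
    first = position[state]   # index where the cycle begins
    return order[:first], order[first:]
```

The number of unique values seen before a repeat is `len(lead_in) + len(cycle)`. For example, the made-up table `{0: 1, 1: 2, 2: 3, 3: 1}` started at 0 gives a lead-in of `[0]` and a cycle of `[1, 2, 3]`; these values are illustrative and are not the Bagman data.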

The next step was to create a replacement. Some of the initial designs we tested for a 16V8 didn't work, so I moved to a 22V10 (the only GAL I can program), dumping it through the PAL reader using a 22V10 to 16V8 adapter I made for the earlier CPS-2 project. As it turns out there is a strange quirk in WinCUPL: the state machine description which I generated from the transition table would compile, but only the optimized results (represented as logic equations) would generate a valid JED file for the 16V8. This was easy to fix: just cut and paste the results from the simulator output back into the CUPL source. But it is unfortunate this happens at all. Eventually we had a working GAL16V8 replacement for the Bagman PAL that produced output identical to the original part. Problem solved!

Old News (7/9)

PAL Device Reader

Here are the current WIP project files for the PAL device reader. I don't have all the notes for the second revision of the hardware included yet.

I've been working on the PCB layout for the third revision of the PAL reader. This new version will support 24-pin devices like the 20L8 and 20L10. It has a software-controlled power switch so that registered devices can be initialized to a known state in a consistent manner by causing a power-on reset. If this works as expected, it should allow registered devices such as the PAL16R8 to be dumped. If it doesn't, the extra circuitry can be bypassed with a jumper and the reader will function normally.

I have also tried to address usability issues; there are mounting holes for standoffs, all parts except a SOIC-8 chip are socketed for easy repair, and there is more clearance around the PAL socket. There is still a lot to do as the new design has to be tested to confirm its operation, and the dumping and analysis software will need some considerable rework for the larger devices. But this is another step towards a better solution for dumping PALs.

Old News (6/18)

Comms Link USB Replacement

I had worked on a Comms Link compatible device in 2007 to allow development of Sega Saturn software without having to use an old ISA-equipped computer. It had some timing issues that I wasn't able to diagnose at the time. I decided to revisit the project and dumped the PALCE20V8H chip from a clone Comms Link ISA card to see how the communication logic was implemented. It turns out one of the strobe signals needs to be active high instead of active low, something that was fixed with a minor tweak to the GAL equations. Now that it works as expected, I've decided to clean up the project and release it. Here are the PCB layout, GAL equations, and source code:

For developing large programs, the transfer utility can load data from disk and transfer it over the Comms Link interface. This allows you to simulate a CD environment without burning discs for each test.

The example program displays a 1024x512 bitmap on NBG0 and NBG1 and zooms them across the range of expansion/shrink values the VDP2 supports, down to 1/4 of the original size. Transparency is enabled across both layers and the back color screen. At startup it loads the bitmap and palette data from disk, using the PC file access routines in filesat.c. Also note that crt0.s has an implementation of the Action Replay BIOS functions for downloading and uploading data written in SH-2 assembly, which is a little faster than the transfer code in the AR firmware.

Old News (6/14)

Racin' Force

I've been running some tests on Racin' Force, a Konami System GX game that uses voxels to generate a 3D playfield. It's an interesting combination of new and old hardware: the PSAC2 chip from earlier games generates a 2D ROZ layer which selects data from a color map and height map. This is used by the PSAC4 chip to render voxels to a framebuffer. An additional list of per-scanline camera data is provided, as the PSAC4 knows nothing about the rotation parameters the PSAC2 applied to the color map and height map. Priority is specified for each voxel drawn so sprites can move in front of and behind specific parts of the landscape.

A limitation of this setup is that dropped pixels in the PSAC2 output result in missing voxel columns, which makes landscape structures look flickery. However, at 60 frames per second the results are extremely impressive. Consider how few custom ICs were needed to do this compared to the 3D hardware some arcade games used in 1993. For games with simple terrain such as Racin' Force and Konami's Open Golf Championship, voxel graphics were well suited to their graphical needs and a cost-effective choice.

I made a low-quality video of the game in action. No controls are wired up, so the car is out of control after the attract sequence.

Of particular interest are the sloped curves on the racetrack, and the tunnel which has both a floor and ceiling made out of voxels. The PSAC4 chip has a lot of different drawing settings which have to get figured out. I've documented the PSAC2 registers and will tackle the PSAC4 next.

Game Genie information

Here is a description of the Genesis and SNES Game Genie, and a description of the SGA001 ASIC that Codemasters designed:

A useful property of both Game Genie devices is that you can replace the program ROM(s) with an EPROM emulator to run your own code, then relocate to RAM and enable the plugged-in cartridge to run experiments on the cartridge hardware.

Old News (6/1)

16-bit EPROM emulator

Last year I designed a 16-bit EPROM emulator based around the IDT7025 8Kx16 dual-port static RAM, a very useful chip that supports 8 and 16-bit access independently on either of its RAM ports. This is what I used for running tests on my Model 1 and Jaleco Mega System 32 boards, the latter needing two emulators in parallel to emulate four EPROMs.

The software to control it was a modified version of the 8-bit EPROM emulator utility. Support for multiple devices isn't implemented quite right, so just use it for controlling a single device. This ended up being my first project that used almost all surface mount parts, as well as PLCC chips. I'm quite proud of the way it came out, and so far it has been invaluable for running tests on 16-bit and 32-bit systems. Here are the PCB layout, documentation, utility program, and source code:

At some point I was thinking of making adapter PCBs that would support 27C4096 and 27C400 type pinouts as I have a few boards with those chips instead of two 8-bit EPROMs in parallel. It's just a wiring difference but this would be a neater solution and simplify cable assembly.

VDP pin assignments

Here is a mostly complete pinout for the 315-5313 VDP.

If anyone does something cool like sticking a RAMDAC chip on the color bus outputs, do let me know. :)

V25 research

There are a number of instructions which delay interrupt and exception processing, allowing one more instruction to be executed before the interrupt is taken:

    POPF, CLI, STI
    POP [segment-register]
    MOV [segment-register], r/m16
    Segment prefixes: CS, DS, ES, SS
    Repeat prefixes: REP, REPNE, REPC, REPNC
    LOCK prefix

For the prefixes, this prevents an interrupt from being taken after the prefix byte has been fetched but before the instruction it applies to has been executed. Likewise for segment register loads, if an interrupt occurred after SS was changed, SP would be invalid. By delaying interrupts the following types of sequences become uninterruptible:

    pop ss
    mov sp, $F800
    ; or
    mov ss, [si+0]
    mov sp, [si+2]

It seems less important to have DS and ES register loads delay interrupts as well; I did not expect this behavior.

I have been looking at the MCU code for other games and it seems that they use similar, if not identical instruction encodings, despite using differently labeled MCUs. V-Five in particular seems to match the Knuckle Bash opcodes quite closely, and when/if I can get Knuckle Bash decrypted, I'll see how much of V-Five can be decrypted.

Old News (4/12)

Fresh start

I'm getting the website fixed up, so most links will not work. There was an exploit in the blog script that was being taken advantage of, so I'm going back to basics. I'll see what I can do about getting the old blog content back here.

Knuckle Bash

I recently acquired a Toaplan "Knuckle Bash" PCB. It's a fairly impressive system, based around a custom graphics chip which displays three tiled background layers and two 512x512 12-bit framebuffers for double buffered sprites. It has a 68HC000 running at 16 MHz that handles all the game related tasks, and a V25S MCU that manages inputs, sound effects, and music playback. The music for this game is quite good and definitely a notch above the rest. A lot of other Toaplan games use the same graphics chip, so I'm intending to run tests on it and get all the timing and other details worked out.

The V25S microcontroller is an 80186 clone manufactured by NEC. Unlike the V25 it has no usable internal ROM and no 8080 emulation mode, the latter of which has been modified to add a new 'secure' operating mode. In secure mode a lookup table translates opcodes fetched from memory into their V25S equivalents. This allows the opcode-to-instruction mapping to be changed as the customer (Toaplan) sees fit, making the program code unusable unless the table contents are known. Luckily operands and data are not encrypted, and examination of operands such as the ModR/M byte can reveal what category of instructions a particular opcode might fit into.

NEC intended for the V25S to be used as a drop-in replacement for the V25. To accomplish this, it uses one of the unused V25 pins as a mode select input. When tied high or floating (due to an internal pull-up resistor) the CPU runs in normal mode, where the lookup table is bypassed and opcodes are processed normally. When tied low, the CPU is in secure mode and the lookup table is utilized. This pin is sampled during a reset, interrupt, or exception, and bit 15 of the PSW can also be modified through select instructions to change the operating mode regardless of the pin state. These features allow a V25S to start in normal mode and selectively execute encrypted programs while still interacting with an unencrypted BIOS, operating system, and device drivers, or vice-versa.

I modified the Knuckle Bash board to start the V25S in normal mode, and developed a program that sets the MCU to a known state and enters secure mode with the instruction trap feature enabled. This forces just one encrypted instruction to be executed before control is passed back to my unencrypted code, at which point the potentially modified state of the MCU is saved and examined. The behavior of all encrypted opcodes (except BRKS, which sets up an unrecoverable state) can therefore be examined. I can see things like what data was pushed to or popped from the stack, which registers were loaded, exchanged, or modified, and which instructions triggered an I/O or floating point exception. A lot of information can be gathered about the encrypted instructions, which narrows down or completely identifies which unencrypted instructions they map to. Best of all, this technique should work for any V25S based system, such as the other Toaplan games. I'm looking forward to trying it on my Golden Axe 2 security board to see how effective it is after finishing with Knuckle Bash, though right now it's too early to give any indication of progress.

Toaplan did an excellent job with the protection. The program ROM is filled with valid Z80 code and garbage data to throw off statistical analysis of the ROM, such as observing the frequency of occurrence for particular bytes and byte sequences. The MCU has no manufacturer marking and has ambiguous names printed on it like "NITRO" and "DASH". Furthermore, the lookup table maps many opcodes to the same instructions, so certain easily identifiable instructions can simply never be executed, increasing the number of potential matches any encrypted instruction might have. If this technique is applicable to the V35S, we'll have to see what Irem did with their games. :)
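The effect of a many-to-one table can be illustrated with a toy model. The Python sketch below assumes a 256-entry table indexed by the fetched opcode byte, which is a guess at the layout, and the table contents are invented for the example; they are not Toaplan's.

```python
from collections import defaultdict

def translate(table, opcode):
    """Secure mode: the fetched opcode byte is looked up before decoding."""
    return table[opcode]

def unreachable(table):
    """Real opcodes that no encrypted opcode maps to."""
    return sorted(set(range(256)) - set(table))

def aliases(table):
    """Group encrypted opcodes by the real opcode they select."""
    groups = defaultdict(list)
    for encrypted, real in enumerate(table):
        groups[real].append(encrypted)
    return dict(groups)

# Invented table: every odd entry duplicates the even entry before it,
# so no encrypted opcode can ever select an odd-numbered real opcode.
toy_table = [b & 0xFE for b in range(256)]
```

In this toy table half of the real opcodes can never be executed, and every reachable opcode has two encrypted aliases, which is the kind of ambiguity that makes matching by behavior harder.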

