Sunday, November 25, 2012

I spent the last couple of days working on libgcc, and got all missing functions done except the division and modulus stuff.
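For reference, the remaining division and modulus routines can all be built on one unsigned shift-and-subtract loop. The following is only a minimal sketch in portable C, not the code going into the port; the names udivmod32, sdiv32 and smod32 and the udivmod32_t struct are made up for illustration, and division by zero simply returns zeros here.

```c
#include <stdint.h>

/* Hypothetical result type: quotient and remainder together. */
typedef struct {
    uint32_t quot;
    uint32_t rem;
} udivmod32_t;

/* Restoring (shift-and-subtract) division, one quotient bit per pass. */
udivmod32_t udivmod32(uint32_t num, uint32_t den)
{
    udivmod32_t r = { 0, 0 };
    int i;

    if (den == 0)
        return r;               /* sketch only: result is arbitrary */

    for (i = 31; i >= 0; i--) {
        /* Bring down the next numerator bit. Before the shift, rem holds
           at most 31 significant bits, so the shift cannot overflow. */
        r.rem = (r.rem << 1) | ((num >> i) & 1);
        if (r.rem >= den) {
            r.rem -= den;
            r.quot |= (uint32_t)1 << i;
        }
    }
    return r;
}

/* Magnitude of a signed value, computed without signed overflow. */
static uint32_t magnitude32(int32_t v)
{
    return v < 0 ? (uint32_t)0 - (uint32_t)v : (uint32_t)v;
}

/* Signed division, truncating toward zero. */
int32_t sdiv32(int32_t a, int32_t b)
{
    uint32_t q = udivmod32(magnitude32(a), magnitude32(b)).quot;
    return ((a < 0) != (b < 0)) ? (int32_t)((uint32_t)0 - q) : (int32_t)q;
}

/* Signed modulus; the result takes the sign of the dividend. */
int32_t smod32(int32_t a, int32_t b)
{
    uint32_t m = udivmod32(magnitude32(a), magnitude32(b)).rem;
    return (a < 0) ? (int32_t)((uint32_t)0 - m) : (int32_t)m;
}
```

Doing the work once in the unsigned routine and fixing up the signs afterwards keeps the signed wrappers tiny, which matters when every byte counts.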

Once this is done, I need to review everything and look for bugs to fix. I know there's a problem with register counts for the function prologue and epilogue, so that should be interesting.

Wednesday, November 21, 2012

So for the past few days I've been working on libgcc, making sure the compiler covers all instructions up to 32-bit operations.

Missing operations:

Count trailing zero bits
__ctzsi2, __ctzhi2, __ctzqi2

Count leading zero bits
__clzsi2, __clzhi2, __clzqi2

Find index of least significant bit
__ffssi2, __ffshi2, __ffsqi2

Return one if an odd number of bits are set
__paritysi2, __parityhi2, __parityqi2

Return number of set bits
__popcountsi2, __popcounthi2, __popcountqi2

Signed division of 32-bit values

Unsigned division of 32-bit values

Calculate modulus of 32-bit values

Calculate unsigned modulus of 32-bit values

Do both division and modulus calculations

Do both unsigned division and modulus calculations

Multiply 32-bit values
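To pin down what the bit-counting routines in this list are expected to return, here is a minimal sketch in portable C. The names (clz32, ctz32 and so on) are invented for illustration and are not the libgcc entry points, and the plain loops are chosen for clarity rather than speed. libgcc leaves the clz and ctz results undefined for a zero argument; returning 32 in that case is just an assumption made here.

```c
#include <stdint.h>

/* Count leading zero bits (returns 32 for x == 0 in this sketch). */
int clz32(uint32_t x)
{
    int n = 0;

    if (x == 0)
        return 32;
    while ((x & 0x80000000UL) == 0) {
        x <<= 1;
        n++;
    }
    return n;
}

/* Count trailing zero bits (returns 32 for x == 0 in this sketch). */
int ctz32(uint32_t x)
{
    int n = 0;

    if (x == 0)
        return 32;
    while ((x & 1) == 0) {
        x >>= 1;
        n++;
    }
    return n;
}

/* One plus the index of the least significant set bit, or 0 if x is 0. */
int ffs32(uint32_t x)
{
    return x ? ctz32(x) + 1 : 0;
}

/* Number of set bits; each pass clears the lowest set bit. */
int popcount32(uint32_t x)
{
    int n = 0;

    while (x) {
        x &= x - 1;
        n++;
    }
    return n;
}

/* 1 if an odd number of bits are set, 0 otherwise. */
int parity32(uint32_t x)
{
    return popcount32(x) & 1;
}
```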

For now, the trapped arithmetic instructions will be implemented using the default code. These functions call "abort" when there is an overflow condition, and are only needed in rare cases. The TMS9900 updates an overflow flag which we can use for this, but we can do that work later.
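As a rough picture of what a trapping routine does, here is a minimal sketch of a checked 32-bit add in portable C. It is not the default libgcc code; the names add32_overflows and trapping_add32 are hypothetical, and the overflow test is done in software rather than with the TMS9900 overflow flag.

```c
#include <stdint.h>
#include <stdlib.h>

/* Returns 1 if a + b would overflow a signed 32-bit value, else 0.
   Overflow happens only when both operands have the same sign and the
   wrapped sum has the opposite sign. */
int add32_overflows(int32_t a, int32_t b)
{
    uint32_t ua = (uint32_t)a;
    uint32_t ub = (uint32_t)b;
    uint32_t sum = ua + ub;     /* unsigned add wraps, no undefined behavior */

    return (int)((~(ua ^ ub) & (ua ^ sum)) >> 31);
}

/* Add two signed 32-bit values, calling abort() on overflow. */
int32_t trapping_add32(int32_t a, int32_t b)
{
    if (add32_overflows(a, b))
        abort();
    return a + b;               /* safe: overflow was ruled out above */
}
```

A version that uses the hardware overflow flag could drop the sign test entirely and just branch on the flag after the final add, which is the later work mentioned above.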

The other routines in libgcc are for floating-point and fixed-point math. Odds are they are too big to really use, but again, I can fix that later.

Wednesday, November 7, 2012

I was trying to build libgcc to make the missing functions like __udivsi3 mentioned earlier. Unfortunately, there's a bug in GCC which causes the libgcc build to fail:

eric@compaq:~/dev/tios/toolchain/gcc-4.4.0/libgcc$ make
Makefile:143: ../.././gcc/libgcc.mvars: No such file or directory
make: *** No rule to make target `../.././gcc/libgcc.mvars'.  Stop.

After looking into this a bit, it turns out my build directions were lacking. The GNU people always do their builds from a separate directory, and say that any problems arising from building in the source directory will not be fixed.

So, rather than fight the world on this, I'm changing the build instructions. The following are to be done from the top level of the GCC source directory.

  $ mkdir build
  $ cd build
  $ ../configure --prefix /home/eric/dev/tios/toolchain --target=tms9900 --enable-languages=c
  $ make all-gcc
  $ make install
  $ mkdir libgcc/build
  $ cd libgcc/build
  $ ../../../libgcc/configure --prefix /home/eric/dev/tios/toolchain/ --host=tms9900
  $ make
  $ make install

Sunday, November 4, 2012

After being super busy at work for the last few months, I finally got some time to work on TI stuff again. I was in the middle of something earlier, but heck if I can remember what that was.

One of the people on AtariAge wanted to know if a FAT library would compile OK for the TI. Sounds like a good question, let's find out.

The first pass failed due to missing the libc headers for TI. For now, I'm using the i386 headers. As long as I don't try to link anything, it should be fine.

The second pass failed too: an invalid instruction was being generated. This was super tedious to track down.

    inv  r2
    neg  >4    * invalid, only registers may be used
    inc  r2

(insn 418 417 784 fat_access.c:167 (set (reg:SI 2 r2 [230])
        (neg:SI (reg:SI 2 r2 [230]))) 68 {negsi2} (nil))

Once the instruction was isolated, the problem was obvious. I was overrunning an RTX array in the NEGSI2 instruction and the junk found there showed up in the final output. By using a properly-sized array for this instruction, all is well.

  [Nr] Name              Type            Size(Hex)  Size(Dec)
  [ 1] .text             PROGBITS        00005462 = 21602
  [ 3] .data             PROGBITS        00000030 = 48
  [ 4] .bss              NOBITS          00000CCA = 3274

Of course, this is missing a lot of other stuff:

So, yes, this library is compilable for the TI, but needs about 24KB for everything (21602 + 48 + 3274 = 24924 bytes). Probably impractical.

I've seen this in a few places in the output. It would be easy to optimize, but it may be a rarely-used pattern and not worth special attention.
    mov  r2, *r1
    mov  r2, @>2(r1)

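This could be better written as follows, as long as nothing afterwards depends on the original value of r1, since the post-increment leaves it advanced by two: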
    mov  r2, *r1+
    mov  r2, *r1

As an unrelated note, I thought for a minute that using the X instruction might be useful for the function prologue and epilogue, since we're iterating through registers. Unfortunately, it's not. At most, there are four instructions for either 'logue, and a lot more would be required if X were used. Never mind.