Posts: 3,074 | Thanked: 12,960 times | Joined on Mar 2010 @ Sofia,Bulgaria
#41
Originally Posted by gidzzz View Post
I used the flags that you advised, no change. However...

I tried with GCC 4.6.2 and it worked flawlessly. What are the results? Size of the executable decreased from 2814 KiB to 2232 KiB, but there was no significant FPS gain in my test cases. At 850 MHz it looks approximately like this:

[Scene]: [GCC 4.2.1] -> [GCC 4.6.2 Thumb]
Tutorial: 30.5 FPS -> 31 FPS
Medium battle: 5.5-7.5 FPS -> 6.5-8 FPS
Large battle: 3.2 FPS -> 3.5 FPS


I have begun porting the game to unwrapped GLES, to see if it works any faster, but it did not bring any great improvements in terms of FPS so far (but it's still far from complete). Nevertheless, it wasn't a waste of time, as I fixed two graphical bugs on the way. I'm especially happy that ion beams don't look so lame anymore.

I updated the first post so that it links to the new version. The changes are:
  • Prettier ion cannons
  • Fix for engine trails sometimes not appearing
  • Framerate is written to the terminal
And there's the Thumb executable too!
Could you share (or point me to) the source code? I want to look at the build scripts. Also, Linaro GCC 4.7.2 is much better than 4.6.1 at optimizing ARM code; I suspect your results are a combination of wrong compiler arch flags and GCC 4.6.1.

EDIT:
Your compiler flags are definitely wrong for some reason. I tested a bit, and there is no need to call the kernel for atomic 64-bit operations when GCC compiles for armv7-a:

Code:
echo "void f(){volatile long long a=123;__sync_val_compare_and_swap(&a,3,4);}" | gcc -O2 -mthumb -mfloat-abi=softfp -x c -Wall -dA -S - -o -
results in:

Code:
        .syntax unified
        .arch armv7-a
        .eabi_attribute 27, 3   @ Tag_ABI_HardFP_use
        .fpu neon
        .eabi_attribute 20, 1   @ Tag_ABI_FP_denormal
        .eabi_attribute 21, 1   @ Tag_ABI_FP_exceptions
        .eabi_attribute 23, 3   @ Tag_ABI_FP_number_model
        .eabi_attribute 24, 1   @ Tag_ABI_align8_needed
        .eabi_attribute 25, 1   @ Tag_ABI_align8_preserved
        .eabi_attribute 26, 2   @ Tag_ABI_enum_size
        .eabi_attribute 30, 2   @ Tag_ABI_optimization_goals
        .eabi_attribute 34, 1   @ Tag_CPU_unaligned_access
        .eabi_attribute 18, 4   @ Tag_ABI_PCS_wchar_t
        .file   ""
        .text
        .align  2
        .global f
        .thumb
        .thumb_func
        .type   f, %function
f:
        @ args = 0, pretend = 0, frame = 8
        @ frame_needed = 0, uses_anonymous_args = 0
        @ link register save eliminated.
@ BLOCK 2 freq:10000 seq:0
@ PRED: ENTRY [100.0%]  (fallthru)
        push    {r4, r5}
        fldd    d16, .L5        @ int
        sub     sp, sp, #8
        add     r4, sp, #8
        fstmdbd r4!, {d16}      @ int
        dmb     sy
        movs    r2, #4
@ SUCC: 3 [100.0%]  (fallthru,can_fallthru)
        movs    r3, #0
@ BLOCK 3 freq:10000 seq:1
@ PRED: 2 [100.0%]  (fallthru,can_fallthru) 4 [1.0%]  (dfs_back,can_fallthru)
.L2:
        ldrexd  r0, r1, [r4]
        cmp     r1, #0
        it eq
        cmpeq   r0, #3
@ SUCC: 5 [1.0%]  (can_fallthru,loop_exit) 4 [99.0%]  (fallthru,can_fallthru)
        bne     .L3
@ BLOCK 4 freq:9901 seq:2
@ PRED: 3 [99.0%]  (fallthru,can_fallthru)
        strexd  r5, r2, r3, [r4]
        cmp     r5, #0
@ SUCC: 3 [1.0%]  (dfs_back,can_fallthru) 5 [99.0%]  (fallthru,can_fallthru,loop_exit)
        bne     .L2
@ BLOCK 5 freq:9902 seq:3
@ PRED: 3 [1.0%]  (can_fallthru,loop_exit) 4 [99.0%]  (fallthru,can_fallthru,loop_exit)
.L3:
        dmb     sy
@ SUCC: EXIT [100.0%]
        add     sp, sp, #8
        pop     {r4, r5}
        bx      lr
.L6:
        .align  3
.L5:
        .word   123
        .word   0
        .size   f, .-f
        .ident  "GCC: (Linaro GCC 4.7-2012.07) 4.7.2 20120701 (prerelease)"
        .section        .note.GNU-stack,"",%progbits
One can clearly see that ldrexd/strexd instructions are emitted and that no libgcc wrappers are called.
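To repeat this check against another compiler or flag set without reading the whole listing, the assembly can be piped through a small filter. This is only a sketch: the function name classify_cas64 is made up, and the helper symbol it looks for (__sync_val_compare_and_swap_8, the libgcc fallback for the 8-byte case) is an assumption about what the call looks like when GCC does not inline the operation.

```shell
#!/bin/sh
# Classify GCC's assembly output for a 64-bit compare-and-swap.
# Reads assembly on stdin and prints one of: inline, helper, unknown.
# classify_cas64 is a hypothetical name used only for this sketch.
classify_cas64() {
    cas_asm=$(cat)
    if printf '%s\n' "$cas_asm" |
        grep -Eq '^[[:space:]]*(ldrexd|strexd)[[:space:]]'; then
        # Exclusive load/store pair emitted directly, as in the listing above.
        echo inline
    elif printf '%s\n' "$cas_asm" |
        grep -Eq '^[[:space:]]*bl?x?[[:space:]]+__sync_val_compare_and_swap_8'; then
        # Assumed shape of the fallback: a branch to the libgcc helper.
        echo helper
    else
        echo unknown
    fi
}

# Example use with the same test program and flags as above:
#   echo "void f(){volatile long long a=123;__sync_val_compare_and_swap(&a,3,4);}" |
#     gcc -O2 -mthumb -mfloat-abi=softfp -x c -S - -o - | classify_cas64
```

If this prints "helper" for a given toolchain, the arch flags are the first thing to look at.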

I ran readelf -a on your Thumb binary and it looks OK, except that the FPU used is VFPv3 instead of NEON:

Code:
Attribute Section: aeabi
File Attributes
  Tag_CPU_name: "CORTEX-A8"
  Tag_CPU_arch: v7
  Tag_CPU_arch_profile: Application
  Tag_ARM_ISA_use: Yes
  Tag_THUMB_ISA_use: Thumb-2
  Tag_FP_arch: VFPv3
  Tag_ABI_PCS_wchar_t: 4
  Tag_ABI_FP_denormal: Needed
  Tag_ABI_FP_exceptions: Needed
  Tag_ABI_FP_number_model: IEEE 754
  Tag_ABI_align_needed: 8-byte
  Tag_ABI_enum_size: int
  Tag_ABI_HardFP_use: SP and DP
  Tag_Virtualization_use: TrustZone
Now, it seems there is a bug in some of the low-level routines (in the source code or the Makefile) that fails to detect or use the correct arch when compiled with GCC 4.7.2.
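For comparing several builds, the attribute dump can be reduced to the two fields that matter here. A rough sketch, assuming the readelf -A layout shown above; elf_attr and fpu_summary are made-up names, and I am assuming a NEON build would carry an extra Tag_Advanced_SIMD_arch line, which the dump above does not have.

```shell
#!/bin/sh
# Print the value of one attribute tag from `readelf -A` output on stdin.
# elf_attr and fpu_summary are hypothetical helper names.
elf_attr() {
    sed -n "s/^[[:space:]]*$1:[[:space:]]*//p" | head -n 1
}

# Summarise which FP and SIMD levels the build attributes advertise.
fpu_summary() {
    fpu_attrs=$(cat)
    fpu_fp=$(printf '%s\n' "$fpu_attrs" | elf_attr Tag_FP_arch)
    # Assumption: NEON builds add a Tag_Advanced_SIMD_arch line.
    fpu_simd=$(printf '%s\n' "$fpu_attrs" | elf_attr Tag_Advanced_SIMD_arch)
    echo "fp=${fpu_fp:-none} simd=${fpu_simd:-none}"
}

# Example use:
#   readelf -A ./homeworld | fpu_summary
```

Fed the dump above, this reports the FP level as VFPv3 and no SIMD tag, which matches the observation that the binary was not built for NEON.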
__________________
Never fear. I is here.

720p video support on N900,SmartReflex on N900,Keyboard and mouse support on N900
Nothing is impossible - Stable thumb2 on n900

Community SSU developer
kernel-power developer and maintainer


Last edited by freemangordon; 2012-12-02 at 17:19.
 

gidzzz's Avatar
Posts: 282 | Thanked: 2,387 times | Joined on Sep 2011
#42
Thanks to freemangordon's help, I was able to get it to work with GCC 4.7.2. It shaved 130 KiB off the binary and appears to improve framerates by a few percent in some cases, but no miracles. I updated the link in the first post to point to the new Thumb build.
__________________
My Thumb mini-repository: http://gidzzz.mooo.com/maemo/.
 

Estel's Avatar
Posts: 5,028 | Thanked: 8,613 times | Joined on Mar 2011
#43
Hey gidzzz, I've just tried to download the latest Thumb version from the first post, but the link has expired. Could you re-upload it? Also, if you're interested, I can host the files on my FTP, so there's no need for pesky upload services. Just drop me a PM every time you have a new version to host, and you'll have it hosted in minutes.

It's nice to hear that you were able to do some optimizations! What about that 7%-eating equalizer, is it gone in the latest version? I'm quite impressed that you're rewriting code to avoid wrapper usage, too - holding my thumbs (nomen omen) for it.

/Estel
__________________
N900's aluminum backcover / body replacement
-
N900's HDMI-Out
-
Camera cover MOD
-
Measure battery's real capacity on-device
-
TrueCrypt 7.1 | ereswap | bnf
-
Hardware's mods research is costly. To support my work, please consider donating. Thank You!
 

gidzzz's Avatar
Posts: 282 | Thanked: 2,387 times | Joined on Sep 2011
#44
Originally Posted by Estel View Post
Hey gidzzz, I've just tried to download the latest Thumb version from the first post, but the link has expired. Could you re-upload it? Also, if you're interested, I can host the files on my FTP, so there's no need for pesky upload services. Just drop me a PM every time you have a new version to host, and you'll have it hosted in minutes.
You should be able to download the files now. If you provide me the links to your server, I'll add them to the first post.

Originally Posted by Estel View Post
What about that 7%-eating equalizer, is it gone in latest?
Nope, it's still there. I have realized that it is also used for some sound effects, so removing it without crippling the game a bit might save less than I expected before.
 

Estel's Avatar
Posts: 5,028 | Thanked: 8,613 times | Joined on Mar 2011
#45
Here they are:
http://lorienart.pl/homeworld-thumb472_20121202-1.zip for thumb and...
http://lorienart.pl/homeworld-arm421_20121202-1.zip regular one.

Why no repos, BTW? Anyway, thanks a lot for re-uploading, and for your work on this. I hope you haven't abandoned it, and I'm waiting with anticipation for the next releases.

/Estel

Last edited by Estel; 2013-03-21 at 21:32.
 

gidzzz's Avatar
Posts: 282 | Thanked: 2,387 times | Joined on Sep 2011
#46
Thank you, I've just updated the links.

Originally Posted by Estel View Post
Why no repos, BTW?
Installing is as easy as dropping the executable into the Homeworld folder, which can then be placed wherever you like -- I don't think it can get any simpler than that. But yeah, there's no icon in the menu and one has to browse the forums to discover the existence of this port.

I might prepare a package if I find the time, but before that happens I want to figure out the best way to allow placing the Homeworld folder anywhere in the filesystem (a symlink pointing to MyDocs/Homeworld by default?).

Originally Posted by Estel View Post
I hope you haven't abandoned it, and I'm waiting with anticipation for the next releases.
Me too. It's just that I usually work on whichever project I find the most useful at that moment. Unfortunately, N900's battery life often puts Homeworld at the end of my list.
 

Estel's Avatar
Posts: 5,028 | Thanked: 8,613 times | Joined on Mar 2011
#47
You should get a dual Polarcell with a modified Mugen cover - my newly assembled one is ~3400 mAh now, and the bigger cover makes the keyboard much more comfortable to use. If you're interested, I could make one (a Mugen cover modified to hold the lens cover, plus an assembled dual battery) for free, including postage, if you can cover the raw materials cost (batteries, vanilla cover) - as a way of showing my respect for your work on Maemo.

Sadly, my financial situation doesn't allow me to donate money at the moment, but I can donate my work in the way I'm capable of - it's the least I can do for such a friendly and talented member of the Community as you.
---

Originally Posted by gidzzz View Post
and one has to browse the forums to discover existence of this port.
Yeah, that's the main reason I mentioned it - many people discover things by browsing repositories.

Originally Posted by gidzzz View Post
before that happens I want to figure out the best way to allow placing the Homeworld folder anywhere in the filesystem (a symlink pointing to MyDocs/Homeworld by default?)
Yes, I'm doing exactly this by default: a /home/user/Games/Homeworld folder holds the homeworld binary plus symlinks to Homeworld's data files, while the real, fat data files live in /home/user/MyDocs/Homeworld. Then I've prepared a small .sh script that does "cd /home/user/Games/Homeworld ; homeworld"; that script sits in /usr/bin/ and is pointed to by a .desktop file, for convenience. Quite complicated, I know, but it's the way I set it up the first time, and it works for me, so I haven't had a reason to change it.
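Roughly, the setup described above could be scripted like this. It is a sketch under assumptions, not the exact script in use: link_data and write_launcher are made-up names, the wrapper starts the binary as ./homeworld from inside the game folder, and the data directory should be given as an absolute path so the symlinks resolve from anywhere.

```shell
#!/bin/sh
# Symlink every entry of the data directory into the game directory.
# link_data and write_launcher are hypothetical names for this sketch.
link_data() {
    ld_data=$1
    ld_game=$2
    mkdir -p "$ld_game"
    for ld_entry in "$ld_data"/*; do
        # Skip the literal pattern if the data directory is empty.
        [ -e "$ld_entry" ] || continue
        ln -sfn "$ld_entry" "$ld_game/$(basename "$ld_entry")"
    done
}

# Write a wrapper script that enters the game directory and starts the
# binary there, passing any arguments through.
write_launcher() {
    wl_game=$1
    wl_out=$2
    printf '#!/bin/sh\ncd "%s" || exit 1\nexec ./homeworld "$@"\n' \
        "$wl_game" > "$wl_out"
    chmod +x "$wl_out"
}

# Example use with the paths from this post (the wrapper name is made up,
# and writing to /usr/bin needs root):
#   link_data /home/user/MyDocs/Homeworld /home/user/Games/Homeworld
#   write_launcher /home/user/Games/Homeworld /usr/bin/homeworld-launch
```

A .desktop file would then only need to point at the generated wrapper.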

/Estel
 

Posts: 5 | Thanked: 0 times | Joined on Jul 2011
#48
First of all, big thanks for this port.
I'm having two problems, though.
First, when I launch the game it says I need libGLES_CM.so. Checking the app manager, I noticed that the PSX emulator has libGLES, so I installed it, which solved that.
Second, when I run the game, the most I can get at 900 MHz is 4-8 FPS. Maybe this libGLES is the wrong one? Is there any possible solution?
 
Estel's Avatar
Posts: 5,028 | Thanked: 8,613 times | Joined on Mar 2011
#49
gidzzz, according to freemangordon, GLES1 isn't supported in hardware by our GPU (only GLES2 is) - it is emulated completely on the CPU.

Could it mean that, after all, my first impression that we're running on CPU power alone was right? Would it explain why the kernel "eats" most of the CPU time when Homeworld is running (GLES1 emulation)?

If all of the above isn't plain BS, how doable would it be to make HomeworldSDL use GLES2, if at all? Sorry if this is a question that earns just a sad smile - all those GLES<whatever_number> things confuse the hell out of me.

/Estel
 
Posts: 3,074 | Thanked: 12,960 times | Joined on Mar 2010 @ Sofia,Bulgaria
#50
Originally Posted by Estel View Post
gidzzz, according to freemangordon, GLES1 isn't supported in hardware by our GPU (only GLES2 is) - it is emulated completely on the CPU.
Well, it *could* be using some GPU acceleration, but it is still just an emulation.

Could it mean that, after all, my first impression that we're running on CPU power alone was right? Would it explain why the kernel "eats" most of the CPU time when Homeworld is running (GLES1 emulation)?

If all of the above isn't plain BS, how doable would it be to make HomeworldSDL use GLES2, if at all? Sorry if this is a question that earns just a sad smile - all those GLES<whatever_number> things confuse the hell out of me.

/Estel
I guess that would mean a major rewrite: GLES1 uses a fixed-function pipeline, while GLES2 does not (everything is done with shaders, even drawing a simple triangle).

EDIT:

BTW, before rewriting anything, I guess it makes sense to profile Homeworld with oprofile: http://wiki.maemo.org/Documentation/...aemo5/oprofile
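A profiling run might look roughly like the session below. This is a sketch that assumes the legacy opcontrol/opreport tools behave on Maemo the way they do elsewhere; the wiki page above is the authority for the device. oprofile_session and the OPROFILE_RUN switch are made-up names: by default the function only prints the commands, and it executes them (as root) when OPROFILE_RUN=1.

```shell
#!/bin/sh
# Print, or with OPROFILE_RUN=1 execute, a basic oprofile session
# around one binary. oprofile_session and OPROFILE_RUN are hypothetical.
oprofile_session() {
    op_binary=$1
    _op_step() {
        if [ "${OPROFILE_RUN:-0}" = 1 ]; then "$@"; else echo "$*"; fi
    }
    _op_step opcontrol --init          # load the oprofile kernel support
    _op_step opcontrol --no-vmlinux    # no kernel image for symbol lookup
    _op_step opcontrol --start         # start the daemon and collection
    _op_step "$op_binary"              # play a test scene, then quit
    _op_step opcontrol --dump          # flush collected samples to disk
    _op_step opcontrol --shutdown      # stop collection and the daemon
    _op_step opreport --symbols "$op_binary"   # per-symbol breakdown
}

# Example use:
#   oprofile_session ./homeworld                  # just show the commands
#   OPROFILE_RUN=1 oprofile_session ./homeworld   # actually run them
```

With --no-vmlinux the kernel samples are not broken down per symbol, so if the kernel really takes most of the time, a run with a matching vmlinux image would be the next step.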

 


Tags
homeworld, homeworldsdl


 