Posts: 37 | Thanked: 68 times | Joined on Jun 2015 @ Munich, Germany
#76
Originally Posted by liar
I've been able to turn on more optimization flags (-flto) for some of the mupen64plus modules on the Jolla phone, I will release this sometime this weekend
maybe this yields (slight) speed improvements
By just adding -flto you'll gain nothing. LTO should go through the linker plugin, and that needs a linker with plugin support: gold instead of bfd, or GNU ld (bfd) 2.21 or newer. I have no idea how to switch linkers on SailfishOS (here's a how-to for Gentoo: https://wiki.gentoo.org/wiki/Gold ). To actually enable the plugin, also add -fuse-linker-plugin. Last but not least, you need to tell both cc and the linker to use LTO, so your flags should look like this:
Code:
CFLAGS="-flto -fuse-linker-plugin"
CXXFLAGS="-flto -fuse-linker-plugin"
LDFLAGS="-Wl,-O1 -Wl,--as-needed -flto -fuse-linker-plugin"
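To find out which linker you are dealing with before trying the flags above, something like the following might help. It is only a sketch: the helper name classify_linker is made up for illustration, and it assumes that gold introduces itself as "GNU gold" and bfd as "GNU ld" in the first line of the --version output.

```shell
#!/bin/sh
# Classify a linker from the first line of its --version output.
# Assumption: gold prints "GNU gold ...", bfd prints "GNU ld ...".
classify_linker() {
    case "$1" in
        "GNU gold"*) echo gold ;;
        "GNU ld"*)   echo bfd ;;
        *)           echo unknown ;;
    esac
}

# Report what the plain "ld" command resolves to, if there is one in PATH.
if command -v ld >/dev/null 2>&1; then
    echo "ld is: $(classify_linker "$(ld --version 2>/dev/null | head -n 1)")"
fi
```

Note that the linker GCC really calls may differ from the ld found in PATH, so treat the result as a hint only.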
Now there's more to optimize than just LTO... This is how I compile almost everything on my desktop machine:
Code:
CFLAGS="-march=native -O2 -pipe -fno-ident -flto=3 -fuse-linker-plugin -ggdb"
CXXFLAGS="${CFLAGS}"
LDFLAGS="-Wl,-O1 -Wl,--as-needed ${CFLAGS}"
What do the flags mean?
-march=native = Compile only for the processor you are building on. The binary may not run on other CPUs.
-O2 = Level 2 optimization, the most sane optimization level.
-pipe = Use pipes instead of temporary files while compiling. It's a bit faster but needs more RAM (only while compiling, it doesn't have any effect on the binaries).
-fno-ident = Don't save ident information in the binary. It is never used anyway, so this saves a bit of space.
-flto=3 = Use LTO with at most 3 parallel jobs. 3 seems to be good for my desktop and should also be good for the Jolla phone, as the rule of thumb is CPU cores + 1.
-ggdb = Add gdb debugging information. YOU DON'T WANT THIS!

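-Wl,-O1 -Wl,--as-needed = more or less default linking flags. As far as I know, -O1 tells the linker to optimize its output and --as-needed makes it record only those libraries as dependencies that are really used.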
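The "CPU cores + 1" rule of thumb for -flto=N can be computed instead of hardcoded. A minimal sketch, assuming getconf _NPROCESSORS_ONLN is available on the build machine (it falls back to 1 core otherwise); the helper name lto_jobs is made up for illustration:

```shell
#!/bin/sh
# Rule of thumb from the post: parallel LTO jobs = CPU cores + 1.
lto_jobs() {
    echo $(( $1 + 1 ))
}

# Core count of the build machine; fall back to 1 if it cannot be determined.
cores=$(getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)
case "$cores" in
    ''|*[!0-9]*) cores=1 ;;
esac

# Same flags as above, but with the job count filled in automatically.
CFLAGS="-O2 -pipe -flto=$(lto_jobs "$cores") -fuse-linker-plugin"
echo "$CFLAGS"
```

This only matters for build time, the resulting binary should not depend on the job count.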

Now if you want to optimize even more try this:
Code:
CFLAGS="-march=native -O3 -pipe -fno-ident -flto=3 -fuse-linker-plugin -funroll-loops -ftree-vectorize -ffast-math"
CXXFLAGS="${CFLAGS}"
LDFLAGS="-Wl,-O1 -Wl,--as-needed ${CFLAGS}"
Flag explanation:
-O3 = Optimization level 3 (this is the maximum!). Might create bigger binaries. Might create unstable binaries. The binary might even be slower than with -O2, so benchmark, benchmark, benchmark!
-funroll-loops = Unroll loops whose number of iterations is already known at compile time. From the GCC devs: This option makes code larger, and may or may not make it run faster. ... So again: Benchmark, benchmark, benchmark!
-ftree-vectorize = Vectorize loops, i.e. use SIMD instructions where possible (-O3 already turns this on). I'm not sure if it does anything on ARM at all. Anyway, it should help with the next:
-ffast-math = Faster but more unsafe math operations. Use with caution!
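For the "benchmark, benchmark, benchmark" part, a crude timing helper could look like this. It is a sketch only: time_runs is a made-up name, the resolution is whole seconds, and ./bench-O2 and ./bench-O3 are hypothetical stand-ins for the same program built with the two flag sets.

```shell
#!/bin/sh
# Run a command N times and print the elapsed wall-clock time in whole seconds.
# Usage: time_runs N command [args...]
time_runs() {
    runs=$1
    shift
    start=$(date +%s)
    i=0
    while [ "$i" -lt "$runs" ]; do
        "$@" >/dev/null 2>&1
        i=$((i + 1))
    done
    end=$(date +%s)
    echo $((end - start))
}

# Hypothetical usage, the binaries below do not exist by default:
# echo "-O2: $(time_runs 10 ./bench-O2)s"
# echo "-O3: $(time_runs 10 ./bench-O3)s"
```

Pick a run count high enough that the totals differ by more than a second or two, otherwise the numbers are just noise.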

More information about flags: https://gcc.gnu.org/onlinedocs/gcc-4...e-Options.html / https://wiki.gentoo.org/wiki/GCC_optimization

Last but not least, don't forget to pass these flags to all build stages (at least to ./configure, make and make install). Ideally, export them before doing anything.
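Exporting the flags once could look like the sketch below. The flag values are the desktop set from this post without -ggdb; the build_all function is a made-up wrapper and is deliberately not called, since the real build steps depend on the project.

```shell
#!/bin/sh
# Flags from the post (without -ggdb); adjust to taste.
CFLAGS="-march=native -O2 -pipe -fno-ident -flto=3 -fuse-linker-plugin"
CXXFLAGS="${CFLAGS}"
LDFLAGS="-Wl,-O1 -Wl,--as-needed ${CFLAGS}"

# Export once so every later stage inherits the same flags.
export CFLAGS CXXFLAGS LDFLAGS

# The three stages named in the post; defined here but not run automatically.
build_all() {
    ./configure && make && make install
}
```

Because the variables are exported, any child process started from this shell sees them, which is the point of doing it before the first build step.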

Hope that helps.

//EDIT: On a side note you might want to ship this within the rpm:
Code:
<?xml version="1.0" encoding="UTF-8"?>
<mime-info xmlns="http://www.freedesktop.org/standards/shared-mime-info">
  <mime-type type="application/x-n64-rom">
    <comment>Nintendo 64 ROM</comment>
    <magic priority="50">
      <match type="string" value="\x37\x80\x40\x12" offset="0" />
      <match type="string" value="\x80\x37\x12\x40" offset="0" />
    </magic>
  </mime-type>
</mime-info>
Just place it in /usr/share/mime/packages/ and run
Code:
update-mime-database /usr/share/mime
in your %post scriptlet.

The magic numbers were taken from ROMs I had available. None of them can be opened with xdg-open without this file. Anyhow, the file extensions of my ROMs are f*cked up, so while I'm pretty sure the first magic match is the v64 format, I don't know what the second value is. That's one of the reasons why I'm using magic numbers instead of file extensions; the other is that there are both n64 and N64, and they are different Nintendo 64 ROM formats: https://en.wikipedia.org/wiki/List_o...M_file_formats
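To check what a given ROM really is regardless of its extension, the first four bytes can be inspected directly. This is a sketch under an assumption: it uses the commonly documented header values 80 37 12 40 (z64, big-endian), 37 80 40 12 (v64, byte-swapped) and 40 12 37 80 (n64, little-endian). If those are right, the second magic value in the XML above would be z64. The function name n64_format is made up for illustration.

```shell
#!/bin/sh
# Print the byte order of an N64 ROM based on its first four bytes.
# Assumption: header values as commonly documented for z64 / v64 / n64 dumps.
n64_format() {
    # od prints the bytes as hex; tr strips the whitespace between them.
    magic=$(od -An -tx1 -N4 "$1" | tr -d ' \t\n')
    case "$magic" in
        80371240) echo z64 ;;
        37804012) echo v64 ;;
        40123780) echo n64 ;;
        *)        echo unknown ;;
    esac
}

# Hypothetical usage: n64_format /path/to/some.rom
```

A ROM that reports n64 here would not be matched by the two magic entries above, so a third match line might be needed for that layout.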

//EDIT²: Also a small bug: While the app UI is rotated sideways, it doesn't tell the phone about it, so swipe from side and swipe from top are swapped.

//EDIT³: Tested some games:
Banjo-Kazooie: Slideshow
Iggy's Reckin' Balls: Crash (well, in fact mupen64plus is still eating a bit of CPU but doesn't do anything noticeable)
Zelda Ocarina of Time: Massive Z-fighting and lighting issues
Legend of Zelda 2 Majora's Mask: Same as Zelda Ocarina of Time + bad performance in the menu (in-game it gets better) + letterboxing issues (wrong color, wrong position)
Yoshi's Story: Very bad button layout (the thumb covers the character, which is not good)
Pilotwings 64: Massive rendering issues

General issues:
- Sometimes the image gets rotated by 90°
- Sometimes the image renders small and stretched
- Sometimes both happen at the same time

//EDIT⁴: Now that I think about it: For the Zelda games the wrong letterboxing color could be connected to the lighting issues, and the wrong letterboxing position to the z-fighting. It's just a shot in the dark but maybe worth investigating?

Last edited by V10lator; 2015-07-01 at 01:44.
 
