Wednesday, June 22, 2011

specifying alignment with arm neon instructions and the iOS toolchain

On ARM Cortex CPUs with the SIMD "neon" extension, you can do a bunch of complex loads and stores in a single instruction, like so:

vld4.u8 {d0,d1,d2,d3}, [r0@128]

where "@128" means 128-bit aligned. But the iPhone (iOS) SDK uses gas, the GNU assembler. gas treats '@' as a comment character, so the gas syntax uses a colon instead, as in [r0:128]. Or at least it should. The version of gas distributed in the latest 4.3 SDK is broken, so you can't specify an alignment at all. You'll get an error like:

']' expected -- `vld4.u8 {d0,d1,d2,d3},[r3:64]

This is bad because aligned transfers are faster.

The ARM assembler is located in '/Developer/Platforms/iPhoneOS.platform/Developer/usr/libexec/gcc/darwin/arm'. This is what needs to be fixed.
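
To make the syntax difference concrete, here is a minimal Python sketch that rewrites the ARM-style "@" alignment qualifier into the colon form that gas is supposed to accept. The function name to_gas_alignment and the regular expression are my own illustration, not part of any toolchain. It only shows the syntax; it does not work around the broken assembler in the 4.3 SDK.

```python
import re

# Matches an address operand with an ARM-syntax alignment qualifier,
# e.g. [r0@128]. Only plain rN and sp base registers are handled here.
_ALIGN_RE = re.compile(r"\[\s*(r\d+|sp)\s*@\s*(\d+)\s*\]")


def to_gas_alignment(line: str) -> str:
    """Rewrite [rN@bits] as [rN:bits]; other text is left untouched."""
    return _ALIGN_RE.sub(r"[\1:\2]", line)


if __name__ == "__main__":
    print(to_gas_alignment("vld4.u8 {d0,d1,d2,d3}, [r0@128]"))
    # prints: vld4.u8 {d0,d1,d2,d3}, [r0:128]
```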

Wednesday, June 8, 2011

ld: bad codegen, pointer diff in boost::detail::sp_counted_base::sp_counted_base

I'm trying to build an iOS project with XCode 4, made up of:

- boost built as a framework
- a static library
- the iOS app

Linking fails with the error above. I get the same behavior no matter which compiler I select (gcc, llvm-gcc, clang), which makes some sense since it's a linker error.

A little (OK, a lot of) googling turns up plenty of suggestions to play with the compiler's symbol visibility flags. Some people say to enable them, others to disable them. The short answer is yes, this is how I fixed things. But just messing with symbol visibility without understanding the problem rubs me the wrong way, especially when the oh-so-useful but oh-so-frightening boost is involved.

Thursday, April 21, 2011

XCode 4 and Qt.. fun with build rules

Qt doesn't officially "support" XCode 4 yet, but that doesn't mean it's impossible. Just annoying.

the goal

You can install Qt as a framework from the packages available from Trolltech/Nokia and build code that uses Qt. The problem is that slots and signals (and features like properties) won't work with your classes. That makes it difficult to get callbacks from buttons and other widgets.

Designer is a useful tool for building UIs interactively. It outputs .ui files, which we need to turn into code.

So the goal is to make those two things work in a relatively painless fashion.

a little background

If you aren't familiar with Qt tools like moc and uic (I wasn't before doing this), a short introduction will help.

Qt adds "extensions" like slots and signals to C++. To make them work, the Qt framework needs information at runtime about classes, like what slots they have. The information is stored in QMetaObject objects, which seem to be lists of strings that store, among other things, slot names. The information is generated by the Meta Object Compiler, or moc. When you include the Q_OBJECT macro in a class definition, you run moc over the file to generate a little extra C++ code, usually written to moc_baseFile.cpp. This file is then compiled and linked into your program as normal. Simple.

Designer outputs .ui files, which can be compiled by the User Interface Compiler, or uic. uic takes a .ui file, which is just XML, and outputs some C++ code, usually in ui_baseFile.h. You then #include this in your code and you're good to go.
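
The naming conventions for the two tools can be sketched in a few lines of Python. This is only an illustration: the helper names generated_name and tool_command are hypothetical, and the sketch assumes moc and uic are on the PATH and are invoked with their -o option to name the output file.

```python
from pathlib import Path


def generated_name(source: str) -> str:
    """Return the conventional output file name for a moc or uic input."""
    p = Path(source)
    if p.suffix == ".ui":
        return f"ui_{p.stem}.h"      # uic: myForm.ui -> ui_myForm.h
    if p.suffix == ".h":
        return f"moc_{p.stem}.cpp"   # moc: myClass.h -> moc_myClass.cpp
    raise ValueError(f"don't know how to process {source!r}")


def tool_command(source: str, out_dir: str) -> list:
    """Build the argument list for running moc or uic on one file."""
    tool = "uic" if Path(source).suffix == ".ui" else "moc"
    output = Path(out_dir) / generated_name(source)
    return [tool, source, "-o", str(output)]
```

The resulting list could be handed to subprocess.run from a script, which is roughly the job the build rules below are meant to do inside XCode.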

making it work

I used the build rules feature of XCode to define two rules, one for my source files that need to be moc'ed (the ones that use Q_OBJECT), and one for ui files that need to be uic'ed. This is what I'd like to do:

First, note the '*' in front of the filename to catch the rest of the path. Next, note that the name of the generated file is designed to *not* be something that will match the pattern for this build rule. If it is.. bye-bye, XCode!

In theory, XCode will generate moc_myClass.cpp and compile/link it. Yay! Of course the rule shown above will only work for myClass.h. Creating a rule for each header that uses Q_OBJECT would be annoying, so we use a few build settings/variables:

I'm assuming that you can list multiple expressions for file names to process; I only have one right now and haven't tested it. So this almost works. Except that it doesn't. The rule is never executed.

I'm guessing that headers are not considered to be "source files" and so are not considered for rule processing. So I hacked it. I created soft filesystem links to the headers in question with names like _build_rule_moc_myClass_h.cpp and added them to the project. Then I changed the '.h' to '.cpp' in the build rule. And all of a sudden it works.

on to ui files

ui files are basically the same story. Here is what I'd like to have:

Again, the rule is not executed until I change the name to something.cpp. So I do the same trick with a filesystem link, and it works.

in summary

The setup described here will automatically run moc and uic as necessary. Adding a new file to moc or uic involves creating a soft link to the file with extension 'cpp', adding both files to the project, and in the case of moc, adding the filename to the build rule.
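
The link-creation step is easy to script. Here is a minimal Python sketch, assuming the naming scheme from above (_build_rule_moc_myClass_h.cpp). The helper names link_name and make_rule_link are hypothetical, and the "uic" variant of the name is my assumption, since I only showed the moc one.

```python
from pathlib import Path


def link_name(source: str, tool: str) -> str:
    """Name of the .cpp soft link that makes the build rule fire.

    myClass.h with tool "moc" -> _build_rule_moc_myClass_h.cpp
    """
    p = Path(source)
    return f"_build_rule_{tool}_{p.stem}_{p.suffix.lstrip('.')}.cpp"


def make_rule_link(source: str, tool: str, link_dir: str) -> Path:
    """Create (or replace) the soft link in link_dir and return its path."""
    target = Path(source).resolve()
    link = Path(link_dir) / link_name(source, tool)
    if link.is_symlink():
        link.unlink()            # replace a stale link from an earlier run
    link.symlink_to(target)
    return link
```

The link still has to be added to the project by hand, and for moc the filename still has to be added to the build rule.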

Note that I'm not passing predefined macros or include paths to the moc tool. You may need to do this.

And one final note: I had to manually add paths to the Qt headers to the header search paths build setting. I shouldn't need to; I suspect something is broken/incompatible in the Qt framework package.

notes

I can't find any real documentation on the build system. Nothing turns up in the new XCode help system, and links to info on Apple's site are broken. I also can't find an official description anywhere of variables like INPUT_FILE_PATH.

No clue why rules for files with names that end in .h or .ui don't work. If you do, let me know!

a few useful links

Xcode Build System Guide: Introduction
Xcode Build Setting Reference: Introduction
Cocoa with Love: Custom build rules, generated tables and faster floating point

Qt 4.7: Using the Meta-Object Compiler (moc)
Qt 4.7: User Interface Compiler (uic)
Qt 4.7: QMetaObject Class Reference
Qt 4.7: Using a Designer UI File in Your Application

Step into Xcode: Sample Chapter


Sunday, April 17, 2011

CUDA - if I could do it all over again.

I've been doing some CUDA work for a client for almost a year now. An image processing library. This was my first experience with GPGPU type stuff, not counting a few miscellaneous shaders over the years. I figured I would jot down some notes that I wish I had when I started.

First, I wouldn't do CUDA again. A year ago I chose it over OpenCL because OpenCL was even less mature and less widely used than it is now. So why have I changed my mind? Mostly because I'm mad at CUDA. I don't actually know that OpenCL is better; I'm just fed up with CUDA nonsense. But anyway, here's what I've learned..

- use the driver API rather than the runtime API. Visual Studio doesn't integrate well with .cu files: the debugger barely works and there is no automatic dependency tracking. There were lots of silly restrictions when I started (no 16-bit floats, all file-scope variables static) which have since been removed. The driver API seems to be less of a moving target.

- ask questions on the forums. The documentation is poorly written and maintained, and some important information is only available in the forums.

- use the "CUDA Templates" project on SourceForge, or something similar. The CUDA API is inconsistent and just plain poorly designed.

- if running on Windows Vista/7, be aware that the docs are lying to you. Kernel launches are said to be asynchronous, so in theory you should be able to start a kernel and do something with the CPU in the meantime. Not so simple on Vista/7. Commands sent to the GPU are batched by a queue inside the CUDA drivers. You need to jump through hoops to force a flush before the kernel will actually start. Search the forums, as that's the only place this hidden "feature" is documented.