Another “dreaded problem” occurred this week, and I’m still struggling with it. In contrast to Java’s simplified numeric types, C offers many more ways to represent floating-point numbers. This is particularly problematic when you have to map between the different formats.

For Java I have the following rules of thumb:

  • Int(eger) for numbers in ℤ.
  • Double for floating-point arithmetic (on the Dalvik VM I prefer float).
  • Avoid mixing int with double.
  • Map to Number for DB access.

(Honestly, I can’t remember reading something like for(char i = 0; i < 5; i++) in Java source code, and since no pointer arithmetic is necessary, long values rarely appear outside of hashes and DB operations.)

A Java double is represented by 64 bits. In YAOgl I’m using the following resolutions:

  • For OpenGL 32 Bit (24 Bit Mobile)
  • For 3D-Models 32 Bit
  • For Positioning, Timing and Animation 32/64 Bit

And notably my

  • CPU real-time optimized version of the Bullet physics engine uses 16-bit floating-point arithmetic.

This version uses a SIMD (multimedia) instruction set for the corresponding target platform (the machine code handles multiple calculations at once). Without this optimization a simulation would take too much computation time.

The 16-bit limitation also narrows the range in which a usable simulation can take place. Even worse, a bijective mapping between floating-point formats with different resolutions results in nonsense! (Objects start to jitter or slowly drift away.)

If you ever hear of a successful implementation, I would doubt the source. If you combine the different resolutions into a set R, then:

The set of numbers (R) doesn’t form a ring. Every operation rounds its result to the resolution of the target format, so addition and multiplication on R are neither associative nor distributive.
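
A minimal Java sketch of the two effects, using only float (32 bit) and double (64 bit) because Java has no 16-bit float type. The class and method names are made up for this illustration and are not part of YAOgl or Bullet.

```java
public class PrecisionDemo {

    // True if narrowing a 64-bit value to 32 bits and widening it again
    // returns exactly the value we started with.
    static boolean survivesRoundTrip(double x) {
        return (double) (float) x == x;
    }

    // True if both groupings of a 32-bit sum give the same result.
    static boolean sumIsAssociative(float a, float b, float c) {
        return (a + b) + c == a + (b + c);
    }

    public static void main(String[] args) {
        // 0.5 is exactly representable in both formats, 0.1 is not.
        System.out.println("0.5 survives round trip: " + survivesRoundTrip(0.5));
        System.out.println("0.1 survives round trip: " + survivesRoundTrip(0.1));
        // Near 1e8 neighbouring floats are 8 apart, so adding 1 gets rounded away.
        System.out.println("associative: " + sumIsAssociative(1e8f, -1e8f, 1f));
    }
}
```

Errors of this kind are tiny per operation, but a simulation repeats them every frame, which is one plausible source of the jitter and drift described above.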

As a result, it is necessary to align the important/active parts of a scene-graph to the usable simulation scale and range.

Since I have to handle very small and very large dimensions in B.I.S (the living room, the play-board, and the tiny shop on Irata), I’m currently working on a fast rescale function. This includes texture mapping and sound.

Update 12/12/13

It is now possible to rescale the scene-graph at runtime with: [YAWorld* setScale:(float) scale]. Models, textures, lights and the viewport are then updated for the next frame. The 3D audio is mapped according to the current camera position (or rather, your ears), so no changes are necessary there.

What else was done this week

CapsuleCollision

 

 

Here is a short video (sorry for the silly references but the opt-out function seems to be disabled):

 

 

To create a scene-graph that is distributable in realtime (for example, for multiplayer games) the following functionalities must be implemented:

  • A persistent model of the scene-graph.
    An ER or document-model can be stored and distributed by a server-process.
  • SG serialization
    To transport a scene-graph from a server to a client it must be de/serializable.
  • Online-Compression
    Besides standard methods to compress data for a fast transport, useful procedures exist to compress data-structures like sparse matrices.
  • Incremental updates
    In an OpenGL context, the complete scene-graph (or several octants) is redrawn for each frame. For SG synchronization, only the incremental changes must be transmitted to the clients.
  • Implement a fast and secure Internet Protocol.
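
The incremental-update item can be sketched roughly as follows. This is a simplified illustration in Java, not the YAOgl data model: nodes are reduced to an id and a position array, and the name SceneDelta is hypothetical.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

public class SceneDelta {

    // Returns only the nodes that are new or changed in `current`.
    // Nodes that disappeared are reported with a null value.
    static Map<String, float[]> diff(Map<String, float[]> previous,
                                     Map<String, float[]> current) {
        Map<String, float[]> delta = new HashMap<>();
        for (Map.Entry<String, float[]> e : current.entrySet()) {
            // Arrays.equals returns false if the node did not exist before.
            if (!Arrays.equals(previous.get(e.getKey()), e.getValue())) {
                delta.put(e.getKey(), e.getValue());
            }
        }
        for (String id : previous.keySet()) {
            if (!current.containsKey(id)) {
                delta.put(id, null);
            }
        }
        return delta;
    }

    public static void main(String[] args) {
        Map<String, float[]> before = new HashMap<>();
        before.put("lamp", new float[] {0f, 1f, 0f});
        before.put("table", new float[] {2f, 0f, 0f});

        Map<String, float[]> after = new HashMap<>();
        after.put("lamp", new float[] {0f, 1f, 0f});   // unchanged, not transmitted
        after.put("table", new float[] {2f, 0f, 1f});  // moved, transmitted

        System.out.println(diff(before, after).keySet());
    }
}
```

Only the delta would have to be serialized and sent to the clients; the full scene-graph is needed just once, when a client connects.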

The first two features are now available in YAOgl. With the latest release of OS X, many of the remaining functions are already provided by system libraries. The development of a dedicated server is postponed until YAOgl is wrapped for Java.

Visual-Programming/Configuration

Once again, I came in touch with current “visual build systems” for user-interfaces and data modeling. Apart from the well known problems, I observed the following:

  • Learning through imitation: Instead of gaining knowledge about the problem space and developing the skills to handle the process, you simply imitate working steps. You watch demonstration videos (or a teacher) and try to reproduce the steps needed to create a result. If something goes wrong, you don’t have the slightest idea what to do.
  • The learning process is time-consuming. You have to watch the whole video or presentation. You can’t scan or skip paragraphs and you can’t create bookmarks.

Tips: (Mis)Use your Version Control System

  • Use Git commits as an “Undo” function: One modification in the visual representation can have multiple effects on configuration files. The default undo function is not always reliable.
  • Use Git branches for alternative solutions. You can compare and switch between different dead ends.

 

Since importing a scene from external tools is too laborious, I added a debug mode to the internal scene-graph. In addition, optimizing the vector calculations yielded a performance improvement of about 6%, and reordering the OpenGL command sequence yielded another 4%. Calculating and rendering a scene now needs fewer resources than playing an Internet video.

Construction of the mechanical orbit:

Here I’m experimenting with the scene lighting:

You can now use interpolation to animate objects and bones. This means that you simply add a target position for a vector or quaternion at a specific point in time. The actual position or rotation is then calculated between the last and the next target. Additionally, all capabilities of the basic animation class are inherited.
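
The idea can be sketched in Java as follows. This is only an illustration with hypothetical names (KeyframeTrack, addTarget, positionAt), not the YAOgl animation API, and it uses plain linear interpolation for vectors; quaternions would normally need spherical interpolation instead.

```java
import java.util.Map;
import java.util.TreeMap;

public class KeyframeTrack {

    // Target positions, sorted by the point in time at which they must be reached.
    private final TreeMap<Double, double[]> targets = new TreeMap<>();

    void addTarget(double time, double... position) {
        targets.put(time, position);
    }

    // Linearly interpolates between the last target at or before `time`
    // and the next target at or after it.
    double[] positionAt(double time) {
        Map.Entry<Double, double[]> last = targets.floorEntry(time);
        Map.Entry<Double, double[]> next = targets.ceilingEntry(time);
        if (last == null && next == null) {
            throw new IllegalStateException("no targets defined");
        }
        if (last == null) return next.getValue().clone();   // before the first target
        if (next == null || last.getKey().equals(next.getKey())) {
            return last.getValue().clone();                  // after the last target, or exactly on one
        }
        double t = (time - last.getKey()) / (next.getKey() - last.getKey());
        double[] a = last.getValue();
        double[] b = next.getValue();
        double[] out = new double[a.length];
        for (int i = 0; i < a.length; i++) {
            out[i] = a[i] + t * (b[i] - a[i]);
        }
        return out;
    }

    public static void main(String[] args) {
        KeyframeTrack track = new KeyframeTrack();
        track.addTarget(0.0, 0, 0, 0);
        track.addTarget(2.0, 2, 4, 6);
        System.out.println(java.util.Arrays.toString(track.positionAt(1.0)));
    }
}
```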

Engine status

The framework now compiles with the latest LLVM version. The memory management was reimplemented with ARC. I’m currently working on improved methods for user interaction and object modeling.

Using Microsoft’s Kinect device for motion tracking

The Kinect device was originally developed for gesture recognition in games. Because it can record 3D body poses in real time, I checked its suitability for motion tracking and developed a small program in Scala for that purpose. My recorded data looks promising, but at the moment some work still has to be done in post-production to create a professional skeletal animation. Much better solutions are currently available, such as Brekel Kinect.

Using a Kinect is very straightforward with the latest OpenNI library. Here is a Scala example:

package de.yousry.motiontracker

import scala.actors.Actor
import org.OpenNI._
import de.yousry.motiontracker.observer._
import scala.collection.mutable.HashMap

object OpenNIHandler extends Actor {

  var quit = false;
  var listeners = List[OpenNIListener]()

  var context: Option[Context] = None
  var depthGen: Option[DepthGenerator] = None
  var userGenerator: Option[UserGenerator] = None
  var skeletonCap: Option[SkeletonCapability] = None

  var width = 0
  var height = 0

  val activeUsers: HashMap[Int, MetaFile] = new HashMap[Int, MetaFile]()

  init

  def act = loopWhile(!quit) {
    react {
      case ReceptorDimensions => reply { ReceptionResult(width, height) } 
      case ConnectOpenNI(listener) => listeners ::= listener
      case DisconnectOpenNI(listener) => listeners = listeners.filter(x => x != listener)
      case Quit => {
        println("Quit OpenNI handler")
        context.get.stopGeneratingAll
        context.get.release

        activeUsers.values.foreach(_.out.close)

        quit = true
      }
      case Loop => loop
    }
  }

  def init = {
    println("Initializing Openni")
    try {
      val SAMPLE_XML_FILE = "../../data/OpenNIConfig.xml"
      val scriptNode = new OutArg[ScriptNode]();
      context = Some(Context.createFromXmlFile(SAMPLE_XML_FILE, scriptNode))

      depthGen = Some(DepthGenerator.create(context.get))
      val depthMD = depthGen.get.getMetaData()
      width = depthMD.getFullXRes()
      height = depthMD.getFullYRes()

      userGenerator = Some(UserGenerator.create(context.get))
      skeletonCap = Some(userGenerator.get.getSkeletonCapability())
      val poseDetectionCap = userGenerator.get.getPoseDetectionCapability();

      val calibPose = skeletonCap.get.getSkeletonCalibrationPose()

      userGenerator.get.getNewUserEvent().addObserver(new NewUserObserver(skeletonCap.get, poseDetectionCap, calibPose));

      userGenerator.get.getLostUserEvent().addObserver(new LostUserObserver(activeUsers));
      skeletonCap.get.getCalibrationCompleteEvent().addObserver(new CalibrationCompleteObserver(calibPose, activeUsers, skeletonCap.get, poseDetectionCap))

      poseDetectionCap.getPoseDetectedEvent().addObserver(new PoseDetectedObserver(poseDetectionCap, skeletonCap.get));

      skeletonCap.get.setSkeletonProfile(SkeletonProfile.ALL);

      context.get.startGeneratingAll
    } catch {
      case ex: Exception => println("Could not read OpenNI Config:  " + ex.getCause()); sys.exit
    }

    OpenNIHandler ! Loop
  }

  def loop = {

    if (!listeners.isEmpty) {
      try {
        context.get.waitAnyUpdateAll

        val depthMD = depthGen.get.getMetaData()
        val sceneMD = userGenerator.get.getUserPixels(0)

        val oniUsers = userGenerator.get.getUsers.toList.par
          .filter(id => skeletonCap.get.isSkeletonTracking(id))
          .map(userId => new OniUser(userId, skeletonCap.get, depthGen.get, activeUsers.get(userId).get))

        listeners.par.foreach(_.updateNI(depthMD, sceneMD, oniUsers.toList))

      } catch {
        case ex: Exception => println("OpenNI Handler loop: " + ex.getMessage())
      }
    } else {
      Thread.sleep(100)
    }

    OpenNIHandler ! Loop
  }

}

trait OpenNIListener {
  def updateNI(depthMD: DepthMetaData, sceneMD: SceneMetaData, users: List[OniUser])
}

case class ConnectOpenNI(listener: OpenNIListener)
case class DisconnectOpenNI(listener: OpenNIListener)
case object Loop
case object Quit
case object ReceptorDimensions
case class ReceptionResult(width: Int, height: Int)
case class MetaFile(userId: Int, startTime: Long, out: java.io.FileWriter)

You can now optionally add skeletal data to your model. Bones (represented as matrices in model space) and joints (as axes with head, pitch and roll) are handled in the vertex and geometry units of the GPU.

NSString *modelFile = @"armatureProbe";
NSString *ingredientName = @"probe";
YAIngredient *ingredient = [world createIngredient:ingredientName];
[ingredient setModel:modelFile];
[world addShapeShifter:modelFile];

The shape-shifter template will then connect itself to every instance of the model, creating a copy of the default pose.
For example, to access your “shape-shifting” data as joints, use:

NSMutableDictionary* joints = [probeImp joints];
YAShaper* upperBone = [joints objectForKey:@"upperBone"];

Movements can be created (as usual) by animation objects or with a JSON stream (for example from Kinect or openSim):

    {
      "Name": "upperBone",
      "Id": 1,
      "Parent": 0,
      "Joint": [ 0.000000, -0.000000,  1.015570 ],
      "Quaternion": [ 0.000000,  0.000000,  0.000000,  1.000000],
      "Bone": [ 1.000000,  0.000000,  0.000000,  0.000000,
                0.000000,  1.000000,  0.000000,  0.000000,
                0.000000,  0.000000,  1.000000,  0.000000,
                0.000000,  0.000000, -0.000000,  1.000000
      ]
    }

Here is an unrealistic example from my unit tests where the pitch and roll of a joint are modified simultaneously.

YAOGL realtime shadows

The principle is simple. An additional camera is used to render the scene to a new frame buffer object. The task of this fbo is to calculate the depth information from an imaginary light source to all shadow casters. The depth information in the fbo can be used to decide if a fragment (pixel or part of a pixel) is shadowed or not.
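
The per-fragment decision can be sketched on the CPU like this. It is a Java illustration of the comparison only, not the actual YAOGL shader code; the small bias is a common precaution against self-shadowing that I’m assuming here, and all names are hypothetical.

```java
public class ShadowTest {

    // depthMap[y][x] holds the depth of the nearest shadow caster as seen
    // from the light source (the content of the depth fbo), in the range [0, 1].
    // fragmentDepth is the depth of the fragment in the same light space.
    static boolean isShadowed(float[][] depthMap, int x, int y,
                              float fragmentDepth, float bias) {
        // The fragment is shadowed if something closer to the light covers it.
        return fragmentDepth - bias > depthMap[y][x];
    }

    public static void main(String[] args) {
        float[][] depthMap = { { 0.4f } };   // a caster at depth 0.4 covers this texel
        float bias = 0.005f;
        System.out.println(isShadowed(depthMap, 0, 0, 0.7f, bias)); // behind the caster
        System.out.println(isShadowed(depthMap, 0, 0, 0.4f, bias)); // the caster itself
    }
}
```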

The complicated part is configuring the OGL 3 core state machine. Most documents about this subject are outdated or simply wrong. In addition, the hardware vendors interpret the OpenGL specification differently.

The payoff for the effort is blazingly fast 3D graphics running on your Mac, pad or phone.

There are several advantages to using a GPU for processor-intensive tasks.

  • For data-parallel workloads, the GPU is faster than your CPU.
  • The GPU isn’t interrupted by background tasks such as the Java VM  garbage collector.

AMD’s Aparapi library is an approach to running your Java code on a GPU. You can write your code in plain Java. Although there are several restrictions on your data types, it is much easier to use than “pure” OpenCL.

Here I present my attempt to move parts of my calculations for real-time audio processing to the GPU. My ongoing project “Sprachmodul” is written in Scala. I’m using the Simple Build Tool (SBT) for building.

 

Step 1: Tell SBT to use OpenCL and Aparapi

Find your project class and add the following line:

override def fork = forkRun("-Djava.library.path=/opt/aparapi-Linux-amd64:/opt/ati-stream-sdk-v2.3-lnx64" :: Nil) // ':' is the path separator on Linux

I installed AMD’s Stream SDK and Aparapi in /opt/..

Step 2: Create a Kernel

Aparapi doesn’t support Scala. Fortunately, it is possible to mix Java and Scala sources in your project. Simply create a Java class:

import com.amd.aparapi.Kernel;

public class PitchShifterKernel extends Kernel {

    @Override
    public void run() {
    }
}

Step 3: Define the part to run on the GPU

I had the original DFFT paper and the C code available. The difference between the C and Java syntax is minimal.

    @Override
    public void run() {
        for (i = 2; i < 2 * fftFrameSize - 2; i += 2) {
            for (bitm = 2, j = 0; bitm < 2 * fftFrameSize; bitm <<= 1) {
                if ((i & bitm) != 0)
                    j++;
                j <<= 1;
            }
...

Step 4: Replace all Java Math library calls with the corresponding GPU calls

For example:

Math.log(fftFrameSize) / Math.log(2.)

is rewritten to:

log(fftFrameSize) / log(2.)

The mathematical functions are part of the Kernel class. You could also write “this.log()”.

It is also worth noting that you cannot call the kernel directly from Scala. You have to wrap the call in Java:

pitchShifterKernel.execute(1);

The argument is the number of kernel instances to run. Inside the kernel, the “global id” can be used to identify the current instance.

 

The Result:

Here is a small “audio pitch” demonstration of my application.

 

Alternatives

All solutions for Java/GPU programming are in an early beta or alpha stage. Most of the results are disappointing.

ScalaCL:

When I tried to run the first demo application, ScalaCL crashed immediately. Without a detailed error message, further work is impossible:

#
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGSEGV (0xb) at pc=0x00007f8021806ff0, pid=3489, tid=140188284131072
#
# JRE version: 6.0_24-b07
# Java VM: Java HotSpot(TM) 64-Bit Server VM (19.1-b02 mixed mode linux-amd64 compressed oops)
# Problematic frame:
# V  [libjvm.so+0x779ff0]
#
# An error report file with more information is saved as:
# /home/yousry/hs_err_pid3489.log
#
# If you would like to submit a bug report, please visit:
#   http://java.sun.com/webapps/bugreport/crash.jsp
#
Abgebrochen

Jogl/JoCL

Trying to run a demo from the latest JogAmp developer builds produced neither output nor an error message:

Info: XInitThreads() called for concurrent Thread support

The application just hangs in an infinite loop.

An example of real-time audio analysis, shifting and tuning.

Many algorithms for audio analysis and manipulation are based on Fast Fourier Transforms (FFT). These transforms decompose an input signal into sinusoids which, in turn, can easily be evaluated and modified with trigonometric functions.

Audio applications typically consist of a simple main loop:
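
To make the decomposition concrete, here is a small Java sketch. It uses a naive O(n²) discrete Fourier transform rather than a real FFT (the result is the same, only slower), and the class and method names are invented for this example.

```java
public class DftDemo {

    // Magnitude of each frequency bin (0 .. n/2 - 1) of a real-valued signal.
    static double[] magnitudes(double[] signal) {
        int n = signal.length;
        double[] mag = new double[n / 2];
        for (int k = 0; k < n / 2; k++) {
            double re = 0, im = 0;
            for (int t = 0; t < n; t++) {
                double angle = 2 * Math.PI * k * t / n;
                re += signal[t] * Math.cos(angle);
                im -= signal[t] * Math.sin(angle);
            }
            mag[k] = Math.hypot(re, im);
        }
        return mag;
    }

    // Index of the bin with the largest magnitude.
    static int strongestBin(double[] signal) {
        double[] mag = magnitudes(signal);
        int best = 0;
        for (int k = 1; k < mag.length; k++) {
            if (mag[k] > mag[best]) best = k;
        }
        return best;
    }

    // A sine wave with the given number of full cycles in n samples.
    static double[] sine(int n, int cycles) {
        double[] s = new double[n];
        for (int t = 0; t < n; t++) {
            s[t] = Math.sin(2 * Math.PI * cycles * t / n);
        }
        return s;
    }

    public static void main(String[] args) {
        System.out.println("strongest bin: " + strongestBin(sine(64, 5)));
    }
}
```

A pitch shifter works on exactly this representation: it moves the energy from one bin to another and then transforms the result back into samples.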

loop (
    read source 
    analyze/modify data(FFT)
    write to destination (Speaker or File)
)

The advantage of this solution is a reduced risk of audio lagging or stuttering.
The disadvantages are the entanglement of the different tasks and the lack of concurrency.

As a consequence, the source code becomes difficult to understand and multi-core processors are not utilized.

With Scala’s Actors library, a simple way exists to disentangle these tasks and build concurrent programs. Instead of byte buffers, messages and mailboxes are used to transport and manipulate the audio data.

The loop is replaced by three actors:

  • The Receptor reads a chunk of data from the selected source (in this case a microphone or a sound file) and sends the normalized transformation to the Manipulator actor.
  • The Manipulator analyses and modifies the data.
  • The Vocal-Chord actor calculates samples with the selected frame rate and resolution from the received data.

An Actor has the following structure:

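import scala.actors.Actor

// AudioData and Quit are illustrative message types.
case class AudioData(samples: Array[Float])
case object Quit

class VocalChord extends Actor {
  private var quit = false

  def act = loopWhile(!quit) {
    react {
      case AudioData(samples) => playback(samples)
      case Quit => quit = true
    }
  }
  private def playback(samples: Array[Float]): Unit = () // placeholder: write samples to the audio device
}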

An Actor checks its mailbox for new messages, retrieves the audio data  and sends the result to the next Actor.
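
The same mailbox idea can be sketched without the Scala Actors library. The following Java version is my own simplified stand-in, not the code of the application: each stage is a thread, each mailbox is a BlockingQueue, and the “manipulation” is just a gain factor.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class AudioPipeline {

    // Sentinel message that tells a stage to shut down (compared by identity).
    private static final float[] QUIT = new float[0];

    // Sends all chunks through Receptor -> Manipulator -> Vocal-Chord and
    // returns what the last stage would have played.
    static List<float[]> run(List<float[]> chunks, float gain) {
        BlockingQueue<float[]> toManipulator = new LinkedBlockingQueue<>();
        BlockingQueue<float[]> toVocalChord = new LinkedBlockingQueue<>();
        List<float[]> played = new ArrayList<>();

        Thread manipulator = new Thread(() -> {
            try {
                for (float[] msg = toManipulator.take(); msg != QUIT; msg = toManipulator.take()) {
                    float[] out = new float[msg.length];
                    for (int i = 0; i < msg.length; i++) out[i] = msg[i] * gain;
                    toVocalChord.put(out);
                }
                toVocalChord.put(QUIT); // pass the shutdown request downstream
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        Thread vocalChord = new Thread(() -> {
            try {
                for (float[] msg = toVocalChord.take(); msg != QUIT; msg = toVocalChord.take()) {
                    played.add(msg); // a real implementation would write to the sound device
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        manipulator.start();
        vocalChord.start();
        try {
            // The calling thread plays the Receptor role.
            for (float[] chunk : chunks) toManipulator.put(chunk);
            toManipulator.put(QUIT);
            manipulator.join();
            vocalChord.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("pipeline interrupted", e);
        }
        return played;
    }

    public static void main(String[] args) {
        List<float[]> chunks = new ArrayList<>();
        chunks.add(new float[] {0.1f, 0.2f});
        chunks.add(new float[] {0.3f});
        System.out.println("chunks played: " + run(chunks, 2f).size());
    }
}
```

Because each queue preserves order and each stage has exactly one consumer, the chunks arrive at the last stage in the order in which they were read.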

The user interface is built with Swing and uses its own thread. SwingWorkers are used to publish interim results (messages) from the actors.

The result is a very responsive real-time application. In my experiments the CPU load did not rise above 40%.

Examples:

Sample from the Speech Accent Archive (released under Creative Commons)

Original: german4 (22050Hz 128kbps Mono)
Pitch Shift: german4PitchShift
Tuned: german4-Tune443HZ

Last week I was investigating an NP-hard problem (L := finding a root cause of an incident, SAT ≤p L) and created an instance with a search space of 2658455991569831745807614120560689152 (≈2.658×10³⁶) elements. The heuristic (randomization and local optimization) ran for several days and found 2 solutions. Of course every solution was a highlight, and I was looking for a way to be informed as soon as possible.
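
As a side note on the size: the number quoted above is exactly 2¹²¹, which is what you would get from 121 independent yes/no decisions (that reading of the instance is an assumption on my part, the arithmetic is not). A quick check in Java, with a hypothetical class name:

```java
import java.math.BigInteger;

public class SearchSpace {

    // Number of candidate assignments for the given number of binary decisions.
    static BigInteger size(int binaryDecisions) {
        return BigInteger.valueOf(2).pow(binaryDecisions);
    }

    public static void main(String[] args) {
        System.out.println(size(121));
    }
}
```

Exhaustive search is obviously out of the question at this size, which is why a randomized heuristic that runs for days is a reasonable approach.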

Like Growl for Mac and Snarl for Windows, Ubuntu is using NotifyOSD as notification system.

Fortunately, there is a Java binding available. You can get it with: apt-get install libjava-gnome-java

After adding the library gtk.jar to your classpath, you can use it from Scala:

def main(args : Array[String]) : Unit = {

  	import org.gnome.gtk._
  	import org.gnome.notify._

  	Gtk.init(args)
  	Notify.init("Schnacken") //  Schnacken(de) = babble(en)

  	val icon = new StatusIcon {setFromIconName("tomboy")}
  	val notification = new Notification("Call NotifyOSD", "from Scala!","tomboy", icon)

  	notification.connect(new Notification.Closed() {
  		override def onClosed( source: Notification) {
  			Notify.uninit
  			Gtk.mainQuit
  		}
  	})

  	notification.setUrgency(Urgency.NORMAL)
  	notification show
  }

As desired, a notification bubble is created: