Posts Tagged ‘SSE’

Useless Snippet #4: Basic trigonometry (sin/cos)

Thursday, December 15th, 2011

Goal: Calculate the sin/cos of an arbitrary angle.
Restrictions:

  1. Should be faster than CRT’s sin/cos and faster than FPU’s fsin/fcos
  2. The error should be kept to a minimum compared to the above functions
  3. Double precision is required
  4. Function should be in the form: double func(double), i.e. no xmm regs are passed to the function, the result is returned on the FPU stack, and only 1 angle is calculated at a time

For the rest of the post we’ll be talking about sin(). A version for cos(), or even sincos(), can be easily derived from the code below. The required changes will be described at the end of the post.
(more…)
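
To make the goal a bit more concrete, here is a minimal scalar sketch of the general idea (range reduction followed by a polynomial), written in plain C. It is not the SSE code from the full post: the name sketch_sin, the choice of Taylor coefficients and the asserted input range are all assumptions made for illustration, and nothing here is tuned for speed.

```c
#include <assert.h>

/* Hypothetical helper, not the post's implementation.
 * Keeps the double func(double) form from restriction 4. */
double sketch_sin(double x)
{
    const double pi = 3.14159265358979323846;

    /* Assumption: only moderate angles, so the cast below cannot overflow
     * and the single-constant range reduction stays reasonably accurate. */
    assert(x > -1.0e6 && x < 1.0e6);

    /* Range reduction: x = k*pi + r, with r in [-pi/2, pi/2]. */
    long k = (long)(x / pi + (x >= 0.0 ? 0.5 : -0.5));
    double r = x - (double)k * pi;
    double r2 = r * r;

    /* Taylor series of sin(r)/r in powers of r^2 (terms up to r^19),
     * evaluated with Horner's scheme. */
    double p = -1.0 / 121645100408832000.0;   /* -1/19! */
    p = p * r2 + 1.0 / 355687428096000.0;     /* +1/17! */
    p = p * r2 - 1.0 / 1307674368000.0;       /* -1/15! */
    p = p * r2 + 1.0 / 6227020800.0;          /* +1/13! */
    p = p * r2 - 1.0 / 39916800.0;            /* -1/11! */
    p = p * r2 + 1.0 / 362880.0;              /* +1/9!  */
    p = p * r2 - 1.0 / 5040.0;                /* -1/7!  */
    p = p * r2 + 1.0 / 120.0;                 /* +1/5!  */
    p = p * r2 - 1.0 / 6.0;                   /* -1/3!  */
    p = p * r2 + 1.0;
    double s = r * p;

    /* sin(k*pi + r) = (-1)^k * sin(r) */
    return (k % 2 != 0) ? -s : s;
}
```

A minimax polynomial would normally give a smaller error for the same number of terms; Taylor coefficients are used here only because they are easy to verify by hand.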

Useless Snippet #3: AABB/Frustum test

Thursday, August 4th, 2011

Goal: Classify each AABB in a batch as completely inside, completely outside or intersecting a frustum (6 planes).
Restrictions:

  1. AABBs are defined as (Center, Extent) pairs.
  2. All vectors are Vector3f’s

(more…)
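
As a point of reference for the goal above, this is a scalar sketch of the usual center/extent test against 6 planes. It is not the SSE version from the full post. The type names, the enum values and the plane convention (a point p is on the inner side when n·p + d >= 0) are assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>

typedef struct { float x, y, z; } Vector3f;
typedef struct { Vector3f center, extent; } AABB;   /* (Center, Extent) pair */
typedef struct { float nx, ny, nz, d; } Plane;      /* assumed: inside when n.p + d >= 0 */

enum { BOX_OUTSIDE = 0, BOX_INSIDE = 1, BOX_INTERSECTING = 2 };

static float absf(float v) { return v < 0.0f ? -v : v; }

/* Classify one box against the 6 frustum planes. */
int classify_aabb(const AABB *box, const Plane planes[6])
{
    int result = BOX_INSIDE;
    for (int i = 0; i < 6; ++i) {
        const Plane *p = &planes[i];
        /* Signed distance of the box center from the plane. */
        float dist = p->nx * box->center.x + p->ny * box->center.y
                   + p->nz * box->center.z + p->d;
        /* Projection of the box extent onto the plane normal. */
        float radius = absf(p->nx) * box->extent.x + absf(p->ny) * box->extent.y
                     + absf(p->nz) * box->extent.z;
        if (dist < -radius)
            return BOX_OUTSIDE;            /* fully behind one plane */
        if (dist < radius)
            result = BOX_INTERSECTING;     /* straddles this plane */
    }
    return result;
}

/* Classify a batch of boxes; out[i] receives one of the BOX_* values. */
void classify_batch(const AABB *boxes, size_t count,
                    const Plane planes[6], int *out)
{
    assert(boxes != NULL && planes != NULL && out != NULL);
    for (size_t i = 0; i < count; ++i)
        out[i] = classify_aabb(&boxes[i], planes);
}
```

Note that this per-plane test is conservative: a box near a frustum corner can be reported as intersecting even though it lies just outside.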

Useless Snippet #2: AABB from a point list

Wednesday, July 27th, 2011

Goal: Calculate the Axis Aligned Bounding Box of a point list.
Restrictions:

  1. Vertices are given as Vector3f’s
  2. (Optional) Vertex list should be 16-byte aligned

(more…)
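
For comparison with whatever the full post does with SSE, here is the plain scalar version of the goal above: a running per-component min/max over the list. The Bounds struct and the function name are hypothetical, and the optional 16-byte alignment restriction is irrelevant for this scalar sketch.

```c
#include <assert.h>
#include <stddef.h>

typedef struct { float x, y, z; } Vector3f;
typedef struct { Vector3f lo, hi; } Bounds;   /* hypothetical: min and max corners */

/* Compute the axis aligned bounding box of a non-empty point list. */
Bounds aabb_from_points(const Vector3f *pts, size_t count)
{
    assert(pts != NULL && count > 0);

    /* Start with a degenerate box around the first point. */
    Bounds b = { pts[0], pts[0] };

    for (size_t i = 1; i < count; ++i) {
        const Vector3f *p = &pts[i];
        if (p->x < b.lo.x) b.lo.x = p->x;
        if (p->y < b.lo.y) b.lo.y = p->y;
        if (p->z < b.lo.z) b.lo.z = p->z;
        if (p->x > b.hi.x) b.hi.x = p->x;
        if (p->y > b.hi.y) b.hi.y = p->y;
        if (p->z > b.hi.z) b.hi.z = p->z;
    }
    return b;
}
```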

Useless Snippet #1: Transform Vec3f by Matrix4x4f

Saturday, July 23rd, 2011

Goal: Multiply a batch of Vector3f’s by the same 4×4 matrix.
Restrictions:

  1. ‘src’ and ‘dst’ arrays shouldn’t point to the same memory location
  2. All pointers should be 16-byte aligned (see below for details on array sizes)
  3. Treat Vector3f’s as positions (w = 1.0)
  4. Matrix is column major

(more…)
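
The restrictions above can be sketched in scalar C as follows. This is a reference version only, not the SSE code from the full post. The function name and the flat float[16] matrix layout are assumptions, and the sketch also assumes the output is a plain Vector3f, so the fourth row of the matrix is ignored and no divide by w is performed.

```c
#include <assert.h>
#include <stddef.h>

typedef struct { float x, y, z; } Vector3f;

/* m is column major: element (row, col) is m[col * 4 + row].
 * Each source vector is treated as a position, i.e. (x, y, z, 1). */
void transform_vec3f_batch(Vector3f *restrict dst,
                           const Vector3f *restrict src,
                           size_t count,
                           const float m[16])
{
    /* 'src' and 'dst' must not point to the same memory location. */
    assert(dst != NULL && src != NULL && dst != src);

    for (size_t i = 0; i < count; ++i) {
        float x = src[i].x, y = src[i].y, z = src[i].z;
        /* The translation column (m[12..14]) is multiplied by w = 1. */
        dst[i].x = m[0] * x + m[4] * y + m[8]  * z + m[12];
        dst[i].y = m[1] * x + m[5] * y + m[9]  * z + m[13];
        dst[i].z = m[2] * x + m[6] * y + m[10] * z + m[14];
    }
}
```

The 16-byte alignment restriction does not matter for scalar code like this; it only becomes relevant once aligned SSE loads and stores are used.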