I don't know how familiar you guys are with SSE programming, but you can't use the __m128, __m128d and __m128i types like normal int or float types; you have to load an array of float, double, etc. into them first. Their usage looks something like this:
Code:
#include <iostream>
#include <xmmintrin.h>
//emmintrin.h for SSE2 and pmmintrin.h for SSE3
//__m128 can store 4 floats and __m128d can store 2 doubles
//__m128i and __m128d are available in SSE2, SSE3 and SSE4.x
//arrays used with _mm_load_ps/_mm_store_ps must be 16-byte aligned (address a multiple of 16)
float num1[4] __attribute__((aligned(16)))={100, 2, 4.45, 99.66};
float num2[4] __attribute__((aligned(16)))={0.11, 88.31, 7.88, 19.33};
//Now you must load them into the m128 types
__m128 a,b,c;
a=_mm_load_ps(num1);
b=_mm_load_ps(num2);
//Now you can perform operations on them
c=_mm_div_ps(a,b);
//At last you must store them into a normal array
//(16-byte aligned too, because _mm_store_ps requires it)
float out[4] __attribute__((aligned(16)));
_mm_store_ps(out, c);
for(int i=0; i<4; i++)
    std::cout<<out[i]<<std::endl;
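If your arrays are longer than 4 elements or you can't guarantee 16-byte alignment, the same load / operate / store pattern still works with the unaligned intrinsics. Here is a rough sketch of that idea; divide_arrays is just a name I made up for illustration, and it assumes an x86 compiler with SSE enabled:

```cpp
#include <cstddef>
#include <xmmintrin.h>  // SSE: __m128, _mm_loadu_ps, _mm_div_ps, _mm_storeu_ps

// Hypothetical helper (not part of any library): out[i] = a[i] / b[i].
void divide_arrays(const float* a, const float* b, float* out, std::size_t n)
{
    std::size_t i = 0;
    // Four floats per iteration; the "u" variants accept any alignment.
    for (; i + 4 <= n; i += 4) {
        __m128 va = _mm_loadu_ps(a + i);
        __m128 vb = _mm_loadu_ps(b + i);
        _mm_storeu_ps(out + i, _mm_div_ps(va, vb));
    }
    // Scalar loop for the remaining 0 to 3 elements.
    for (; i < n; ++i)
        out[i] = a[i] / b[i];
}
```

You would call it like divide_arrays(num1, num2, out, 4). The unaligned versions can be slower than _mm_load_ps/_mm_store_ps on older CPUs, so keep the aligned ones where you control the alignment.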
BTW, the new releases will be available in the next 2 weeks or a bit later, after some redesign and code profiling. (Don't worry, I'll do my best to release them as soon as possible.)