Parallel isdigit( )
This is a C/C++ topic, dear reader, so if that's not your thing, you may want to ditch now. (I really need to move to Movable Type or TypePad, and/or Blogger needs to get category support.)
I've been playing around with some parsing in my free time. I'm not sure what's going on, but I've found Microsoft's string-to-float conversions (atof, strtod, fscanf, etc) abysmally slow--at least on my system with my version of their C++ compiler.
In the process of investigation, the thought crossed my mind of creating my own implementation of isdigit( ). I decided to do a little more work than isdigit( ) with a function I titled CharToDec( ). It returns the values 0-9 for ASCII chars '0'-'9' and 0xFF otherwise.
The numeric value for ASCII '0' is 0x30, '1' is 0x31, '2' is 0x32 and so on.
Here's my approach...
1. Check (char AND 0xF0) for a match with 0x30. If no match, return 0xFF.
2. AND char with 0x0F. At this point, the possible values are 0-15.
3. Check ((char+6) AND 0xF0) to see if the value is greater than 9: adding 6 pushes 10-15 past 15 and into the high nibble. If so, return 0xFF.
4. Return char, which now holds the digit's value, 0-9.
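int CharToDec(int ch)
{
// Step 1: the high nibble must be 3 (0x30-0x3F).
if ((ch & 0xf0) != 0x30) return 0xff;
// Step 2: keep the low nibble (0-15).
ch &= 0x0f;
// Step 3: 10-15 plus 6 carries into the high nibble.
if ((ch + 6) & 0xf0) return 0xff;
// Step 4: ch is now 0-9.
return ch;
}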
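A quick way to convince yourself the four steps are right is to compare them against a plain range check for every possible byte value. Here's a minimal sketch of that, assuming ASCII; CharToDecSimple( ) and CharToDecAgrees( ) are names I made up for illustration.

```c
#include <assert.h>

/* The four steps from the list above. */
static int CharToDec(int ch)
{
    if ((ch & 0xf0) != 0x30) return 0xff;  /* step 1: high nibble must be 3 */
    ch &= 0x0f;                            /* step 2: keep the low nibble, 0-15 */
    if ((ch + 6) & 0xf0) return 0xff;      /* step 3: 10-15 carry into the high nibble */
    return ch;                             /* step 4: 0-9 */
}

/* Hypothetical reference: range check and subtract (assumes ASCII). */
static int CharToDecSimple(int ch)
{
    return (ch >= '0' && ch <= '9') ? ch - '0' : 0xff;
}

/* Returns 1 if both functions agree on every byte value 0..255, else 0. */
static int CharToDecAgrees(void)
{
    int ch;
    for (ch = 0; ch < 256; ++ch)
        if (CharToDec(ch) != CharToDecSimple(ch)) return 0;
    return 1;
}
```

The interesting inputs are the neighbors of the digit range: '/' (0x2F) fails step 1, while ':' (0x3A) passes step 1 and is caught by step 3.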
That should seem like a roundabout way of doing it, since all one really needs to do is check if the value is in the range 0x30 - 0x39 and subtract 0x30. The thing is, this one can be easily parallelized to do multiple chars at once. For example, here's a version that can do 8 chars at a time and should work nicely on 64-bit systems. It is all-or-nothing: if any of the 8 chars is not a digit, every byte of the result is 0xFF. A 32-bit version should be straightforward. I'm not sure if I'll find much use for it, but it could come in handy parsing strictly formatted text files.
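unsigned __int64 CharToDec8X(const char* sz)
{
// Load 8 chars at once; sz must point to at least 8 readable bytes.
unsigned __int64 u = *((const unsigned __int64 *) sz);
// Every high nibble must be 3.
if ((u & 0xf0f0f0f0f0f0f0f0ui64) != 0x3030303030303030ui64) return 0xffffffffffffffffui64;
// Keep the low nibbles.
u &= 0x0f0f0f0f0f0f0f0fui64;
// Any byte above 9 carries into its own high nibble.
if ((u + 0x0606060606060606ui64) & 0xf0f0f0f0f0f0f0f0ui64) return 0xffffffffffffffffui64;
return u;
}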
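For compilers other than Microsoft's, the same idea can be sketched with the fixed-width types from stdint.h, along with the 32-bit version mentioned above. This is only a sketch under a few assumptions: the names CharToDec8XPortable( ) and CharToDec4X( ) are mine, and I've swapped the pointer cast for memcpy( ) so the load doesn't depend on sz being aligned.

```c
#include <stdint.h>
#include <string.h>

/* 8 chars at a time; returns all 0xFF bytes if any char is not '0'-'9'.
   sz must point to at least 8 readable bytes. */
static uint64_t CharToDec8XPortable(const char *sz)
{
    uint64_t u;
    memcpy(&u, sz, sizeof u);  /* alignment-safe load instead of a pointer cast */
    if ((u & UINT64_C(0xf0f0f0f0f0f0f0f0)) != UINT64_C(0x3030303030303030))
        return UINT64_MAX;     /* some high nibble is not 3 */
    u &= UINT64_C(0x0f0f0f0f0f0f0f0f);
    if ((u + UINT64_C(0x0606060606060606)) & UINT64_C(0xf0f0f0f0f0f0f0f0))
        return UINT64_MAX;     /* some low nibble is 10-15 */
    return u;
}

/* Same steps for 4 chars at a time; sz must point to at least 4 readable bytes. */
static uint32_t CharToDec4X(const char *sz)
{
    uint32_t u;
    memcpy(&u, sz, sizeof u);
    if ((u & UINT32_C(0xf0f0f0f0)) != UINT32_C(0x30303030)) return UINT32_MAX;
    u &= UINT32_C(0x0f0f0f0f);
    if ((u + UINT32_C(0x06060606)) & UINT32_C(0xf0f0f0f0)) return UINT32_MAX;
    return u;
}
```

Since each byte is at most 15 after the mask, adding 6 can never carry into the neighboring byte, which is what makes the trick safe to run on 4 or 8 chars at once. The result holds one digit per byte in the same memory order as the input, so on a little-endian machine the first char ends up in the least significant byte.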
Standard disclaimers apply. No guarantees this code is good for anything. Also, I don't normally format my code as listed, but I am having trouble getting Blogger to indent properly.