Home All Groups Group Topic Archive Search About

compress specific data

Author
21 Dec 2004 2:46 PM
Romain TAILLANDIER

Hi group

My customer want me i crypt a 16 alphanumeric char in 8 byte of memory.
the format is :
AAA AAA AAA L NNN NNN
where the A are      A-Z or 0-9
                L are      A-Z
                N are     0-9

So i caculate the number of possibilities :
36^9 * 26 * 10^6  = 2,640,558,873,378,816,000,000
                               < 2^64

So i think it is possible.
but i can't lists all the possibilities. So i create some 'ASCII' specific
code :
the AlphaNum (36 possibilities) coded on 6 bits
       Letters    (26 possibilities) on 5 bits
        Numbers (10 possibilities) on 4 bits
and finally i get 10 bytes ....

what ever i can do i can't get down under 9 bytes.
the only possibility is to list all the 2,640,. ..... ... possibility but it
is not realist.

any ideas ?
calculate an hashcode on the 16 alphanum code woud be save me ?

ROM
Author
21 Dec 2004 4:07 PM
Jon Skeet [C# MVP]
Romain TAILLANDIER <RomainDotTaillandier_nospam@MaintagDotCom_remove>
wrote:
Show quoteHide quote
> My customer want me i crypt a 16 alphanumeric char in 8 byte of memory.
> the format is :
> AAA AAA AAA L NNN NNN
> where the A are      A-Z or 0-9
>                 L are      A-Z
>                 N are     0-9
>
> So i caculate the number of possibilities :
> 36^9 * 26 * 10^6  = 2,640,558,873,378,816,000,000
>                                < 2^64
>
> So i think it is possible.
> but i can't lists all the possibilities. So i create some 'ASCII' specific
> code :
> the AlphaNum (36 possibilities) coded on 6 bits
>        Letters    (26 possibilities) on 5 bits
>         Numbers (10 possibilities) on 4 bits
> and finally i get 10 bytes ....
>
> what ever i can do i can't get down under 9 bytes.
> the only possibility is to list all the 2,640,. ..... ... possibility but it
> is not realist.
>
> any ideas ?

Well, the most obvious suggestion is to convert each character into its
appropriate base, ie 0-35 for the first nine characters, then 0-25 for
the next, then 0-9 for the last 6. Then just take the first result +
36*second + (36*36)*third etc (or the other way round depending on what
endianness you want).

Let me know if you need code to do this.

--
Jon Skeet - <sk***@pobox.com>
http://www.pobox.com/~skeet
If replying to the group, please do not mail me too
Are all your drivers up to date? click for free checkup

Author
22 Dec 2004 8:34 AM
Romain TAILLANDIER
> Well, the most obvious suggestion is to convert each character into its
> appropriate base, ie 0-35 for the first nine characters, then 0-25 for
> the next, then 0-9 for the last 6. Then just take the first result +
> 36*second + (36*36)*third etc (or the other way round depending on what
> endianness you want).

that exactly the night say to me :)
It was so simple that i can't see it on the moment :)

thank you john skeet :)

ROM
Author
22 Dec 2004 9:05 AM
Nick Malik [Microsoft]
Hello Romain.

The problem you have is, basically, rounding error.

> So i caculate the number of possibilities :
> 36^9 * 26 * 10^6  = 2,640,558,873,378,816,000,000
>                                < 2^64
>
> So i think it is possible.

Technically, it is possible.  But look at this calculation a litle closer:
> the AlphaNum (36 possibilities) coded on 6 bits
>        Letters    (26 possibilities) on 5 bits
>         Numbers (10 possibilities) on 4 bits
> and finally i get 10 bytes ....

The problem is this: while it takes 6 bits to store 36 possibilities, 6 bits
actually hold 64 possible numbers.  So, you are "wasting" 28 possibilities
per position.
Similarly, 5 bits holds 32 possibilities, of which you are use 26, and 4
bits hold 16 possibilities, of which you are using 10.

You may be able to do this as a "moving base" number.  Remember that a
number, in any base, is equivalent to multiplying the position of the digit
times the base value to the power of the position (from the left).  So the
decimal (base 10) number 520 is equal to (((5 * 10) + 2) * 10) + 0

In this way, your 16 byte value can be coded as an eight byte number by
multiplying each "digit" by the base.  The complication is that you don't
have a single base to work with.

So, you need a moving base.  It goes something like this:

public decimal newvalue(oldvalue, insertedvalue, newbase)
{
   return (oldvalue * newbase) + insertedvalue;
}

int[] baselist = {36, 36, 36, 36, 36, 36, 36, 36, 36, 26, 10, 10, 10, 10,
10, 10};

// loop through the string, one character at a time
   decimal register = 0;
   for (int ct = 0; ct < 16; ct++)
   {
      char mychar = sourcestring[ct];
      int numvar = GetNumericEquivalent(mychar, baselist[ct]);   // write a
little function that returns the "numeric value" of each character
     register = newvalue(register, numver, baselist[ct]);
   }
   return register;

The resulting value will be stored in the binary 'register' value.

Note: the GetNumericEquivalent routine above is simple:
int GetNumericEquivalent(char mychar, int mybase)
{
   string base10 = "0123456789";
   string base26= "ABCDEF ... XYZ";
   string base36 = base10 + base26;

   if (mybase = 10)
      return base10.IndexOf(mychar);
  else if (mybase = 26)
      return base26.IndexOf(mychar);
  else
     return base36.IndexOf(mychar);
}

By the way, this code is NOT object oriented.  It is entirely procedural.  I
did this because I felt that it was more important to illustrate the
algorithm than it is to encapsulate the classes in this case.  If you'd
like, we could create a flyweight pattern to encapsulate the data values
with their respective bases.  However, the algorithm would have been
obscured.  Apologies to any purists out there.

--
--- Nick Malik [Microsoft]
    MCSD, CFPS, Certified Scrummaster
    http://blogs.msdn.com/nickmalik

Disclaimer: Opinions expressed in this forum are my own, and not
representative of my employer.
   I do not answer questions on behalf of my employer.  I'm just a
programmer helping programmers.
--
Show quoteHide quote
"Romain TAILLANDIER" <RomainDotTaillandier_nospam@MaintagDotCom_remove>
wrote in message news:%23FSfIv25EHA.2704@TK2MSFTNGP10.phx.gbl...
> Hi group
>
> My customer want me i crypt a 16 alphanumeric char in 8 byte of memory.
> the format is :
> AAA AAA AAA L NNN NNN
> where the A are      A-Z or 0-9
>                 L are      A-Z
>                 N are     0-9
>
> So i caculate the number of possibilities :
> 36^9 * 26 * 10^6  = 2,640,558,873,378,816,000,000
>                                < 2^64
>
> So i think it is possible.
> but i can't lists all the possibilities. So i create some 'ASCII' specific
> code :
> the AlphaNum (36 possibilities) coded on 6 bits
>        Letters    (26 possibilities) on 5 bits
>         Numbers (10 possibilities) on 4 bits
> and finally i get 10 bytes ....
>
> what ever i can do i can't get down under 9 bytes.
> the only possibility is to list all the 2,640,. ..... ... possibility but
it
> is not realist.
>
> any ideas ?
> calculate an hashcode on the 16 alphanum code woud be save me ?
>
> ROM
>
>
>

Bookmark and Share