Searching byte arrays and RTF readers

Does anyone know of a .NET technique that can quickly find a particular byte value within a byte array and return its position, much as InStr does for strings in VB.NET?

I'm writing my own custom RTF reader by having my class inherit from the .NET System.IO.BinaryReader. My goal is to have a stream reader available in .NET that returns the text portion of an RTF file stream, less any formatting. Being a binary reader, my custom reader loads n bytes into a byte array. It then loops through this array and returns only the qualifying bytes to the calling routine.

Some RTF files contain embedded graphics and objects, and these can be made up of hundreds of kilobytes of data. When my custom reader encounters one of these blocks, it spends a lot of time looping through the object's bytes just to bypass them. This can cause a significant delay before my reader returns from a call.

To speed things up, I thought I could calculate the offset to the end of a block. Each object and graphic in an RTF file is described by a \pict or \object group. These also have tags that describe the object's dimensions. I thought I could take the dimensions, multiply them together, divide that by 255 (because the object data is byte-64 encoded) and multiply by two (because each encoded byte is represented by two hex digits: A5, EF, FF etc.) to give me the length of the RTF file's encoded block. This would have worked, except that the width and height dimensions in the RTF file are in twips and not pixels.

So, I'm back to looping through the byte array to find the delimiter that marks the end of a block. Is there a facility in .NET that can perform this search at machine-language speed and return the location of the found byte? I heard that .NET's Regex class might be able to do this, but doesn't that also only work with strings?

Andy

I found a method that is a lot faster than manually looping through an array and checking each element for a value (it searched 500 KB in less than a second). The Array class in .NET has a method called IndexOf that is basically a version of InStr for arrays. It searches an array for a value that is stored in the same type as the array's elements. If the value you are looking for can't be found, IndexOf returns -1. Otherwise, it returns the index of the element that holds the value. For example:

    Dim valueToFind As Byte = 5
    Dim YourArray() As Byte = {1, 2, 3, 4, 5, 6, 7, 8, 9}
    Dim startingLocation As Integer = 1
    ' Searches from index 1 onward; the value 5 sits at index 4.
    Dim location As Integer = Array.IndexOf(YourArray, valueToFind, startingLocation)

By the way, Array.IndexOf searches for an object in the array that matches the object you've told it to find. Because objects are used in the comparison, no type conversions take place. Instead, Array.IndexOf calls the Equals method on the objects to determine whether a match has occurred. This means that the object to find has to have the same type as the elements of the array. For example, if you have a byte array in which some elements contain the value 32, searching with a numeric literal such as 32 (which is an Integer) or with a System.Int16 variable that holds the value 32 won't work. Even though the values match, the types do NOT match, so the Equals test fails.
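
To tie the two points above back to the original question, here is a minimal sketch in VB.NET. FindDelimiter, IndexOfBoxed, and the toy buffer are hypothetical names invented for illustration; they are not part of the reader described in the question. The sketch assumes the block to skip contains no nested groups before its closing brace, which may not hold for real \pict or \object groups. The second helper deliberately passes the search value as an Object, so that the object-based overload of Array.IndexOf is the one being exercised.

```vbnet
Imports System
Imports System.Text

Module ByteSearchSketch

    ' Hypothetical helper: index of the first occurrence of delimiter at or
    ' after startIndex, or -1 if the delimiter is not in the buffer.
    Function FindDelimiter(buffer As Byte(), delimiter As Byte, startIndex As Integer) As Integer
        Return Array.IndexOf(buffer, delimiter, startIndex)
    End Function

    ' Hypothetical helper: forces the object-based overload, where the match
    ' is decided by Equals and the boxed type of value therefore matters.
    Function IndexOfBoxed(buffer As Byte(), value As Object) As Integer
        Return Array.IndexOf(CType(buffer, Array), value)
    End Function

    Sub Main()
        ' Toy stand-in for RTF data; a real picture block would be far longer.
        Dim buffer As Byte() = Encoding.ASCII.GetBytes("{\pict A5EFFF}text")
        Dim closeBrace As Byte = 125 ' ASCII code for "}"

        ' Jump straight to the end of the block instead of looping byte by byte.
        Dim blockEnd As Integer = FindDelimiter(buffer, closeBrace, 0)
        Console.WriteLine(blockEnd) ' 13

        ' Same numeric value, different boxed types.
        Dim asInteger As Object = 125        ' boxed Integer
        Dim asByte As Object = closeBrace    ' boxed Byte
        Console.WriteLine(IndexOfBoxed(buffer, asInteger)) ' -1: Integer never equals a Byte element
        Console.WriteLine(IndexOfBoxed(buffer, asByte))    ' 13
    End Sub

End Module
```

In a reader like the one described, the returned index could be used to advance the read position past the block in one step. If the delimiter is not in the current buffer (result -1), the block continues into the next n bytes loaded, so the search would need to be repeated there.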