I have a file format (FASTQ) that encodes a sequence of integers as a string, where each integer is represented by an ASCII character code with an offset. Unfortunately, there are two encodings in common use, one with an offset of 33 and the other with an offset of 64. I typically have 100 million strings of length 80-150 to convert from one offset to the other. The simplest code I could come up with for this kind of thing is:
    def phred64ToStdqual(qualin):
        # Shift each character down by 31 (64 - 33) to go from offset 64 to offset 33.
        return ''.join([chr(ord(x) - 31) for x in qualin])
This works just fine, but it is not particularly fast: for 1 million strings it takes about 4 seconds on my machine. If I switch to using a couple of dicts to do the translation, I can get it down to about 2 seconds.
    # Lookup tables: integer -> character and character -> integer.
    ctoi = {}
    itoc = {}
    for i in xrange(127):
        itoc[i] = chr(i)
        ctoi[chr(i)] = i

    def phred64ToStdqual2(qualin):
        return ''.join([itoc[ctoi[x] - 31] for x in qualin])
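For anyone who wants to reproduce this kind of comparison, here is a minimal timing sketch. It assumes Python 3 (so `range` rather than `xrange`), and the function names, the `make_reads` helper, and the randomly generated test reads are all made up for illustration. Absolute numbers will differ from the ones quoted above depending on machine and interpreter.

```python
import random
import timeit

def phred64_to_std_listcomp(qualin):
    # Same idea as the first version: shift each character down by 31.
    return ''.join([chr(ord(x) - 31) for x in qualin])

# Lookup tables, as in the dict-based version.
ITOC = {i: chr(i) for i in range(127)}
CTOI = {chr(i): i for i in range(127)}

def phred64_to_std_dicts(qualin):
    return ''.join([ITOC[CTOI[x] - 31] for x in qualin])

def make_reads(n, length=100, seed=0):
    # Hypothetical test data: random offset-64 quality strings (scores 0-40).
    rng = random.Random(seed)
    return [''.join(chr(rng.randint(64, 104)) for _ in range(length))
            for _ in range(n)]

def time_converter(func, reads):
    # Wall-clock seconds for one pass over all reads.
    return timeit.timeit(lambda: [func(r) for r in reads], number=1)

if __name__ == "__main__":
    reads = make_reads(10000)
    for func in (phred64_to_std_listcomp, phred64_to_std_dicts):
        print(func.__name__, time_converter(func, reads))
```

Both converters should produce identical output on the same input, which is worth checking before comparing their speed.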
If I blindly run the same Python code under Cython, I get it down to about 1 second.
It seems that at the C level this is simply a cast to int, a subtraction, and then a cast back to char. I have not written this up, but I'm guessing it would be quite a bit faster.
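To make the "cast, subtract, cast back" idea concrete, here is a small sketch of the same operation on raw bytes. It assumes Python 3, where iterating over a `bytes` object yields integers directly, and the function name is made up for illustration. It is only meant to show that the conversion is plain integer arithmetic per character; it makes no claim about speed.

```python
def phred64_to_phred33_bytes(qual):
    """Shift every byte of an offset-64 quality string down by 31."""
    # Each element b is already an int in Python 3, so no ord()/chr() is needed.
    # bytes() raises ValueError if any result falls outside 0..255,
    # i.e. if the input contains a byte below 31.
    return bytes(b - 31 for b in qual)
```

For example, `phred64_to_phred33_bytes(b'h@')` gives `b'I!'`, since 104 - 31 = 73 (`'I'`) and 64 - 31 = 33 (`'!'`).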
Cyan
If you look at the code for urllib.quote, there is something similar to what you are doing. It looks something like this:

    _map = {}

    def phred64ToStdqual3(qualin):
        # Build the character -> character table once, on first use.
        if not _map:
            for i in range(31, 127):
                _map[chr(i)] = chr(i - 31)
        return ''.join(map(_map.__getitem__, qualin))
Note that the function above also works when the mappings are not all the same length (in urllib.quote, for example, '%' has to become '%25'). But in your case every translation has the same length, and Python has a function that does exactly this very quickly: string.maketrans together with translate. You probably won't get much faster than:
    import string

    _trans = None

    def phred64ToStdqual4(qualin):
        global _trans
        # Build the 256-entry translation table once, on first use.
        if not _trans:
            _trans = string.maketrans(''.join(chr(i) for i in range(31, 127)),
                                      ''.join(chr(i) for i in range(127 - 31)))
        return qualin.translate(_trans)
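The code above is Python 2 (`string.maketrans` no longer exists in Python 3). A rough Python 3 equivalent, as a sketch: the function and variable names here are made up, and the tables are built at import time rather than lazily, which is a simplification and not part of the answer above.

```python
# Translation table for bytes input: byte i -> byte i - 31 for i in 31..126.
_BYTES_TRANS = bytes.maketrans(bytes(range(31, 127)), bytes(range(127 - 31)))

# Equivalent table for str input, built from a character -> character dict.
_STR_TRANS = str.maketrans({chr(i): chr(i - 31) for i in range(31, 127)})

def phred64_to_stdqual_bytes(qualin):
    # bytes.translate does the whole lookup in C, one table hit per byte.
    return qualin.translate(_BYTES_TRANS)

def phred64_to_stdqual_str(qualin):
    return qualin.translate(_STR_TRANS)
```

Usage would look like `phred64_to_stdqual_bytes(b'hhh@')`, which returns `b'III!'`. If the file is opened in binary mode, the bytes version avoids decoding each line to `str` first.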