Abdel-Rahman M. Jaradat and Mansour I. Irshid and Talha T. Nassar
A File Splitting Technique for Reducing the Entropy of Text Files
2138 - 2142
2007
1
7
International Journal of Computer and Information Engineering
https://publications.waset.org/pdf/9948
https://publications.waset.org/vol/7
World Academy of Science, Engineering and Technology
A novel file-splitting technique for reducing the nth-order entropy of text files is proposed. The technique maps the original text file into a non-ASCII binary file using a new codeword assignment method; the resulting binary file is then split into several subfiles, each containing one or more bits from each codeword of the mapped binary file. The statistical properties of the subfiles are studied, and they are found to reflect the statistical properties of the original text file, which is not the case when the ASCII code is used as the mapper. The nth-order entropies of these subfiles are determined, and their sum is found to be less than the entropy of the original text file for the same extension orders. These statistical properties can be used to achieve better compression ratios when conventional compression techniques are applied to the subfiles individually, on a bit-wise rather than a character-wise basis.
Open Science Index 7, 2007
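The splitting step described in the abstract can be illustrated with a minimal Python sketch. The abstract does not specify the codeword assignment method, so this sketch assumes a simple frequency-rank assignment (the most frequent character gets codeword 0) and one bit per subfile; the function names `assign_codewords`, `split_bit_planes`, and `entropy` are hypothetical and are not taken from the paper. It only shows how the subfiles and their empirical entropies could be computed, and makes no claim to reproduce the reported results.

```python
import math
from collections import Counter


def assign_codewords(text):
    """Map each distinct character to a fixed-width integer codeword.

    ASSUMPTION: codewords follow frequency rank (most frequent -> 0).
    The paper's actual assignment method is not given in the abstract.
    """
    ranked = [ch for ch, _ in Counter(text).most_common()]
    # Smallest width that can index every distinct character (at least 1 bit).
    width = max(1, (len(ranked) - 1).bit_length())
    return {ch: rank for rank, ch in enumerate(ranked)}, width


def split_bit_planes(text, codebook, width):
    """Return `width` subfiles; subfile k holds bit k (MSB first) of each codeword."""
    planes = [[] for _ in range(width)]
    for ch in text:
        code = codebook[ch]
        for k in range(width):
            planes[k].append((code >> (width - 1 - k)) & 1)
    return planes


def entropy(symbols, order=1):
    """Empirical entropy in bits per non-overlapping block of `order` symbols."""
    blocks = [tuple(symbols[i:i + order])
              for i in range(0, len(symbols) - order + 1, order)]
    total = len(blocks)
    counts = Counter(blocks)
    return -sum(c / total * math.log2(c / total) for c in counts.values())


text = "the quick brown fox jumps over the lazy dog"
codebook, width = assign_codewords(text)
planes = split_bit_planes(text, codebook, width)

# Compare the character-wise entropy of the text with the bit-wise
# entropies of the subfiles, using the same extension order for both.
for order in (1, 2):
    h_text = entropy(list(text), order)
    h_planes = [entropy(p, order) for p in planes]
    print(f"order {order}: text = {h_text:.3f} bits, "
          f"subfiles sum = {sum(h_planes):.3f} bits")
```

Each subfile has exactly one bit per input character, so the split is lossless: reading the k-th bit from every subfile in order rebuilds the codeword stream, and the codebook maps it back to text. Trying other assignments in `assign_codewords` is the natural way to explore how the mapping affects the subfile statistics.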