org.apache.poi.util
Class StringUtil

java.lang.Object
  extended by org.apache.poi.util.StringUtil

public class StringUtil
extends java.lang.Object

Title: String Utility
Description: Collection of string handling utilities

Since:
May 10, 2002
Version:
1.0
Author:
Andrew C. Oliver, Sergei Kozello (sergeikozello at mail.ru), Toshiaki Kamoshida (kamoshida.toshiaki at future dot co dot jp)

Method Summary
static java.lang.String format(java.lang.String message, java.lang.Object[] params)
          Applies printf()-like formatting to a string.
static java.lang.String getFromCompressedUnicode(byte[] string, int offset, int len)
          Reads 8-bit data (in the ISO-8859-1 codepage) into a (Unicode) Java String and returns it.
static java.lang.String getFromUnicodeBE(byte[] string)
          Given a byte array of 16-bit Unicode characters in big-endian format (most significant byte first), returns a Java String representation of it.
static java.lang.String getFromUnicodeBE(byte[] string, int offset, int len)
          Given a byte array of 16-bit Unicode characters in big-endian format (most significant byte first), returns a Java String representation of it.
static java.lang.String getFromUnicodeLE(byte[] string)
          Given a byte array of 16-bit Unicode characters in little-endian format (most significant byte last), returns a Java String representation of it.
static java.lang.String getFromUnicodeLE(byte[] string, int offset, int len)
          Given a byte array of 16-bit Unicode characters in little-endian format (most significant byte last), returns a Java String representation of it.
static java.lang.String getPreferredEncoding()
          Returns the encoding we want to use, currently hardcoded to ISO-8859-1.
static boolean hasMultibyte(java.lang.String value)
          Checks whether the parameter contains a multibyte character.
static boolean isUnicodeString(java.lang.String value)
          Checks to see if a given String needs to be represented as Unicode
static void putCompressedUnicode(java.lang.String input, byte[] output, int offset)
          Takes a Unicode (Java) string and writes it as 8-bit data (in the ISO-8859-1 codepage) into the supplied byte array.
static void putUnicodeBE(java.lang.String input, byte[] output, int offset)
          Takes a Unicode string and writes it as big-endian (most significant byte first) bytes into the supplied byte array.
static void putUnicodeLE(java.lang.String input, byte[] output, int offset)
          Takes a Unicode string and writes it as little-endian (most significant byte last) bytes into the supplied byte array.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Method Detail

getFromUnicodeLE

public static java.lang.String getFromUnicodeLE(byte[] string,
                                                int offset,
                                                int len)
                                         throws java.lang.ArrayIndexOutOfBoundsException,
                                                java.lang.IllegalArgumentException
Given a byte array of 16-bit Unicode characters in little-endian format (most significant byte last), returns a Java String representation of it. For example, the bytes { 0x16, 0x00 } decode to the character 0x16.

Parameters:
string - the byte array to be converted
offset - the initial offset into the byte array. It is assumed that string[ offset ] and string[ offset + 1 ] contain the first 16-bit Unicode character
len - the length of the final string, in characters
Returns:
the converted string
Throws:
java.lang.ArrayIndexOutOfBoundsException - if offset is out of bounds for the byte array (i.e., is negative or is greater than or equal to string.length)
java.lang.IllegalArgumentException - if len is too large (i.e., there is not enough data in string to create a String of that length)
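
The contract above can be sketched with the standard library alone. This is a minimal illustration, not POI's implementation: the class UnicodeLeDecodeSketch and the method decodeLE are hypothetical names, and the sketch assumes that len counts 16-bit characters, so len * 2 bytes are consumed.

```java
import java.nio.charset.StandardCharsets;

final class UnicodeLeDecodeSketch {
    // Hypothetical helper (not part of POI): decodes len 16-bit
    // little-endian code units starting at offset.
    static String decodeLE(byte[] data, int offset, int len) {
        if (offset < 0 || offset >= data.length) {
            throw new ArrayIndexOutOfBoundsException("Illegal offset " + offset);
        }
        // Each character needs two bytes; reject requests that run past the array.
        if (len < 0 || (data.length - offset) / 2 < len) {
            throw new IllegalArgumentException("Illegal length " + len);
        }
        return new String(data, offset, len * 2, StandardCharsets.UTF_16LE);
    }

    public static void main(String[] args) {
        byte[] data = { 0x16, 0x00, 0x41, 0x00, 0x42, 0x00 };
        System.out.println(decodeLE(data, 2, 2)); // prints "AB"
    }
}
```

Note that a charset decoder replaces unpaired surrogates with U+FFFD, which a plain byte-pair copy would not do; the sketch accepts that difference for brevity.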

getFromUnicodeLE

public static java.lang.String getFromUnicodeLE(byte[] string)
Given a byte array of 16-bit Unicode characters in little-endian format (most significant byte last), returns a Java String representation of it. For example, the bytes { 0x16, 0x00 } decode to the character 0x16.

Parameters:
string - the byte array to be converted
Returns:
the converted string

getFromUnicodeBE

public static java.lang.String getFromUnicodeBE(byte[] string,
                                                int offset,
                                                int len)
                                         throws java.lang.ArrayIndexOutOfBoundsException,
                                                java.lang.IllegalArgumentException
Given a byte array of 16-bit Unicode characters in big-endian format (most significant byte first), returns a Java String representation of it. For example, the bytes { 0x00, 0x16 } decode to the character 0x16.

Parameters:
string - the byte array to be converted
offset - the initial offset into the byte array. It is assumed that string[ offset ] and string[ offset + 1 ] contain the first 16-bit Unicode character
len - the length of the final string, in characters
Returns:
the converted string
Throws:
java.lang.ArrayIndexOutOfBoundsException - if offset is out of bounds for the byte array (i.e., is negative or is greater than or equal to string.length)
java.lang.IllegalArgumentException - if len is too large (i.e., there is not enough data in string to create a String of that length)
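
To make the byte order concrete, the following sketch combines each byte pair by hand, high byte first. It is an illustration under stated assumptions rather than POI's code: UnicodeBeDecodeSketch and decodeBE are hypothetical names, and len is assumed to count characters.

```java
final class UnicodeBeDecodeSketch {
    // Hypothetical helper (not part of POI): builds each char from two
    // bytes, treating the first byte of every pair as the high byte.
    static String decodeBE(byte[] data, int offset, int len) {
        if (offset < 0 || offset >= data.length) {
            throw new ArrayIndexOutOfBoundsException("Illegal offset " + offset);
        }
        if (len < 0 || (data.length - offset) / 2 < len) {
            throw new IllegalArgumentException("Illegal length " + len);
        }
        char[] chars = new char[len];
        for (int i = 0; i < len; i++) {
            int high = data[offset + 2 * i] & 0xFF;     // mask to avoid sign extension
            int low = data[offset + 2 * i + 1] & 0xFF;
            chars[i] = (char) ((high << 8) | low);
        }
        return new String(chars);
    }

    public static void main(String[] args) {
        byte[] data = { 0x00, 0x16, 0x30, 0x42 };
        System.out.println((int) decodeBE(data, 0, 1).charAt(0)); // prints 22 (0x16)
    }
}
```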

getFromUnicodeBE

public static java.lang.String getFromUnicodeBE(byte[] string)
Given a byte array of 16-bit Unicode characters in big-endian format (most significant byte first), returns a Java String representation of it. For example, the bytes { 0x00, 0x16 } decode to the character 0x16.

Parameters:
string - the byte array to be converted
Returns:
the converted string

getFromCompressedUnicode

public static java.lang.String getFromCompressedUnicode(byte[] string,
                                                        int offset,
                                                        int len)
Reads 8-bit data (in the ISO-8859-1 codepage) into a (Unicode) Java String and returns it. (In Excel terms, reads compressed 8-bit Unicode as a string.)

Parameters:
string - the byte array to read
offset - the offset into the byte array at which to start reading
len - the number of bytes to read from the byte array
Returns:
the String instance generated by reading the byte array
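
Because ISO-8859-1 maps each byte value directly to the Unicode code point with the same number, the read described above can be approximated with the standard String constructor. This is a sketch, not POI's implementation; CompressedUnicodeReadSketch and readCompressed are hypothetical names.

```java
import java.nio.charset.StandardCharsets;

final class CompressedUnicodeReadSketch {
    // Hypothetical helper (not part of POI): one input byte becomes one char.
    static String readCompressed(byte[] data, int offset, int len) {
        return new String(data, offset, len, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        byte[] data = { 0x63, 0x61, 0x66, (byte) 0xE9 };
        // 0xE9 is U+00E9 (e with acute accent) in ISO-8859-1.
        System.out.println(readCompressed(data, 0, 4).equals("caf\u00e9")); // prints true
    }
}
```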

putCompressedUnicode

public static void putCompressedUnicode(java.lang.String input,
                                        byte[] output,
                                        int offset)
Takes a Unicode (Java) string and writes it as 8-bit data (in the ISO-8859-1 codepage) into the supplied byte array. (In Excel terms, writes compressed 8-bit Unicode.)

Parameters:
input - the String containing the data to be written
output - the byte array to which the data is to be written
offset - the offset into the byte array at which the written data starts
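
A minimal sketch of the write direction, using only the standard library. The names CompressedUnicodeWriteSketch and writeCompressed are hypothetical, and the handling of characters outside ISO-8859-1 shown here (replacement with '?') is the behavior of String.getBytes, not a documented property of putCompressedUnicode.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class CompressedUnicodeWriteSketch {
    // Hypothetical helper (not part of POI): writes one byte per char
    // into output, starting at offset. The caller must size the array.
    static void writeCompressed(String input, byte[] output, int offset) {
        // getBytes substitutes '?' for chars that ISO-8859-1 cannot represent.
        byte[] bytes = input.getBytes(StandardCharsets.ISO_8859_1);
        System.arraycopy(bytes, 0, output, offset, bytes.length);
    }

    public static void main(String[] args) {
        byte[] out = new byte[6];
        writeCompressed("abc", out, 2);
        System.out.println(Arrays.toString(out)); // prints [0, 0, 97, 98, 99, 0]
    }
}
```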

putUnicodeLE

public static void putUnicodeLE(java.lang.String input,
                                byte[] output,
                                int offset)
Takes a Unicode string and writes it as little-endian (most significant byte last) bytes into the supplied byte array. (In Excel terms, writes uncompressed Unicode.)

Parameters:
input - the String containing the unicode data to be written
output - the byte array to hold the uncompressed Unicode; it should have room for twice the length of the String
offset - the offset to start writing into the byte array
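
The byte layout can be illustrated with a short loop that emits the low byte of each character before the high byte. This is a sketch with hypothetical names (UnicodeLeWriteSketch, writeLE), not POI's implementation.

```java
import java.util.Arrays;

final class UnicodeLeWriteSketch {
    // Hypothetical helper (not part of POI): writes two bytes per char,
    // low byte first, so output needs 2 * input.length() bytes from offset.
    static void writeLE(String input, byte[] output, int offset) {
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            output[offset + 2 * i] = (byte) (c & 0xFF);   // least significant byte
            output[offset + 2 * i + 1] = (byte) (c >>> 8); // most significant byte
        }
    }

    public static void main(String[] args) {
        byte[] out = new byte[4];
        writeLE("A\u3042", out, 0);
        System.out.println(Arrays.toString(out)); // prints [65, 0, 66, 48]
    }
}
```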

putUnicodeBE

public static void putUnicodeBE(java.lang.String input,
                                byte[] output,
                                int offset)
Takes a Unicode string and writes it as big-endian (most significant byte first) bytes into the supplied byte array. (In Excel terms, writes uncompressed Unicode.)

Parameters:
input - the String containing the unicode data to be written
output - the byte array to hold the uncompressed Unicode; it should have room for twice the length of the String
offset - the offset to start writing into the byte array
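
One way to sketch the big-endian write is to wrap the target region in a java.nio.ByteBuffer and let putChar handle the byte order. The names UnicodeBeWriteSketch and writeBE are hypothetical; POI's own code may be structured differently.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Arrays;

final class UnicodeBeWriteSketch {
    // Hypothetical helper (not part of POI): writes two bytes per char,
    // high byte first, into output starting at offset.
    static void writeBE(String input, byte[] output, int offset) {
        // wrap() throws IndexOutOfBoundsException if the region does not fit.
        ByteBuffer buffer = ByteBuffer.wrap(output, offset, input.length() * 2)
                                      .order(ByteOrder.BIG_ENDIAN);
        for (int i = 0; i < input.length(); i++) {
            buffer.putChar(input.charAt(i));
        }
    }

    public static void main(String[] args) {
        byte[] out = new byte[5];
        writeBE("A\u3042", out, 1);
        System.out.println(Arrays.toString(out)); // prints [0, 0, 65, 48, 66]
    }
}
```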

format

public static java.lang.String format(java.lang.String message,
                                      java.lang.Object[] params)
Applies printf()-like formatting to a string. Primarily used for logging.

Parameters:
message - the string with embedded formatting info, e.g. "This is a test %2.2"
params - array of values to format into the string
Returns:
The formatted string

getPreferredEncoding

public static java.lang.String getPreferredEncoding()
Returns:
the encoding we want to use, currently hardcoded to ISO-8859-1

hasMultibyte

public static boolean hasMultibyte(java.lang.String value)
Checks whether the parameter contains a multibyte character.

Parameters:
value - string to check
Returns:
true if the string has at least one multibyte character
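
The page does not define "multibyte". A plausible reading, given that the preferred encoding is ISO-8859-1, is "any character that does not fit in a single byte", that is, any char above 0xFF. The sketch below implements that assumed reading; MultibyteCheckSketch and hasCharAboveLatin1 are hypothetical names, and the null handling is also an assumption.

```java
final class MultibyteCheckSketch {
    // Hypothetical helper (not part of POI). Assumption: "multibyte"
    // means a char value greater than 0xFF.
    static boolean hasCharAboveLatin1(String value) {
        if (value == null) {
            return false; // assumed behavior for null input
        }
        for (int i = 0; i < value.length(); i++) {
            if (value.charAt(i) > 0xFF) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(hasCharAboveLatin1("caf\u00e9")); // prints false
        System.out.println(hasCharAboveLatin1("\u3042"));    // prints true
    }
}
```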

isUnicodeString

public static boolean isUnicodeString(java.lang.String value)
Checks to see if a given String needs to be represented as Unicode

Parameters:
value - the string to check
Returns:
true if string needs Unicode to be represented.
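
One way to express "needs Unicode" is to ask whether the string can be encoded in ISO-8859-1 without loss. The sketch below uses the standard CharsetEncoder for that test; it assumes ISO-8859-1 is the 8-bit alternative (as getPreferredEncoding suggests), and NeedsUnicodeSketch and needsUnicode are hypothetical names rather than POI's implementation.

```java
import java.nio.charset.StandardCharsets;

final class NeedsUnicodeSketch {
    // Hypothetical helper (not part of POI): true when at least one
    // character cannot be represented in ISO-8859-1.
    static boolean needsUnicode(String value) {
        return !StandardCharsets.ISO_8859_1.newEncoder().canEncode(value);
    }

    public static void main(String[] args) {
        System.out.println(needsUnicode("caf\u00e9")); // prints false
        System.out.println(needsUnicode("\u3042"));    // prints true
    }
}
```

A caller could use such a check to choose between the compressed (8-bit) and uncompressed (16-bit) write methods described above.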


Copyright 2008 The Apache Software Foundation or its licensors, as applicable.