Friday, December 28, 2007

Available character sets in JAVA

A charset name must begin with either a letter or a digit. The empty string is not a legal charset name. Charset names are not case-sensitive; that is, case is always ignored when comparing charset names. Charset names generally follow the conventions documented in RFC 2278: IANA Charset Registration Procedures.

Charsets are named by strings composed of letters, digits, and a few punctuation characters such as the dash, period, colon and underscore. Every implementation of the Java platform is required to support the following standard charsets:

Charset      Description

US-ASCII     Seven-bit ASCII, a.k.a. ISO646-US, a.k.a. the Basic Latin block of the Unicode character set
ISO-8859-1   ISO Latin Alphabet No. 1, a.k.a. ISO-LATIN-1
UTF-8        Eight-bit UCS Transformation Format
UTF-16BE     Sixteen-bit UCS Transformation Format, big-endian byte order
UTF-16LE     Sixteen-bit UCS Transformation Format, little-endian byte order
UTF-16       Sixteen-bit UCS Transformation Format, byte order identified by an optional byte-order mark (BOM)
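The case-insensitive naming rule can be checked with a few lines of code. This is only a minimal sketch: the class name CharsetNames and the helper sameCharset are made up for illustration, and "latin1" is used on the assumption that the JVM registers it as an alias of ISO-8859-1, which the usual JDKs do.

```java
import java.nio.charset.Charset;

public class CharsetNames {

    // True when both names resolve to the same charset.
    // Charset.forName ignores case and also accepts registered aliases.
    public static boolean sameCharset(String first, String second) {
        return Charset.forName(first).equals(Charset.forName(second));
    }

    public static void main(String[] args) {
        System.out.println(sameCharset("utf-8", "UTF-8"));
        // Assumption: "latin1" is a registered alias of ISO-8859-1 on this JVM
        System.out.println(Charset.forName("latin1").name());
        // isSupported avoids an exception when a name may be unknown
        System.out.println(Charset.isSupported("US-ASCII"));
    }
}
```

Charset.forName throws UnsupportedCharsetException for an unknown name, so Charset.isSupported is the safer call when the name comes from user input.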

Available character sets

java.nio.charset.Charset.availableCharsets() lists all character sets supported by this JVM, as shown below.

{Big5=Big5, Big5-HKSCS=Big5-HKSCS, EUC-JP=EUC-JP, EUC-KR=EUC-KR, GB18030=GB18030, GB2312=GB2312, GBK=GBK, IBM-Thai=IBM-Thai, IBM00858=IBM00858, IBM01140=IBM01140, IBM01141=IBM01141, IBM01142=IBM01142, IBM01143=IBM01143, IBM01144=IBM01144, IBM01145=IBM01145, IBM01146=IBM01146, IBM01147=IBM01147, IBM01148=IBM01148, IBM01149=IBM01149, IBM037=IBM037, IBM1026=IBM1026, IBM1047=IBM1047, IBM273=IBM273, IBM277=IBM277, IBM278=IBM278, IBM280=IBM280, IBM284=IBM284, IBM285=IBM285, IBM297=IBM297, IBM420=IBM420, IBM424=IBM424, IBM437=IBM437, IBM500=IBM500, IBM775=IBM775, IBM850=IBM850, IBM852=IBM852, IBM855=IBM855, IBM857=IBM857, IBM860=IBM860, IBM861=IBM861, IBM862=IBM862, IBM863=IBM863, IBM864=IBM864, IBM865=IBM865, IBM866=IBM866, IBM868=IBM868, IBM869=IBM869, IBM870=IBM870, IBM871=IBM871, IBM918=IBM918, ISO-2022-CN=ISO-2022-CN, ISO-2022-JP=ISO-2022-JP, ISO-2022-KR=ISO-2022-KR, ISO-8859-1=ISO-8859-1, ISO-8859-13=ISO-8859-13, ISO-8859-15=ISO-8859-15, ISO-8859-2=ISO-8859-2, ISO-8859-3=ISO-8859-3, ISO-8859-4=ISO-8859-4, ISO-8859-5=ISO-8859-5, ISO-8859-6=ISO-8859-6, ISO-8859-7=ISO-8859-7, ISO-8859-8=ISO-8859-8, ISO-8859-9=ISO-8859-9, JIS_X0201=JIS_X0201, JIS_X0212-1990=JIS_X0212-1990, KOI8-R=KOI8-R, Shift_JIS=Shift_JIS, TIS-620=TIS-620, US-ASCII=US-ASCII, UTF-16=UTF-16, UTF-16BE=UTF-16BE, UTF-16LE=UTF-16LE, UTF-8=UTF-8, windows-1250=windows-1250, windows-1251=windows-1251, windows-1252=windows-1252, windows-1253=windows-1253, windows-1254=windows-1254, windows-1255=windows-1255, windows-1256=windows-1256, windows-1257=windows-1257, windows-1258=windows-1258, windows-31j=windows-31j, x-Big5-Solaris=x-Big5-Solaris, x-euc-jp-linux=x-euc-jp-linux, x-EUC-TW=x-EUC-TW, x-eucJP-Open=x-eucJP-Open, x-IBM1006=x-IBM1006, x-IBM1025=x-IBM1025, x-IBM1046=x-IBM1046, x-IBM1097=x-IBM1097, x-IBM1098=x-IBM1098, x-IBM1112=x-IBM1112, x-IBM1122=x-IBM1122, x-IBM1123=x-IBM1123, x-IBM1124=x-IBM1124, x-IBM1381=x-IBM1381, x-IBM1383=x-IBM1383, x-IBM33722=x-IBM33722, x-IBM737=x-IBM737, x-IBM834=x-IBM834, 
x-IBM856=x-IBM856, x-IBM874=x-IBM874, x-IBM875=x-IBM875, x-IBM921=x-IBM921, x-IBM922=x-IBM922, x-IBM930=x-IBM930, x-IBM933=x-IBM933, x-IBM935=x-IBM935, x-IBM937=x-IBM937, x-IBM939=x-IBM939, x-IBM942=x-IBM942, x-IBM942C=x-IBM942C, x-IBM943=x-IBM943, x-IBM943C=x-IBM943C, x-IBM948=x-IBM948, x-IBM949=x-IBM949, x-IBM949C=x-IBM949C, x-IBM950=x-IBM950, x-IBM964=x-IBM964, x-IBM970=x-IBM970, x-ISCII91=x-ISCII91, x-ISO-2022-CN-CNS=x-ISO-2022-CN-CNS, x-ISO-2022-CN-GB=x-ISO-2022-CN-GB, x-iso-8859-11=x-iso-8859-11, x-JIS0208=x-JIS0208, x-JISAutoDetect=x-JISAutoDetect, x-Johab=x-Johab, x-MacArabic=x-MacArabic, x-MacCentralEurope=x-MacCentralEurope, x-MacCroatian=x-MacCroatian, x-MacCyrillic=x-MacCyrillic, x-MacDingbat=x-MacDingbat, x-MacGreek=x-MacGreek, x-MacHebrew=x-MacHebrew, x-MacIceland=x-MacIceland, x-MacRoman=x-MacRoman, x-MacRomania=x-MacRomania, x-MacSymbol=x-MacSymbol, x-MacThai=x-MacThai, x-MacTurkish=x-MacTurkish, x-MacUkraine=x-MacUkraine, x-MS950-HKSCS=x-MS950-HKSCS, x-mswin-936=x-mswin-936, x-PCK=x-PCK, x-windows-50220=x-windows-50220, x-windows-50221=x-windows-50221, x-windows-874=x-windows-874, x-windows-949=x-windows-949, x-windows-950=x-windows-950, x-windows-iso2022jp=x-windows-iso2022jp}

To display the file encoding character set used in Java:

System.out.println(System.getProperty("file.encoding"));

To display the default character set in use:

System.out.println(Charset.defaultCharset().displayName());
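The calls from this post can be put together into one small program. It is a sketch only: the class name ListCharsets and the helper canonicalNames are invented here, and the output differs from one JVM to another.

```java
import java.nio.charset.Charset;
import java.util.Map;
import java.util.Set;

public class ListCharsets {

    // Canonical names of every charset this JVM supports, in sorted order.
    public static Set<String> canonicalNames() {
        return Charset.availableCharsets().keySet();
    }

    public static void main(String[] args) {
        // Print each canonical name together with its aliases
        for (Map.Entry<String, Charset> entry : Charset.availableCharsets().entrySet()) {
            System.out.println(entry.getKey() + " aliases=" + entry.getValue().aliases());
        }
        System.out.println("file.encoding   = " + System.getProperty("file.encoding"));
        System.out.println("default charset = " + Charset.defaultCharset().name());
    }
}
```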

Multi-byte file

Most text editors, such as Programmer's Notepad and TextEdit, can create UTF-16 encoded files. However, we cannot tell the encoding apart just by opening a multi-byte character file in these editors. Command-line tools such as Editor (Windows) and cat (Windows/Linux) are more useful for this.
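Another way to tell the encodings apart is to look at the first bytes of the file for a byte-order mark. The sketch below is an illustration, not a full detector: the class name BomSniffer and its methods are made up, a file without a BOM simply returns null, and the bytes FF FE could also be the start of a UTF-32LE BOM, which is ignored here.

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;

public class BomSniffer {

    // Returns the charset name implied by a leading byte-order mark, or null if none is found.
    public static String detect(byte[] head) {
        if (startsWith(head, 0xEF, 0xBB, 0xBF)) return "UTF-8";
        if (startsWith(head, 0xFE, 0xFF)) return "UTF-16BE";
        if (startsWith(head, 0xFF, 0xFE)) return "UTF-16LE";
        return null;
    }

    // Compares the first bytes of data, taken as unsigned values, with the given prefix.
    private static boolean startsWith(byte[] data, int... prefix) {
        if (data.length < prefix.length) return false;
        for (int i = 0; i < prefix.length; i++) {
            if ((data[i] & 0xFF) != prefix[i]) return false;
        }
        return true;
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 1) {
            System.out.println("Syntax : BomSniffer file");
            return;
        }
        byte[] head = new byte[3];
        try (InputStream in = new FileInputStream(args[0])) {
            int n = in.read(head); // -1 for an empty file
            System.out.println(detect(Arrays.copyOf(head, Math.max(n, 0))));
        }
    }
}
```

Java's own "UTF-16" encoder writes a big-endian BOM, so text encoded with it is reported as UTF-16BE.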

Thursday, December 20, 2007

Unzipping using java.util.zip Package

How to unzip using the java.util.zip package?
How to unzip using the Java API?

Create a ZipInputStream from any InputStream and read the ZipEntry objects one by one. Check that each ZipEntry refers to a file rather than a directory, then read bytes from the ZipInputStream until read() returns -1.


import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

public class Unzip {

    public static void storeZipStream(InputStream inputStream, String dir)
            throws IOException {
        File targetDir = new File(dir).getCanonicalFile();
        int countEntry = 0;

        // try-with-resources closes the ZIP stream (and the wrapped stream) in every case
        try (ZipInputStream zis = new ZipInputStream(inputStream)) {
            // No first entry means the stream is not a ZIP archive (or it is empty)
            ZipEntry entry = zis.getNextEntry();
            if (entry == null) {
                throw new IOException("Given file is not a ZIP archive");
            }
            do {
                // Directory entries are skipped; directories are created
                // together with the files inside them.
                if (!entry.isDirectory()) {
                    File target = new File(targetDir, entry.getName()).getCanonicalFile();
                    // Reject entries such as "../x" that would land outside the target directory
                    if (!target.toPath().startsWith(targetDir.toPath())) {
                        throw new IOException("Entry is outside the target directory: "
                                + entry.getName());
                    }
                    createFile(zis, target.getPath());
                    countEntry++;
                }
            } while ((entry = zis.getNextEntry()) != null);
        }
        System.out.println("No of files extracted : " + countEntry);
    }

    public static void createFile(InputStream is, String absoluteFileName)
            throws IOException {
        File f = new File(absoluteFileName);
        File parent = f.getParentFile();
        if (parent != null && !parent.exists() && !parent.mkdirs()) {
            throw new IOException("Cannot create directory " + parent);
        }
        // The output file is closed even if a read or write fails
        try (OutputStream out = new FileOutputStream(f)) {
            byte[] buf = new byte[1024];
            int len;
            // read() returns -1 at the end of the current entry
            while ((len = is.read(buf)) != -1) {
                out.write(buf, 0, len);
            }
        }
    }

    public static void main(String[] args) throws Exception {
        if (args.length < 1) {
            System.out.println("Syntax : Unzip zipfile [extractlocation]");
            return;
        }

        // Extract to the temp directory unless a location is given
        String dir = System.getProperty("java.io.tmpdir");
        if (args.length >= 2) {
            dir = args[1].replace('\\', '/');
        }
        System.out.println("Extracting to " + dir);
        storeZipStream(new FileInputStream(new File(args[0])), dir);
    }
}

To run:

java Unzip a.zip /tmp/a

The contents of a.zip will be extracted to the /tmp/a folder.
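The same entry-by-entry loop can be tried without touching the disk. The sketch below builds a tiny archive in memory and lists its file entries; the class ZipEntryLister and the sample entry names are invented for this illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

public class ZipEntryLister {

    // Reads the stream entry by entry and returns the names of the file entries.
    public static List<String> fileNames(InputStream in) throws IOException {
        List<String> names = new ArrayList<>();
        try (ZipInputStream zis = new ZipInputStream(in)) {
            ZipEntry entry;
            while ((entry = zis.getNextEntry()) != null) {
                if (!entry.isDirectory()) {
                    names.add(entry.getName());
                }
            }
        }
        return names;
    }

    // Builds a small archive in memory: one directory entry and one file entry.
    public static byte[] sampleZip() throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zos = new ZipOutputStream(bytes)) {
            zos.putNextEntry(new ZipEntry("docs/"));
            zos.closeEntry();
            zos.putNextEntry(new ZipEntry("docs/readme.txt"));
            zos.write("hello".getBytes(StandardCharsets.UTF_8));
            zos.closeEntry();
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        System.out.println(fileNames(new ByteArrayInputStream(sampleZip())));
    }
}
```

A directory entry is recognised only by the trailing "/" in its name, which is why isDirectory() is enough to skip it.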

Tuesday, December 11, 2007

Create String XML without XMLDocument

How to create an XML document without using the XML Document object?
How to create an XML document using the Java API?

Is there an easy way to create an XML document programmatically without relying on any XML parser implementation?

I have created an XMLDocument based on Xerces, but my clients are using Oracle XDK. What should I do?


Yes. Parser implementations offer some features through their own customized methods and classes, but the same feature may not be available in another implementation.

The best approach is not to build your application on a particular XML parser implementation, and always to use the packages that come with rt.jar (javax.xml.*; org.w3c.*;). Some badly needed features may not be available in these packages, and then we are forced to use implementation-specific features of a parser.

Before using packages that do not follow the JSR 5 specification, double-check whether the same method is available in other parsers and how they handle the same feature.

Sometimes such a feature will give us a big headache.

The code for converting an XML Document object to a String is here: xml-to-string-object.html

An Element is a combination of child elements, attributes and a tag name, so create a class with these three components.

private String tagName;
private Map<String, String> attrList = new HashMap<String, String>();
private List<Tag> childList = new ArrayList<Tag>();

To build a document and get the XML content as output:
Tag tag = new Tag("Employees");
tag.getAttrList().put("name","Alex");
tag.getAttrList().put("address","ch 8509 zurich");
Tag depttag=new Tag("Department");
depttag.getAttrList().put("name","Engineering");
tag.getChildList().add(depttag);

System.out.println(tag.generatewithProcessTag());

The output will be:
<?xml version="1.0"?>
<Employees address="ch 8509 zurich" name="Alex">
<Department name="Engineering" />
</Employees>

To generate the attributes and values added to the Element tag:
for (Map.Entry<String, String> attribute : attrMap.entrySet()) {
    str.append(WHITE_SPACE).append(attribute.getKey()).append(EQUAL)
       .append(DOUBLE_QUOTE).append(attribute.getValue()).append(DOUBLE_QUOTE);
}

To create the Element as a String object:
StringBuilder str = new StringBuilder();
str.append(LT).append(getTagName());
if (!getAttrList().isEmpty()) {
    str.append(generateAtribute(getAttrList()));
}

if (!getChildList().isEmpty()) {
    str.append(GT);
    for (Tag childTag : getChildList()) {
        str.append(NEWLINE).append(childTag.generateTag());
    }
    // BACK_SLASH has to hold "/" here, despite its name, for the tags to be valid XML
    str.append(NEWLINE).append(LT).append(BACK_SLASH).append(getTagName()).append(GT);
} else {
    // No children: close the element in place
    str.append(WHITE_SPACE).append(BACK_SLASH).append(GT);
}
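Putting the fragments together, a complete class might look roughly like the sketch below. It is a reconstruction under assumptions, not the original 128-line class: the constructor, the escape helper and the body of generatewithProcessTag are guesses, the constants are written as literals, and a TreeMap replaces the HashMap so that the attribute order is predictable.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class Tag {
    private final String tagName;
    // Assumption: TreeMap instead of HashMap, so attributes come out sorted by name
    private final Map<String, String> attrList = new TreeMap<>();
    private final List<Tag> childList = new ArrayList<>();

    public Tag(String tagName) {
        this.tagName = tagName;
    }

    public String getTagName() { return tagName; }
    public Map<String, String> getAttrList() { return attrList; }
    public List<Tag> getChildList() { return childList; }

    // Hypothetical helper: escapes characters that would break a quoted attribute value
    private static String escape(String value) {
        return value.replace("&", "&amp;").replace("<", "&lt;")
                    .replace(">", "&gt;").replace("\"", "&quot;");
    }

    // Renders this element and, recursively, all of its children.
    public String generateTag() {
        StringBuilder str = new StringBuilder();
        str.append('<').append(tagName);
        for (Map.Entry<String, String> attribute : attrList.entrySet()) {
            str.append(' ').append(attribute.getKey())
               .append("=\"").append(escape(attribute.getValue())).append('"');
        }
        if (childList.isEmpty()) {
            return str.append(" />").toString();
        }
        str.append('>');
        for (Tag childTag : childList) {
            str.append('\n').append(childTag.generateTag());
        }
        return str.append('\n').append("</").append(tagName).append('>').toString();
    }

    // Assumed behaviour: the XML declaration followed by the element tree
    public String generatewithProcessTag() {
        return "<?xml version=\"1.0\"?>\n" + generateTag();
    }

    public static void main(String[] args) {
        Tag tag = new Tag("Employees");
        tag.getAttrList().put("name", "Alex");
        Tag depttag = new Tag("Department");
        depttag.getAttrList().put("name", "Engineering");
        tag.getChildList().add(depttag);
        System.out.println(tag.generatewithProcessTag());
    }
}
```

Escaping the attribute values is an addition that the fragments above do not show; without it a value containing a double quote or an ampersand would produce broken XML.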





Here I have explained almost everything needed to build an XML document without relying on any XML implementation. You can add any number of elements without worrying about the implementation of the org.w3c.dom.Document object, i.e. no XML parser implementation is required to create the XML document.

This implementation consists of a single class with 128 lines of code. It should make XML document creation faster (fewer features naturally means less work) and it uses less heap.


Please feel free to ask for the implementation by sending mail or leaving your valuable comments.

Sunday, December 2, 2007

Find Class in JAR

How do I find out which jar contains a required class?

I get a NoClassDefFoundError or ClassNotFoundException. Is there a way to identify which jar the missing class is located in, and in which package?



On Unix and Windows (with the Unix utilities) we have a spy, so to speak, named grep. This command searches files for specific words, and we can use it to find out which jar file contains a class.

Unix :
grep -r ClassName *.jar

Windows:
grep -r ClassName *

(ClassName stands for the name of the class you are looking for.)

On Windows, a wildcard for jar files only sometimes does not work. This may be an issue with my machine, but try customizing the wildcard, and if it works, show a little smile.


The above command lists both the jars that contain the class and the jars that merely refer to it. We have to identify the right one manually by walking through the contents of each JAR with tools like WinZip, WinRAR or StuffIt.
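The manual walk through the JAR contents can also be done with the java.util.jar API, which reports only the jars that really contain the class. The sketch below is one possible approach and not part of the grep method; the class name FindClassInJar and its find method are invented for illustration.

```java
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;

public class FindClassInJar {

    // Returns the fully qualified names of the classes in jarFile whose simple name matches.
    public static List<String> find(File jarFile, String simpleClassName) throws IOException {
        String suffix = simpleClassName + ".class";
        List<String> matches = new ArrayList<>();
        try (JarFile jar = new JarFile(jarFile)) {
            Enumeration<JarEntry> entries = jar.entries();
            while (entries.hasMoreElements()) {
                String name = entries.nextElement().getName();
                // Match "Foo.class" in the default package or ".../Foo.class" in any package
                if (name.equals(suffix) || name.endsWith("/" + suffix)) {
                    String className = name.substring(0, name.length() - ".class".length());
                    matches.add(className.replace('/', '.'));
                }
            }
        }
        return matches;
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 2) {
            System.out.println("Syntax : FindClassInJar ClassName jarfile...");
            return;
        }
        for (int i = 1; i < args.length; i++) {
            for (String match : find(new File(args[i]), args[0])) {
                System.out.println(args[i] + " : " + match);
            }
        }
    }
}
```

Run it as java FindClassInJar ClassName *.jar from a Unix shell. The output shows the package as well, which answers the second half of the question.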

Saturday, December 1, 2007

XML to String Object

How to convert an XML file to a String?
How to convert an XML Node to String format?

Oh, my XML parser implementation has no direct way to convert a Node to a String. The toString() method just returns object information. What to do?

Yes, in this situation we have to write our own logic to convert the Node to a String object. The javax.xml packages from Sun Microsystems can transform an XML document to a String irrespective of the XML parser implementation (refer to JSR 5, JSR 173, etc.).

Transformation is possible even down to the individual Node level.

  • XML Document to String
  • XML element to String
  • XML Node to String

Import the following classes in addition to your application's own imports and the org.w3c.* package.


import java.io.StringWriter;
import javax.xml.transform.Result;
import javax.xml.transform.Source;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerConfigurationException;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

Write the following method and simply convert your Node to a String:


public static String xmlToString(Node node) {
    try {
        // Wrap the DOM node as the input and a StringWriter as the output
        Source source = new DOMSource(node);
        StringWriter stringWriter = new StringWriter();
        Result result = new StreamResult(stringWriter);
        // A Transformer created without a stylesheet copies the source to the result
        TransformerFactory factory = TransformerFactory.newInstance();
        Transformer transformer = factory.newTransformer();
        transformer.transform(source, result);
        return stringWriter.getBuffer().toString();
    } catch (TransformerConfigurationException e) {
        e.printStackTrace();
    } catch (TransformerException e) {
        e.printStackTrace();
    }
    // Reached only when the transformation failed
    return null;
}
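To try the method end to end, a Document has to come from somewhere. The sketch below parses a String with the standard JAXP parser and serializes a Document and a single Node. The class name NodeToString and the method names toXml and parse are made up here, and it differs from the method above in two ways: it omits the XML declaration and it throws the exception instead of returning null.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class NodeToString {

    // Serializes any DOM node (Document, Element, ...) without the XML declaration.
    public static String toXml(Node node) throws TransformerException {
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter writer = new StringWriter();
        transformer.transform(new DOMSource(node), new StreamResult(writer));
        return writer.toString();
    }

    // Parses an XML string with whichever JAXP parser is configured on this JVM.
    public static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse("<Employees><Department name=\"Engineering\"/></Employees>");
        System.out.println(toXml(doc));
        // A single child node can be serialized in the same way
        System.out.println(toXml(doc.getDocumentElement().getFirstChild()));
    }
}
```

Because only javax.xml and org.w3c classes are used, the same code should run unchanged whichever parser implementation is plugged in.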

With the Oracle implementation, we can print an XMLDocument with a single statement.

((XMLDocument)doc).print(System.out); 
//any of the output stream , enjoy :)

This blog is intended to share knowledge and contribute to the Java community by providing samples and pointing to the right documents and web pages. We share to the best of our knowledge, and no guarantee can be claimed for correctness. For copyright and terms of policy, refer to blogspot.com.
