1 \input texinfo @c -*-texinfo-*-
4 @setfilename hacking.info
5 @settitle GNU Classpath Hacker's Guide
11 This file contains important information you will need to know if you
12 are going to hack on the GNU Classpath project code.
14 Copyright (C) 1998,1999,2000,2001,2002,2003,2004, 2005 Free Software Foundation, Inc.
17 @dircategory GNU Libraries
19 * Classpath Hacking: (hacking). GNU Classpath Hacker's Guide
25 @title GNU Classpath Hacker's Guide
27 @author Paul N. Fisher
29 @author C. Brian Jones
30 @author Mark J. Wielaard
33 @vskip 0pt plus 1filll
34 Copyright @copyright{} 1998,1999,2000,2001,2002,2003,2004 Free Software Foundation, Inc.
36 Permission is granted to make and distribute verbatim copies of
37 this document provided the copyright notice and this permission notice
38 are preserved on all copies.
40 Permission is granted to copy and distribute modified versions of this
41 document under the conditions for verbatim copying, provided that the
42 entire resulting derived work is distributed under the terms of a
43 permission notice identical to this one.
45 Permission is granted to copy and distribute translations of this manual
46 into another language, under the above conditions for modified versions,
47 except that this permission notice may be stated in a translation
48 approved by the Free Software Foundation.
53 @node Top, Introduction, (dir), (dir)
54 @top GNU Classpath Hacker's Guide
56 This document contains important information you'll want to know if
57 you want to hack on GNU Classpath, Essential Libraries for Java, to
58 help create free core class libraries for use with virtual machines
59 and compilers for the java programming language.
63 * Introduction:: An introduction to the GNU Classpath project
64 * Requirements:: Very important rules that must be followed
65 * Volunteering:: So you want to help out
66 * Project Goals:: Goals of the GNU Classpath project
67 * Needed Tools and Libraries:: A list of programs and libraries you will need
68 * Programming Standards:: Standards to use when writing code
69 * Hacking Code:: Working on code, Working with others
70 * Programming Goals:: What to consider when writing code
71 * API Compatibility:: How to handle serialization and deprecated methods
72 * Specification Sources:: Where to find class library specs
73 * Naming Conventions:: How files and directories are named
74 * Character Conversions:: Working on Character conversions
75 * Localization:: How to handle localization/internationalization
78 --- The Detailed Node Listing ---
82 * Source Code Style Guide::
84 Working on the code, Working with others
87 * Writing ChangeLogs::
91 * Writing ChangeLogs::
95 * Portability:: Writing Portable Software
96 * Utility Classes:: Reusing Software
97 * Robustness:: Writing Robust Software
98 * Java Efficiency:: Writing Efficient Java
99 * Native Efficiency:: Writing Efficient JNI
100 * Security:: Writing Secure Software
104 * Serialization:: Serialization
105 * Deprecated Methods:: Deprecated methods
109 * String Collation:: Sorting strings in different locales
110 * Break Iteration:: Breaking up text into words, sentences, and lines
111 * Date Formatting and Parsing:: Locale specific date handling
* Decimal/Currency Formatting and Parsing:: Locale specific number handling
117 @node Introduction, Requirements, Top, Top
118 @comment node-name, next, previous, up
119 @chapter Introduction
The GNU Classpath Project is dedicated to providing a 100% free,
clean room implementation of the standard core class libraries for
compilers and runtime environments for the java programming language.
It offers free software developers an alternative core library
implementation upon which larger java-like programming environments
can be built. The GNU Classpath Project was started in the Spring of
127 1998 as an official Free Software Foundation project. Most of the
128 volunteers working on GNU Classpath do so in their spare time, but a
129 couple of projects based on GNU Classpath have paid programmers to
130 improve the core libraries. We appreciate everyone's efforts in the
131 past to improve and help the project and look forward to future
132 contributions by old and new members alike.
134 @node Requirements, Volunteering, Introduction, Top
135 @comment node-name, next, previous, up
136 @chapter Requirements
138 Although GNU Classpath is following an open development model where input
139 from developers is welcome, there are certain base requirements that
140 need to be met by anyone who wants to contribute code to this project.
141 They are mostly dictated by legal requirements and are not arbitrary
142 restrictions chosen by the GNU Classpath team.
144 You will need to adhere to the following things if you want to donate
145 code to the GNU Classpath project:
149 @strong{Never under any circumstances refer to proprietary code while
150 working on GNU Classpath.} It is best if you have never looked at
151 alternative proprietary core library code at all. To reduce
152 temptation, it would be best if you deleted the @file{src.zip} file
153 from your proprietary JDK distribution (note that recent versions of
GNU Classpath and the compilers and environments built on it are
155 mature enough to not need any proprietary implementation at all when
156 working on GNU Classpath, except in exceptional cases where you need
157 to test compatibility issues pointed out by users). If you have
158 signed Sun's non-disclosure statement, then you unfortunately cannot
159 work on Classpath code at all. If you have any reason to believe that
160 your code might be ``tainted'', please say something on the mailing
161 list before writing anything. If it turns out that your code was not
162 developed in a clean room environment, we could be very embarrassed
163 someday in court. Please don't let that happen.
166 @strong{Never decompile proprietary class library implementations.} While
167 the wording of the license in Sun's Java 2 releases has changed, it is
168 not acceptable, under any circumstances, for a person working on
169 GNU Classpath to decompile Sun's class libraries. Allowing the use of
170 decompilation in the GNU Classpath project would open up a giant can of
171 legal worms, which we wish to avoid.
174 Classpath is licensed under the terms of the
175 @uref{http://www.fsf.org/copyleft/gpl.html,GNU General Public
176 License}, with a special exception included to allow linking with
177 non-GPL licensed works as long as no other license would restrict such
178 linking. To preserve freedom for all users and to maintain uniform
179 licensing of Classpath, we will not accept code into the main
180 distribution that is not licensed under these terms. The exact
181 wording of the license of the current version of GNU Classpath can be
182 found online from the
183 @uref{http://www.gnu.org/software/classpath/license.html, GNU
Classpath license page} and is of course distributed with the current
snapshot release from @uref{ftp://ftp.gnu.org/gnu/classpath/} or by
186 obtaining a copy of the current CVS tree.
189 GNU Classpath is GNU software and this project is being officially sponsored
190 by the @uref{http://www.fsf.org/,Free Software Foundation}. Because of
191 this, the FSF will hold copyright to all code developed as part of
192 GNU Classpath. This will allow them to pursue copyright violators in court,
something an individual developer may have neither the time nor the
resources to do. Everyone contributing code to GNU Classpath will need to
195 sign a copyright assignment statement. Additionally, if you are
196 employed as a programmer, your employer may need to sign a copyright
197 waiver disclaiming all interest in the software. This may sound harsh,
198 but unfortunately, it is the only way to ensure that the code you write
199 is legally yours to distribute.
202 @node Volunteering, Project Goals, Requirements, Top
203 @comment node-name, next, previous, up
204 @chapter Volunteering to Help
206 The GNU Classpath project needs volunteers to help us out. People are
207 needed to write unimplemented core packages, to test GNU Classpath on
208 free software programs written in the java programming language, to
209 test it on various platforms, and to port it to platforms that are
210 currently unsupported.
While pretty much all contributions are welcome (but
@pxref{Requirements}), it is always preferable that volunteers do the
214 whole job when volunteering for a task. So when you volunteer to write
215 a Java package, please be willing to do the following:
219 Implement a complete drop-in replacement for the particular package.
220 That means implementing any ``internal'' classes. For example, in the
221 java.net package, there are non-public classes for implementing sockets.
222 Without those classes, the public socket interface is useless. But do
223 not feel obligated to completely implement all of the functionality at
224 once. For example, in the java.net package, there are different types
of protocol handlers for different types of URLs. Not all of these
226 need to be written at once.
229 Please write complete and thorough API documentation comments for
230 every public and protected method and variable. These should be
superior to Sun's and cover everything about the item being documented.
235 Please write a regression test package that can be used to run tests
236 of your package's functionality. GNU Classpath uses the
237 @uref{http://sources.redhat.com/mauve/,Mauve project} for testing the
238 functionality of the core class libraries. The Classpath Project is
239 fast approaching the point in time where all modifications to the
240 source code repository will require appropriate test cases in Mauve to
241 ensure correctness and prevent regressions.
244 Writing good documentation, tests and fixing bugs should be every
245 developer's top priority in order to reach the elusive release of
248 @node Project Goals, Needed Tools and Libraries, Volunteering, Top
249 @comment node-name, next, previous, up
250 @chapter Project Goals
252 The goal of the Classpath project is to produce a
253 @uref{http://www.fsf.org/philosophy/free-sw.html,free} implementation of
254 the standard class library for Java. However, there are other more
255 specific goals as to which platforms should be supported.
257 Classpath is targeted to support the following operating systems:
Free operating systems. This includes GNU/Linux, GNU/Hurd, and the free
BSDs.
265 Other Unix-like operating systems.
268 Platforms which currently have no Java support at all.
271 Other platforms such as MS-Windows.
274 While free operating systems are the top priority, the other priorities
275 can shift depending on whether or not there is a volunteer to port
276 Classpath to those platforms and to test releases.
Eventually we hope Classpath will support all JVMs that provide
JNI or CNI support. However, the top priority is free JVMs. A small
280 list of Compiler/VM environments that are currently actively
281 incorporating GNU Classpath is below. A more complete overview of
projects based on GNU Classpath can be found online at
283 @uref{http://www.gnu.org/software/classpath/stories.html,the GNU
284 Classpath stories page}.
288 @uref{http://gcc.gnu.org/java/,GCJ}
290 @uref{http://jamvm.sourceforge.net/,jamvm}
292 @uref{http://kissme.sourceforge.net/,Kissme}
294 @uref{http://www.ibm.com/developerworks/oss/jikesrvm/,Jikes RVM}
296 @uref{http://www.sablevm.org/,SableVM}
298 @uref{http://www.kaffe.org/,Kaffe}
301 As with OS platform support, this priority list could change if a
302 volunteer comes forward to port, maintain, and test releases for a
particular JVM. Since gcj is part of the GNU Compiler Collection it
is one of the most important targets. But since it doesn't currently
work out of the box with GNU Classpath it is currently not the easiest
target. When hacking on GNU Classpath the easiest approach is to use
compilers and runtime environments that work out of the box with
308 it, such as the jikes compiler and the runtime environments jamvm and
309 kissme. But you can also work directly with targets like gcj and
310 kaffe that have their own copy of GNU Classpath currently. In that
311 case changes have to be merged back into GNU Classpath proper though,
312 which is sometimes more work. SableVM is starting to migrate from an
313 integrated GNU Classpath version to being usable with GNU Classpath
317 The initial target version for Classpath is the 1.1 spec. Higher
318 versions can be implemented (and have been implemented, including lots
319 of 1.4 functionality) if desired, but please do not create classes
320 that depend on features in those packages unless GNU Classpath already
321 contains those features. GNU Classpath has been free of any
322 proprietary dependencies for a long time now and we like to keep it
323 that way. But finishing, polishing up, documenting, testing and
debugging current functionality is of higher priority than adding new
functionality.
327 @node Needed Tools and Libraries, Programming Standards, Project Goals, Top
328 @comment node-name, next, previous, up
329 @chapter Needed Tools and Libraries
If you want to hack on Classpath, you should at least download and
install the following tools and try to familiarize yourself with
them, although in most cases having these tools installed will be all
you really need to know about them. Also note that when working on
335 (snapshot) releases only GCC 3.3+ (plus a free VM from the list above
336 and the libraries listed below) is needed. The other tools are only
337 needed when working directly on the CVS version.
356 All of these tools are available from
357 @uref{ftp://gnudist.gnu.org/pub/gnu/,gnudist.gnu.org} via anonymous
358 ftp, except CVS which is available from
359 @uref{http://www.cvshome.org/,www.cvshome.org}. They are fully
360 documented with texinfo manuals. Texinfo can be browsed with the
361 Emacs editor, or with the text editor of your choice, or transformed
362 into nicely printable Postscript.
364 Here is a brief description of the purpose of those tools.
369 The GNU Compiler Collection. This contains a C compiler (gcc) for
370 compiling the native C code and a compiler for the java programming
371 language (gcj). You will need at least gcj version 3.3 or higher. If
372 that version is not available for your platform you can try the
373 @uref{http://www.jikes.org/, jikes compiler}. We try to keep all code
374 compilable with both gcj and jikes at all times.
377 A version control system that maintains a centralized Internet
378 repository of all code in the Classpath system.
381 This tool automatically creates Makefile.in files from Makefile.am
382 files. The Makefile.in is turned into a Makefile by autoconf. Why
383 use this? Because it automatically generates every makefile target
384 you would ever want (clean, install, dist, etc) in full compliance
385 with the GNU coding standards. It also simplifies Makefile creation
386 in a number of ways that cannot be described here. Read the docs for
390 Automatically configures a package for the platform on which it is
391 being built and generates the Makefile for that platform.
394 Handles all of the zillions of hairy platform specific options needed
395 to build shared libraries.
398 The free GNU replacement for the standard Unix macro processor.
Proprietary m4 programs are broken, so GNU m4 is required for
autoconf to work, though knowing a lot about GNU m4 is not required to
404 Larry Wall's scripting language. It is used internally by automake.
407 Manuals and documentation (like this guide) are written in texinfo.
408 Texinfo is the official documentation format of the GNU project.
409 Texinfo uses a single source file to produce output in a number of formats,
410 both online and printed (dvi, info, html, xml, etc.). This means that
instead of writing one document for online information and another
412 for a printed manual, you need write only one document. And when the work
413 is revised, you need revise only that one document.
418 For compiling the native AWT libraries you need to have the following
423 @uref{http://www.gtk.org/,GTK+} is a multi-platform toolkit for
424 creating graphical user interfaces. It is used as the basis of the
425 GNU desktop project GNOME.
428 @uref{http://www.gnome.org/start/,gdk-pixbuf} is a GNOME library for
433 GNU Classpath comes with a couple of libraries included in the source
434 that are not part of GNU Classpath proper, but that have been included
435 to provide certain needed functionality. All these external libraries
should be clearly marked as such. In general we try to use the clean
upstream versions of these sources as much as possible. That way
merging in new versions will be easiest. You should always try to get
439 bug fixes to these files accepted upstream first. Currently we
440 include the following 'external' libraries. Most of these sources are
441 included in the @file{external} directory. That directory also
442 contains a @file{README} file explaining how to import newer versions.
447 Can be found in @file{external/jaxp}. Provides javax.xml, org.w3c and
448 org.xml packages. Upstream is
449 @uref{http://www.gnu.org/software/classpathx/,GNU ClasspathX}.
452 Can be found in @file{native/fdlibm}. Provides native implementations
453 of some of the Float and Double operations. Upstream is
@uref{http://gcc.gnu.org/java/,libgcj}, which in turn syncs with the
'real' upstream @uref{http://www.netlib.org/fdlibm/readme}. See also
456 java.lang.StrictMath.
461 @node Programming Standards, Hacking Code, Needed Tools and Libraries, Top
462 @comment node-name, next, previous, up
463 @chapter Programming Standards
465 For C source code, follow the
466 @uref{http://www.gnu.org/prep/standards/,GNU Coding Standards}.
467 The standards also specify various things like the install directory
468 structure. These should be followed if possible.
470 For Java source code, please follow the
471 @uref{http://www.gnu.org/prep/standards/,GNU Coding
472 Standards}, as much as possible. There are a number of exceptions to
473 the GNU Coding Standards that we make for GNU Classpath as documented
474 in this guide. We will hopefully be providing developers with a code
475 formatting tool that closely matches those rules soon.
477 For API documentation comments, please follow
478 @uref{http://java.sun.com/products/jdk/javadoc/writingdoccomments.html,How
479 to Write Doc Comments for Javadoc}. We would like to have a set of
480 guidelines more tailored to GNU Classpath as part of this document.
483 * Source Code Style Guide::
486 @node Source Code Style Guide, , Programming Standards, Programming Standards
487 @comment node-name, next, previous, up
488 @section Java source coding style
490 Here is a list of some specific rules used when hacking on GNU
491 Classpath java source code. We try to follow the standard
492 @uref{http://www.gnu.org/prep/standards/,GNU Coding Standards}
493 for that. There are lots of tools that can automatically generate it
494 (although most tools assume C source, not java source code) and it
495 seems as good a standard as any. There are a couple of exceptions and
496 specific rules when hacking on GNU Classpath java source code however.
497 The following lists how code is formatted (and some other code
504 Java source files in GNU Classpath are encoded using UTF-8. However,
505 ordinarily it is considered best practice to use the ASCII subset of
506 UTF-8 and write non-ASCII characters using \u escapes.
If possible, prefer specific (expanded) imports over java.io.* style
imports. Order by gnu, java, javax, org. There must be one blank line
between each group. The imports themselves are ordered alphabetically by
package name. Classes and interfaces occur before sub-packages. The
classes/interfaces are then also sorted alphabetically. Note that uppercase
characters occur before lowercase characters.
517 import gnu.java.awt.EmbeddedWindow;
519 import java.io.IOException;
520 import java.io.InputStream;
522 import javax.swing.JFrame;
526 Blank line after package statement, last import statement, classes,
530 Opening/closing brace for class and method is at the same level of
531 indent as the declaration. All other braces are indented and content
532 between braces indented again.
535 Since method definitions don't start in column zero anyway (since they
are always inside a class definition), the rationale for easy grepping
537 for ``^method_def'' is mostly gone already. Since it is customary for
538 almost everybody who writes java source code to put modifiers, return
539 value and method name on the same line, we do too.
@c fixme Another rationale for always indenting the method definition is that it makes it a bit easier to distinguish methods in inner and anonymous classes from code in their enclosing context. NEED EXAMPLE.
544 Implements and extends on separate lines, throws too. Indent extends,
545 implements, throws. Apply deep indentation for method arguments.
547 @c fixme Needs example.
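A minimal sketch of this layout, assuming the brace and indentation
rules described in this list. The interface @code{ByteCounter} and the
class @code{CountingInputStream} are hypothetical names chosen only
for illustration; they are not part of GNU Classpath.

```java
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical interface, used only to illustrate the layout.
interface ByteCounter
{
  long getCount();
}

// Hypothetical class: extends and implements are wrapped and indented.
class CountingInputStream
  extends FilterInputStream
  implements ByteCounter
{
  private long count;

  CountingInputStream(InputStream in)
  {
    super(in);
  }

  // Modifiers, return type and method name share one line;
  // the throws clause goes on its own indented line.
  public int read()
    throws IOException
  {
    int b = super.read();
    if (b != -1)
      count++;
    return b;
  }

  // Deep indentation: wrapped arguments line up under the first one.
  public int read(byte[] buffer,
                  int offset,
                  int length)
    throws IOException
  {
    int n = super.read(buffer, offset, length);
    if (n > 0)
      count += n;
    return n;
  }

  public long getCount()
  {
    return count;
  }
}
```

Class and method braces sit at the same indentation as their
declaration, while the bodies of @code{if} statements are indented one
further level.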
550 Don't add a space between a method or constructor call/definition and
551 the open-bracket. This is because often the return value is an object on
552 which you want to apply another method or from which you want to access
558 getToolkit ().createWindow (this);
563 getToolkit().createWindow(this);
The GNU Coding Standards give examples for almost every construct
(if, switch, do, while, etc.). One missing is the try-catch construct
569 which should be formatted as:
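A minimal sketch of a try-catch block, assuming that the brace rule
stated earlier (braces other than those of classes and methods are
indented, and their content is indented again) applies here as well.
The class @code{ParseDemo} and its method are hypothetical.

```java
// Hypothetical helper, used only to illustrate the try-catch layout.
class ParseDemo
{
  static int parseOrDefault(String text, int fallback)
  {
    try
      {
        // Code that can throw goes here.
        return Integer.parseInt(text);
      }
    catch (NumberFormatException nfe)
      {
        // Handle the failure.
        return fallback;
      }
  }
}
```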
583 Wrap lines at 80 characters after assignments and before operators.
584 Wrap always before extends, implements, throws, and labels.
587 Don't put multiple class definitions in the same file, except for
588 inner classes. File names (plus .java) and class names should be the
592 Don't catch a @code{NullPointerException} as an alternative to simply
593 checking for @code{null}. It is clearer and usually more efficient
594 to simply write an explicit check.
596 For instance, don't write:
603 catch (NullPointerException _)
609 If your intent above is to check whether @samp{foo} is @code{null},
620 Don't use redundant modifiers or other redundant constructs. Here is
621 some sample code that shows various redundant items in comments:
624 /*import java.lang.Integer;*/
625 /*abstract*/ interface I @{
626 /*public abstract*/ void m();
627 /*public static final*/ int i = 1;
628 /*public static*/ class Inner @{@}
630 final class C /*extends Object*/ @{
631 /*final*/ void m() @{@}
635 Note that Jikes will generate warnings for redundant modifiers if you
636 use @code{+Predundant-modifiers} on the command line.
639 Modifiers should be listed in the standard order recommended by the
640 JLS. Jikes will warn for this when given @code{+Pmodifier-order}.
643 Because the output of different compilers differs, we have
644 standardized on explicitly specifying @code{serialVersionUID} in
645 @code{Serializable} classes in Classpath. This field should be
646 declared as @code{private static final}. Note that a class may be
647 @code{Serializable} without being explicitly marked as such, due to
648 inheritance. For instance, all subclasses of @code{Throwable} need to
649 have @code{serialVersionUID} declared.
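As a sketch of this rule, a @code{Throwable} subclass might declare the
field as follows. The class name @code{CacheCorruptException} is
hypothetical and the value @code{1L} is only a placeholder; a real
class would use whatever value its serialized form requires.

```java
// Hypothetical exception class, used only to illustrate the declaration.
class CacheCorruptException
  extends Exception
{
  // Throwable implements Serializable, so this subclass is Serializable
  // too and declares the field explicitly. The value is a placeholder.
  private static final long serialVersionUID = 1L;

  CacheCorruptException(String message)
  {
    super(message);
  }
}
```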
651 @c fixme link to the discussion
654 Don't declare unchecked exceptions in the @code{throws} clause of a
655 method. However, if throwing an unchecked exception is part of the
656 method's API, you should mention it in the Javadoc. There is one
657 important exception to this rule, which is that a stub method should
658 be marked as throwing @code{gnu.classpath.NotImplementedException}.
659 This will let our API comparison tools note that the method is not
663 When overriding @code{Object.equals}, remember that @code{instanceof}
664 filters out @code{null}, so an explicit check is not needed.
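A minimal sketch of such an @code{equals} method. The value class
@code{Version} is hypothetical and exists only to show that the
@code{instanceof} test already covers the @code{null} case.

```java
// Hypothetical value class, used only to illustrate the rule.
final class Version
{
  private final int major;
  private final int minor;

  Version(int major, int minor)
  {
    this.major = major;
    this.minor = minor;
  }

  public boolean equals(Object obj)
  {
    // instanceof is false for null, so no separate null check is needed.
    if (! (obj instanceof Version))
      return false;
    Version other = (Version) obj;
    return major == other.major && minor == other.minor;
  }

  public int hashCode()
  {
    return 31 * major + minor;
  }
}
```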
667 When catching an exception and rethrowing a new exception you should
668 ``chain'' the Throwables. Don't just add the String representation of
669 the caught exception.
674 // Some code that can throw
676 catch (IOException ioe)
throw (SQLException) new SQLException("Database corrupt").initCause(ioe);
683 Avoid the use of reserved words for identifiers. This is obvious with those
684 such as @code{if} and @code{while} which have always been part of the Java
685 programming language, but you should be careful about accidentally using
686 words which have been added in later versions. Notable examples are
687 @code{assert} (added in 1.4) and @code{enum} (added in 1.5). Jikes will warn
688 of the use of the word @code{enum}, but, as it doesn't yet support the 1.5
689 version of the language, it will still allow this usage through. A
690 compiler which supports 1.5 (e.g. the Eclipse compiler, ecj) will simply
691 fail to compile the offending source code.
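A common case is older code that names an @code{Enumeration} variable
@code{enum}. The sketch below, using the hypothetical class
@code{NameJoiner}, shows the same loop with a different identifier so
that it compiles with both older and newer compilers.

```java
import java.util.Enumeration;
import java.util.Vector;

// Hypothetical helper, used only to illustrate the naming rule.
class NameJoiner
{
  static String join(Vector<String> names)
  {
    StringBuilder result = new StringBuilder();
    // Not "Enumeration enum = ...": enum is a keyword since 1.5.
    Enumeration<String> e = names.elements();
    while (e.hasMoreElements())
      {
        result.append(e.nextElement());
        if (e.hasMoreElements())
          result.append(", ");
      }
    return result.toString();
  }
}
```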
693 @c fixme Describe Anonymous classes (example).
@c fixme Describe Naming conventions when different from GNU Coding Standards.
@c fixme Describe API doc javadoc tags used.
699 Some things are the same as in the normal GNU Coding Standards:
704 Unnecessary braces can be removed, one line after an if, for, while as
708 Space around operators (assignment, logical, relational, bitwise,
709 mathematical, shift).
712 Blank line before single-line comments, multi-line comments, javadoc
716 If more than 2 blank lines, trim to 2.
719 Don't keep commented out code. Just remove it or add a real comment
720 describing what it used to do and why it is changed to the current
725 @node Hacking Code, Programming Goals, Programming Standards, Top
726 @comment node-name, next, previous, up
727 @chapter Working on the code, Working with others
729 There are a lot of people helping out with GNU Classpath. Here are a
730 couple of practical guidelines to make working together on the code
733 The main thing is to always discuss what you are up to on the
mailing list. Making sure that everybody knows who is working on what
735 is the most important thing to make sure we cooperate most
739 @uref{http://www.gnu.org/software/classpath/tasks.html,Task List}
740 which contains items that you might want to work on.
742 Before starting to work on something please make sure you read this
complete guide. And discuss it on list to make sure your work does
not duplicate or interfere with work someone else is already doing.
Always make sure that you submit things that are your own work, and
that you have paperwork on file (as stated in the requirements
747 section) with the FSF authorizing the use of your additions.
749 Technically the GNU Classpath project is hosted on
@uref{http://savannah.gnu.org/,Savannah}, a central point for
751 development, distribution and maintenance of GNU Software. Here you
753 @uref{https://savannah.gnu.org/projects/classpath/,project page}, bug
754 reports, pending patches, links to mailing lists, news items and CVS.
756 You can find instructions on getting a CVS checkout for classpath at
757 @uref{https://savannah.gnu.org/cvs/?group=classpath}.
759 You don't have to get CVS commit write access to contribute, but it is
760 sometimes more convenient to be able to add your changes directly to
761 the project CVS. Please contact the GNU Classpath savannah admins to
762 arrange CVS access if you would like to have it.
Make sure to be subscribed to the commit-classpath mailing list while
765 you are actively hacking on Classpath. You have to send patches (cvs
766 diff -uN) to this list before committing.
768 We really want to have a pretty open check-in policy. But this means
769 that you should be extra careful if you check something in. If at all
770 in doubt or if you think that something might need extra explaining
771 since it is not completely obvious please make a little announcement
about the change on the mailing list. And if you do commit something
without discussing it first and another GNU Classpath hacker asks for
extra explanation or suggests reverting a certain commit then please
reply to the request by explaining why something should be so or
by saying that you agree to revert it. (Just reverting immediately is OK without
777 discussion, but then please don't mix it with other changes and please
Patches that are already approved for libgcj are also OK for Classpath.
781 (But you still have to send a patch/diff to the list.) All other
782 patches require you to think whether or not they are really OK and
783 non-controversial, or if you would like some feedback first on them
784 before committing. We might get real commit rules in the future, for
785 now use your own judgment, but be a bit conservative.
787 Always contact the GNU Classpath maintainer before adding anything
788 non-trivial that you didn't write yourself and that does not come from
789 libgcj or from another known GNU Classpath or libgcj hacker. If you
790 have been assigned to commit changes on behalf of another project or
791 a company always make sure they come from people who have signed the
792 papers for the FSF and/or fall under the arrangement your company made
793 with the FSF for contributions. Mention in the ChangeLog who actually
Completely unrelated changes should be committed
separately (especially when doing a formatting change and a logical
change, do them in two separate commits). But do try to commit
together all the things/files that were done at the same time and can
logically be seen as part of the same change/cleanup etc.
802 When the change fixes an important bug or adds nice new functionality
803 please write a short entry for inclusion in the @file{NEWS} file. If it
804 changes the VM interface you must mention that in both the @file{NEWS} file
805 and the VM Integration Guide.
807 All the ``rules'' are really meant to make sure that GNU Classpath
808 will be maintainable in the long run and to give all the projects that
809 are now using GNU Classpath an accurate view of the changes we make to
810 the code and to see what changed when. If you think the requirements
811 are ``unworkable'' please try it first for a couple of weeks. If you
812 still feel the same after having some more experience with the project
813 please feel free to bring up suggestions for improvements on the list.
814 But don't just ignore the rules! Other hackers depend on them being
815 followed to be the most productive they can be (given the above
820 * Writing ChangeLogs::
823 @node Branches, Writing ChangeLogs, Hacking Code, Hacking Code
824 @comment node-name, next, previous, up
825 @section Working with branches
Sometimes it is necessary to create a branch of the source for doing new
828 work that is disruptive to the other hackers, or that needs new
829 language or libraries not yet (easily) available.
Every GNU Classpath hacker with commit access should feel free to
create a branch after discussing it with the other hackers on the main
mailing list, explaining the need for the branch and suggesting
the particular branch rules (what will be done on the branch, who will
work on it, whether there will be different commit guidelines than for the
mainline trunk, and when the branch is estimated to be finished and
merged back into the trunk). There are however a couple
838 of rules that every branch should follow:
842 @item All branches ought to be documented in the developer wiki at
843 @uref{http://developer.classpath.org/mediation/ClasspathBranches}, so
844 we can know which are live, who owns them, and when they die.
846 @item Some rules can be changed on a branch. In particular the branch
847 maintainer can change the review requirements, and the requirement of
848 keeping things building, testing, etc, can also be lifted. (These
849 should be documented along with the branch name and owner if they
850 differ from the trunk.)
852 @item Requirements for patch email to classpath-patches and for paperwork
853 @strong{cannot} be lifted. See @ref{Requirements}.
@item A branch should not be seen as ``private'' or
``may be completely broken''. As much as possible it should be
something that you work on with a team (and if there is no team yet,
nothing discourages others from helping out as much as a completely
broken build). There can of course be occasional breakage, but
it should be planned and explained. And you can certainly have a rule
like ``please ask me before committing to this branch''.
@item Merges from the trunk to a branch are at the discretion of the
branch maintainer.
@item A merge from a branch to the trunk is treated like any other patch.
In particular, it has to go through review and it must satisfy all the
trunk requirements (build, regression test, documentation).
870 @item There may be additional timing requirements on merging a branch to
871 the trunk depending on the release schedule, etc. For instance we may
872 not want to do a branch merge just before a release.
876 If any of these rules are unclear please discuss on the list first.
879 * Writing ChangeLogs::
882 @node Writing ChangeLogs, , Branches, Hacking Code
883 @comment node-name, next, previous, up
884 @section Documenting what changed when with ChangeLog entries
886 To keep track of who did what when we keep an explicit ChangeLog entry
887 together with the code. This mirrors the CVS commit messages and in
888 general the ChangeLog entry is the same as the CVS commit message.
889 This provides an easy way for people getting a (snapshot) release or
890 without access to the CVS server to see what happened when. We do not
891 generate the ChangeLog file automatically from the CVS server since
892 that is not reliable.
894 A good ChangeLog entry guideline can be found in the Guile Manual at
895 @uref{http://www.gnu.org/software/guile/changelogs/guile-changelogs_3.html}.
Here are some examples to explain what should or shouldn't be in a
ChangeLog entry (and the corresponding commit message):
903 The first line of a ChangeLog entry should be:
906 [date] <two spaces> [full name] <two spaces> [email-contact]
909 The second line should be blank. All other lines should be indented
913 Just state what was changed. Why something is done as it is done in
914 the current code should be either stated in the code itself or be
915 added to one of the documentation files (like this Hacking Guide).
920 * java/awt/font/OpenType.java: Remove 'public static final'
921 from OpenType tags, reverting the change of 2003-08-11. See
922 Classpath discussion list of 2003-08-11.
928 * java/awt/font/OpenType.java: Remove 'public static final' from
932 In this case the reason for the change was added to this guide.
Just as with the normal code style guide, don't make lines longer than
Just as with comments in the code, the ChangeLog entry should be a
full sentence, starting with a capital and ending with a period.
943 Be precise in what changed, not the effect of the change (which should
944 be clear from the code/patch). So don't write:
947 * java/io/ObjectOutputStream.java : Allow putFields be called more
951 But explain what changed and in which methods it was changed:
954 * java/io/ObjectOutputStream.java (putFields): Don't call
955 markFieldsWritten(). Only create new PutField when
956 currentPutField is null.
957 (writeFields): Call markFieldsWritten().
The above are all just guidelines. We all appreciate the fact that writing
ChangeLog entries, using a coding style that is not ``your own'', and the
CVS, patch and diff tools do take some time to get used to. So don't
feel like you have to do it perfectly right away or that contributions
aren't welcome if they aren't ``perfect''. We all learn by doing and
interacting with each other.
970 @node Programming Goals, API Compatibility, Hacking Code, Top
971 @comment node-name, next, previous, up
972 @chapter Programming Goals
974 When you write code for Classpath, write with three things in mind, and
975 in the following order: portability, robustness, and efficiency.
977 If efficiency breaks portability or robustness, then don't do it the
978 efficient way. If robustness breaks portability, then bye-bye robust
979 code. Of course, as a programmer you would probably like to find sneaky
980 ways to get around the issue so that your code can be all three ... the
981 following chapters will give some hints on how to do this.
984 * Portability:: Writing Portable Software
985 * Utility Classes:: Reusing Software
986 * Robustness:: Writing Robust Software
987 * Java Efficiency:: Writing Efficient Java
988 * Native Efficiency:: Writing Efficient JNI
989 * Security:: Writing Secure Software
992 @node Portability, Utility Classes, Programming Goals, Programming Goals
993 @comment node-name, next, previous, up
996 The portability goal for Classpath is the following:
1000 native functions for each platform that work across all VMs on that
a single classfile set that works across all VMs on all platforms that
support the native functions.
1007 For almost all of Classpath, this is a very feasible goal, using a
1008 combination of JNI and native interfaces. This is what you should shoot
1009 for. For those few places that require knowledge of the Virtual Machine
1010 beyond that provided by the Java standards, the VM Interface was designed.
1011 Read the Virtual Machine Integration Guide for more information.
1013 Right now the only supported platform is Linux. This will change as that
1014 version stabilizes and we begin the effort to port to many other
1015 platforms. Jikes RVM runs Classpath on AIX, and generally the Jikes
1016 RVM team fixes Classpath to work on that platform.
1018 @node Utility Classes, Robustness, Portability, Programming Goals
1019 @comment node-name, next, previous, up
1020 @section Utility Classes
1022 At the moment, we are not very good at reuse of the JNI code. There
1023 have been some attempts, called @dfn{libclasspath}, to
1024 create generally useful utility classes. The utility classes are in
1025 the directory @file{native/jni/classpath} and they are mostly declared
1026 in @file{native/jni/classpath/jcl.h}. These utility classes are
currently only discussed in @ref{Robustness} and in @ref{Native
Efficiency}.
1030 There are more utility classes available that could be factored out if
1031 a volunteer wants something nice to hack on. The error reporting and
1032 exception throwing functions and macros in
1033 @file{native/jni/gtk-peer/gthread-jni.c} might be good
1034 candidates for reuse. There are also some generally useful utility
1035 functions in @file{gnu_java_awt_peer_gtk_GtkMainThread.c} that could
1036 be split out and put into libclasspath.
1038 @node Robustness, Java Efficiency, Utility Classes, Programming Goals
1039 @comment node-name, next, previous, up
Native code is very easy to make non-robust. (That's one reason Java is
so much better!) Here are a few hints to make your native code more
robust.
Always check return values for standard functions. It's sometimes easy
to forget to check the return value of malloc() for an error. Don't make
that mistake. (In fact, use JCL_malloc() in the jcl library instead; it will
check the return value and throw an exception if necessary.)
1051 Always check the return values of JNI functions, or call
1052 @code{ExceptionOccurred} to check whether an error occurred. You must
1053 do this after @emph{every} JNI call. JNI does not work well when an
1054 exception has been raised, and can have unpredictable behavior.
1056 Throw exceptions using @code{JCL_ThrowException}. This guarantees that if
1057 something is seriously wrong, the exception text will at least get out
1058 somewhere (even if it is stderr).
Check for null values of @code{jclass}es before you send them to JNI functions.
JNI does not behave nicely when you pass a null class to it: it
terminates Java with a ``JNI Panic.''
1064 In general, try to use functions in @file{native/jni/classpath/jcl.h}. They
1065 check exceptions and return values and throw appropriate exceptions.
1067 @node Java Efficiency, Native Efficiency, Robustness, Programming Goals
1068 @comment node-name, next, previous, up
1069 @section Java Efficiency
For methods which explicitly throw a @code{NullPointerException} when an
argument is passed which is null, per a Sun specification, do not write
code like:

@example
public int strlen (String foo) throws NullPointerException
@{
  if (foo == null)
    throw new NullPointerException ("foo is null");
  return foo.length ();
@}
@end example

Instead, the code should be written as:

@example
public int strlen (String foo) throws NullPointerException
@{
  return foo.length ();
@}
@end example
1095 Explicitly comparing foo to null is unnecessary, as the virtual machine
1096 will throw a NullPointerException when length() is invoked. Classpath
1097 is designed to be as fast as possible -- every optimization, no matter
1098 how small, is important.
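The behaviour this advice relies on can be checked with a small
stand-alone program. The following is only a sketch: the class name
ImplicitNullCheck and the helper throwsOnNull are made up for the
illustration and are not part of Classpath.

```java
// Hypothetical demo class; not part of Classpath.
final class ImplicitNullCheck
{
  // No explicit null test: invoking length() on a null reference makes
  // the virtual machine throw NullPointerException by itself.
  static int strlen (String foo)
  {
    return foo.length ();
  }

  // Returns true if strlen (null) raised NullPointerException.
  static boolean throwsOnNull ()
  {
    try
      {
        strlen (null);
        return false;
      }
    catch (NullPointerException e)
      {
        return true;
      }
  }

  public static void main (String[] args)
  {
    System.out.println (strlen ("Classpath"));
    System.out.println (throwsOnNull ());
  }
}
```

Note the trade-off: the exception raised by the virtual machine does not
carry a custom message such as "foo is null".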
1100 @node Native Efficiency, Security, Java Efficiency, Programming Goals
1101 @comment node-name, next, previous, up
1102 @section Native Efficiency
1104 You might think that using native methods all over the place would give
1105 our implementation of Java speed, speed, blinding speed. You'd be
1106 thinking wrong. Would you believe me if I told you that an empty
1107 @emph{interpreted} Java method is typically about three and a half times
1108 @emph{faster} than the equivalent native method?
1110 Bottom line: JNI is overhead incarnate. In Sun's implementation, even
1111 the JNI functions you use once you get into Java are slow.
1113 A final problem is efficiency of native code when it comes to things
1114 like method calls, fields, finding classes, etc. Generally you should
1115 cache things like that in static C variables if you're going to use them
1116 over and over again. GetMethodID(), GetFieldID(), and FindClass() are
1117 @emph{slow}. Classpath provides utility libraries for caching methodIDs
1118 and fieldIDs in @file{native/jni/classpath/jnilink.h}. Other native data can
1119 be cached between method calls using functions found in
1120 @file{native/jni/classpath/native_state.h}.
1122 Here are a few tips on writing native code efficiently:
1124 Make as few native method calls as possible. Note that this is not the
1125 same thing as doing less in native method calls; it just means that, if
1126 given the choice between calling two native methods and writing a single
1127 native method that does the job of both, it will usually be better to
1128 write the single native method. You can even call the other two native
1129 methods directly from your native code and not incur the overhead of a
1130 method call from Java to C.
1132 Cache @code{jmethodID}s and @code{jfieldID}s wherever you can. String
1134 expensive. The best way to do this is to use the
1135 @file{native/jni/classpath/jnilink.h}
1136 library. It will ensure that @code{jmethodID}s are always valid, even if the
1137 class is unloaded at some point. In 1.1, jnilink simply caches a
1138 @code{NewGlobalRef()} to the method's underlying class; however, when 1.2 comes
1139 along, it will use a weak reference to allow the class to be unloaded
1140 and then re-resolve the @code{jmethodID} the next time it is used.
1142 Cache classes that you need to access often. jnilink will help with
1143 this as well. The issue here is the same as the methodID and fieldID
1144 issue--how to make certain the class reference remains valid.
1146 If you need to associate native C data with your class, use Paul
1147 Fisher's native_state library (NSA). It will allow you to get and set
1148 state fairly efficiently. Japhar now supports this library, making
1149 native state get and set calls as fast as accessing a C variable
1152 If you are using native libraries defined outside of Classpath, then
1153 these should be wrapped by a Classpath function instead and defined
1154 within a library of their own. This makes porting Classpath's native
1155 libraries to new platforms easier in the long run. It would be nice
1156 to be able to use Mozilla's NSPR or Apache's APR, as these libraries
1157 are already ported to numerous systems and provide all the necessary
1158 system functions as well.
1160 @node Security, , Native Efficiency, Programming Goals
1161 @comment node-name, next, previous, up
1164 Security is such a huge topic it probably deserves its own chapter.
1165 Most of the current code needs to be audited for security to ensure
1166 all of the proper security checks are in place within the Java
1167 platform, but also to verify that native code is reasonably secure and
1168 avoids common pitfalls, buffer overflows, etc. A good source for
information on secure programming is the excellent HOWTO by David
Wheeler,
@uref{http://www.dwheeler.com/secure-programs/Secure-Programs-HOWTO/index.html,Secure
Programming for Linux and Unix HOWTO}.
1174 @node API Compatibility, Specification Sources, Programming Goals, Top
1175 @comment node-name, next, previous, up
1176 @chapter API Compatibility
1179 * Serialization:: Serialization
1180 * Deprecated Methods:: Deprecated methods
1183 @node Serialization, Deprecated Methods, API Compatibility, API Compatibility
1184 @comment node-name, next, previous, up
1185 @section Serialization
Sun has produced documentation concerning much of the information
needed to make Classpath serialization-compatible with Sun
implementations. Part of doing this is to make sure that every class
that is Serializable actually defines a field named serialVersionUID
with a value that matches the output of serialver on Sun's
implementation. The reason for doing this is given below.
If a class has a static final field (of any accessibility) named
serialVersionUID of type long, that is what serialver uses. Otherwise it
computes a value by hashing (with SHA-1, per Sun's serialization
specification) the class name, modifiers, interfaces, fields, and method
signatures found in the .class file. Different compilers create different
synthetic method signatures, such as access$0() if an inner class needs
access to a private member of an enclosing class. This makes it impossible
for two distinct compilers to reliably generate the same serial number,
because their .class files differ. However, once you have a .class file,
its serial number is fully determined, and the computation will give the
same result no matter what platform you execute on.
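As a sketch of what this means in practice, the following program declares
an explicit serialVersionUID and asks the serialization runtime which value
it will use, via @code{java.io.ObjectStreamClass}. The classes Point3D and
SerialUidCheck and the UID value itself are invented for the example; in
real code the value must be the one serialver reports for the reference
implementation's class.

```java
import java.io.ObjectStreamClass;
import java.io.Serializable;

// Hypothetical example class; the UID value below is made up.
final class Point3D implements Serializable
{
  // In real code, use the value serialver prints for the reference class.
  private static final long serialVersionUID = 1234567890123456789L;

  int x, y, z;
}

final class SerialUidCheck
{
  // Returns the UID the serialization runtime will use for the class.
  static long uidOf (Class<?> c)
  {
    return ObjectStreamClass.lookup (c).getSerialVersionUID ();
  }

  public static void main (String[] args)
  {
    System.out.println (uidOf (Point3D.class));
  }
}
```

Because the field is declared explicitly, the reported value no longer
depends on which compiler produced the .class file.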
1205 Serialization compatibility can be tested using tools provided with
1206 @uref{http://www.kaffe.org/~stuart/japi/,Japitools}. These
1207 tools can test binary serialization compatibility and also provide
1208 information about unknown serialized formats by writing these in XML
1209 instead. Japitools is also the primary means of checking API
1210 compatibility for GNU Classpath with Sun's Java Platform.
1212 @node Deprecated Methods, , Serialization, API Compatibility
1213 @comment node-name, next, previous, up
1214 @section Deprecated Methods
1216 Sun has a practice of creating ``alias'' methods, where a public or
1217 protected method is deprecated in favor of a new one that has the same
1218 function but a different name. Sun's reasons for doing this vary; as
1219 an example, the original name may contain a spelling error or it may
1220 not follow Java naming conventions.
1222 Unfortunately, this practice complicates class library code that calls
1223 these aliased methods. Library code must still call the deprecated
1224 method so that old client code that overrides it continues to work.
1225 But library code must also call the new version, because new code is
1226 expected to override the new method.
The correct way to handle this (and the way Sun does it) may seem
counterintuitive because it means that new code is less efficient than
old code: the new method must call the deprecated method, and throughout
the library code calls to the old method must be replaced with calls to
the new method.
1234 Take the example of a newly-written container laying out a component and
1235 wanting to know its preferred size. The Component class has a
1236 deprecated preferredSize method and a new method, getPreferredSize.
1237 Assume that the container is laying out an old component that overrides
1238 preferredSize and a new component that overrides getPreferredSize. If
1239 the container calls getPreferredSize and the default implementation of
1240 getPreferredSize calls preferredSize, then the old component will have
1241 its preferredSize method called and new code will have its
1242 getPreferredSize method called.
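The calling scheme can be sketched with stand-in classes. Widget,
OldWidget, NewWidget and AliasDemo are hypothetical names, and a plain
@code{int} replaces @code{java.awt.Dimension} to keep the example
self-contained; this is not the actual @code{java.awt.Component} code.

```java
// Hypothetical stand-in for a library class with an aliased method.
class Widget
{
  // Deprecated alias: kept so that old subclasses can still override it.
  public int preferredSize ()
  {
    return 10;
  }

  // The new method delegates to the deprecated one, so an override of
  // either method is honoured.
  public int getPreferredSize ()
  {
    return preferredSize ();
  }
}

// Old client code: overrides only the deprecated method.
class OldWidget extends Widget
{
  public int preferredSize ()
  {
    return 20;
  }
}

// New client code: overrides only the new method.
class NewWidget extends Widget
{
  public int getPreferredSize ()
  {
    return 30;
  }
}

final class AliasDemo
{
  // Library code (the "container") always calls the new method.
  static int layoutSize (Widget w)
  {
    return w.getPreferredSize ();
  }

  public static void main (String[] args)
  {
    System.out.println (layoutSize (new OldWidget ()));
    System.out.println (layoutSize (new NewWidget ()));
  }
}
```

With this arrangement the container sees the overridden value for both the
old and the new component, at the cost of one extra call for new code.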
1244 Even using this calling scheme, an old component may still be laid out
1245 improperly if it implements a method, getPreferredSize, that has the
1246 same signature as the new Component.getPreferredSize. But that is a
1247 general problem -- adding new public or protected methods to a
1248 widely-used class that calls those methods internally is risky, because
1249 existing client code may have already declared methods with the same
1252 The solution may still seem counterintuitive -- why not have the
1253 deprecated method call the new method, then have the library always call
1254 the old method? One problem with that, using the preferred size example
1255 again, is that new containers, which will use the non-deprecated
1256 getPreferredSize, will not get the preferred size of old components.
1258 @node Specification Sources, Naming Conventions, API Compatibility, Top
1259 @comment node-name, next, previous, up
1260 @chapter Specification Sources
1262 There are a number of specification sources to use when working on
1263 Classpath. In general, the only place you'll find your classes
1264 specified is in the JavaDoc documentation or possibly in the
1265 corresponding white paper. In the case of java.lang, java.io and
1266 java.util, you should look at the Java Language Specification.
1268 Here, however, is a list of specs, in order of canonicality:
1272 @uref{http://java.sun.com/docs/books/jls/clarify.html,Clarifications and Amendments to the JLS - 1.1}
@uref{http://java.sun.com/docs/books/jls/html/1.1Update.html,JLS Updates}
1277 @uref{http://java.sun.com/docs/books/jls/html/index.html,The 1.0 JLS}
1279 @uref{http://java.sun.com/docs/books/vmspec/index.html,JVM spec - 1.1}
1281 @uref{http://java.sun.com/products/jdk/1.1/docs/guide/jni/spec/jniTOC.doc.html,JNI spec - 1.1}
1283 @uref{http://java.sun.com/products/jdk/1.1/docs/api/packages.html,Sun's javadoc - 1.1}
1284 (since Sun's is the reference implementation, the javadoc is
1285 documentation for the Java platform itself.)
1287 @uref{http://java.sun.com/products/jdk/1.2/docs/guide/jvmdi/jvmdi.html,JVMDI spec - 1.2},
1288 @uref{http://java.sun.com/products/jdk/1.2/docs/guide/jni/jni-12.html,JNI spec - 1.2}
1289 (sometimes gives clues about unspecified things in 1.1; if
1290 it was not specified accurately in 1.1, then use the spec
1291 for 1.2; also, we are using JVMDI in this project.)
1293 @uref{http://java.sun.com/products/jdk/1.2/docs/api/frame.html,Sun's javadoc - 1.2}
1294 (sometimes gives clues about unspecified things in 1.1; if
1295 it was not specified accurately in 1.1, then use the spec
1298 @uref{http://developer.java.sun.com/developer/bugParade/index.html,The
1299 Bug Parade}: I have obtained a ton of useful information about how
1300 things do work and how they *should* work from the Bug Parade just by
1301 searching for related bugs. The submitters are very careful about their
1302 use of the spec. And if something is unspecified, usually you can find
1303 a request for specification or a response indicating how Sun thinks it
1304 should be specified here.
1307 You'll notice that in this document, white papers and specification
1308 papers are more canonical than the JavaDoc documentation. This is true
1312 @node Naming Conventions, Character Conversions, Specification Sources, Top
1313 @comment node-name, next, previous, up
1314 @chapter Directory and File Naming Conventions
1316 The Classpath directory structure is laid out in the following manner:
1358 Here is a brief description of the toplevel directories and their contents.
Contains the source code to the Java packages that make up the core
class library. Because this is the public interface to Java, it is
important that the public classes, interfaces, methods, and variables
are exactly the same as specified in Sun's documentation. The directory
structure is laid out just like the java package names. For example,
the classes of the package java.util.zip would be in the directory
@file{java/util/zip}.
1371 Internal classes (roughly analogous to Sun's sun.* classes) should go
1372 under the @file{gnu/java} directory. Classes related to a particular public
1373 Java package should go in a directory named like that package. For
1374 example, classes related to java.util.zip should go under a directory
1375 @file{gnu/java/util/zip}. Sub-packages under the main package name are
1376 allowed. For classes spanning multiple public Java packages, pick an
1377 appropriate name and see what everybody else thinks.
1380 This directory holds native code needed by the public Java packages.
1381 Each package has its own subdirectory, which is the ``flattened'' name
1382 of the package. For example, native method implementations for
1383 java.util.zip should go in @file{native/classpath/java-util}. Classpath
1384 actually includes an all Java version of the zip classes, so no native
Each person working on a package gets his or her own ``directory
space'' underneath each of the toplevel directories. In addition to the
general guidelines above, the following standards should be followed:
Classes that need to load native code should load a library with the
same name as the flattened package name, with all hyphens removed. For
example, the native library name passed to @code{System.loadLibrary} for
java-util would be ``javautil''.
1402 Each package has its own shared library for native code (if any).
The main native method implementation for a given method in a class should
go in a file with the same name as the class with a ``.c'' extension.
1407 For example, the JNI implementation of the native methods in
1408 java.net.InetAddress would go in @file{native/jni/java-net/InetAddress.c}.
1409 ``Internal'' native functions called from the main native method can
1410 reside in files of any name.
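The naming rules above can be summarised as a few string transformations.
The helper class NativeNames is purely illustrative (Classpath has no such
class), and it takes the package name as given: it does not decide which
sub-packages share one flattened directory, such as java.util.zip living
under java-util.

```java
// Hypothetical helper; Classpath itself has no such class.
final class NativeNames
{
  // "java.util" -> "java-util", the flattened package name.
  static String flatten (String packageName)
  {
    return packageName.replace ('.', '-');
  }

  // "java.util" -> "javautil", the name passed to System.loadLibrary.
  static String libraryName (String packageName)
  {
    return flatten (packageName).replace ("-", "");
  }

  // "java.net", "InetAddress" -> "native/jni/java-net/InetAddress.c"
  static String sourceFile (String packageName, String className)
  {
    return "native/jni/" + flatten (packageName) + "/" + className + ".c";
  }

  public static void main (String[] args)
  {
    System.out.println (libraryName ("java.util"));
    System.out.println (sourceFile ("java.net", "InetAddress"));
  }
}
```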
1413 @node Character Conversions, Localization, Naming Conventions, Top
1414 @comment node-name, next, previous, up
1415 @chapter Character Conversions
1417 Java uses the Unicode character encoding system internally. This is a
1418 sixteen bit (two byte) collection of characters encompassing most of the
world's written languages. However, Java programs must often deal with
outside interfaces that are byte (eight bit) oriented, such as a
Unix file or a stream of data from a network socket. Beginning with
1422 Java 1.1, the @code{Reader} and @code{Writer} classes provide functionality
1423 for dealing with character oriented streams. The classes
1424 @code{InputStreamReader} and @code{OutputStreamWriter} bridge the gap
1425 between byte streams and character streams by converting bytes to
1426 Unicode characters and vice versa.
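From the point of view of a program using the standard API, the bridge
looks like the sketch below. It uses only @code{java.io} classes; the
class name BridgeDemo and its two helper methods are invented for the
illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.Writer;

// Hypothetical demo class built on the standard java.io bridge classes.
final class BridgeDemo
{
  // Characters -> bytes, through OutputStreamWriter.
  static byte[] encode (String text, String encoding) throws IOException
  {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream ();
    Writer out = new OutputStreamWriter (bytes, encoding);
    out.write (text);
    out.close ();   // flushes any buffered bytes
    return bytes.toByteArray ();
  }

  // Bytes -> characters, through InputStreamReader.
  static String decode (byte[] data, String encoding) throws IOException
  {
    Reader in = new InputStreamReader (new ByteArrayInputStream (data),
                                       encoding);
    StringBuilder text = new StringBuilder ();
    int c;
    while ((c = in.read ()) != -1)
      text.append ((char) c);
    in.close ();
    return text.toString ();
  }

  public static void main (String[] args) throws IOException
  {
    String word = "caf\u00e9";
    System.out.println (encode (word, "ISO-8859-1").length);
    System.out.println (encode (word, "UTF-8").length);
    System.out.println (decode (encode (word, "UTF-8"), "UTF-8").equals (word));
  }
}
```

The same four characters take a different number of bytes depending on the
encoding scheme, which is why the conversion has to be explicit.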
In Classpath, @code{InputStreamReader} and @code{OutputStreamWriter}
rely on an internal class called @code{gnu.java.io.EncodingManager} to load
translators that perform the actual conversion. There are two types of
converters, encoders and decoders. Encoders are subclasses of
@code{gnu.java.io.encoder.Encoder}. This type of converter takes a Java
(Unicode) character stream or buffer and converts it to bytes using
a specified encoding scheme. Decoders are subclasses of
@code{gnu.java.io.decoder.Decoder}. This type of converter takes a
1436 byte stream or buffer and converts it to Unicode characters. The
1437 @code{Encoder} and @code{Decoder} classes are subclasses of
1438 @code{Writer} and @code{Reader} respectively, and so can be used in
1439 contexts that require character streams, but the Classpath implementation
1440 currently does not make use of them in this fashion.
1442 The @code{EncodingManager} class searches for requested encoders and
1443 decoders by name. Since encoders and decoders are separate in Classpath,
1444 it is possible to have a decoder without an encoder for a particular
1445 encoding scheme, or vice versa. @code{EncodingManager} searches the
1446 package path specified by the @code{file.encoding.pkg} property. The
1447 name of the encoder or decoder is appended to the search path to
1448 produce the required class name. Note that @code{EncodingManager} knows
1449 about the default system encoding scheme, which it retrieves from the
1450 system property @code{file.encoding}, and it will return the proper
1451 translator for the default encoding if no scheme is specified. Also, the
1452 Classpath standard translator library, which is the @code{gnu.java.io} package,
1453 is automatically appended to the end of the path.
1455 For efficiency, @code{EncodingManager} maintains a cache of translators
1456 that it has loaded. This eliminates the need to search for a commonly
1457 used translator each time it is requested.
Finally, @code{EncodingManager} supports aliasing of encoding scheme names.
For example, the ISO Latin-1 encoding scheme can be referred to as
``8859_1'' or ``ISO-8859-1''. @code{EncodingManager} searches for
aliases by looking for the existence of a system property called
@code{gnu.java.io.encoding_scheme_alias.<encoding name>}. If such a
property exists, the value of that property is assumed to be the
canonical name of the encoding scheme, and a translator with that name is
looked up instead of one with the original name.
Here is an example of how @code{EncodingManager} works. A class requests
a decoder for the ``UTF-8'' encoding scheme by calling
@code{EncodingManager.getDecoder("UTF-8")}. First, an alias is searched
for by looking for the system property
@code{gnu.java.io.encoding_scheme_alias.UTF-8}. In our example, this
property exists and has the value ``UTF8''. That is the actual
decoder that will be searched for. Next, @code{EncodingManager} looks
in its cache for this translator. Assuming it does not find it, it
searches the translator path, which in this example consists only of
the default @code{gnu.java.io}. The ``decoder'' package name is
appended since we are looking for a decoder. (``encoder'' would be
used if we were looking for an encoder.) Then the name of the translator
is appended. So @code{EncodingManager} attempts to load a translator
class called @code{gnu.java.io.decoder.UTF8}. If that class is found,
an instance of it is returned. If it is not found, an
@code{UnsupportedEncodingException} is thrown.
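The name resolution in this walk-through can be sketched as follows. This
is not the real @code{EncodingManager}: the class TranslatorNames and its
methods are hypothetical, the alias is read from a @code{Properties}
object passed in by the caller rather than from the live system
properties, and the cache, the configurable search path and the actual
class loading are left out.

```java
import java.util.Properties;

// Hypothetical sketch of the name resolution only; it does not load
// classes and is not the real gnu.java.io.EncodingManager.
final class TranslatorNames
{
  static final String ALIAS_PREFIX = "gnu.java.io.encoding_scheme_alias.";
  static final String DEFAULT_PATH = "gnu.java.io";

  // Replace an alias by its canonical scheme name, if one is defined.
  static String canonicalName (String scheme, Properties props)
  {
    return props.getProperty (ALIAS_PREFIX + scheme, scheme);
  }

  // Build the decoder class name looked up in the default package.
  static String decoderClassName (String scheme, Properties props)
  {
    return DEFAULT_PATH + ".decoder." + canonicalName (scheme, props);
  }

  public static void main (String[] args)
  {
    Properties props = new Properties ();
    props.setProperty (ALIAS_PREFIX + "UTF-8", "UTF8");
    System.out.println (decoderClassName ("UTF-8", props));
  }
}
```

A real implementation would go on to try loading the resulting class name
for each entry of the search path and report an unsupported encoding if
none of them can be loaded.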
1485 To write a new translator, it is only necessary to subclass
1486 @code{Encoder} and/or @code{Decoder}. Only a handful of abstract
1487 methods need to be implemented. In general, no methods need to be
1488 overridden. The needed methods calculate the number of bytes/chars
1489 that the translation will generate, convert buffers to/from bytes,
1490 and read/write a requested number of characters to/from a stream.
1492 Many common encoding schemes use only eight bits to encode characters.
1493 Writing a translator for these encodings is very easy. There are
1494 abstract translator classes @code{gnu.java.io.decode.DecoderEightBitLookup}
1495 and @code{gnu.java.io.encode.EncoderEightBitLookup}. These classes
implement all of the necessary methods. All that is necessary is to
create a lookup table array that maps bytes to Unicode characters and
set the class variable @code{lookup_table} equal to it in a static
1499 initializer. Also, a single constructor that takes an appropriate
1500 stream as an argument must be supplied. These translators are
1501 exceptionally easy to create and there are several of them supplied
1502 in the Classpath distribution.
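The table-driven idea can be illustrated without the Classpath base
classes. The sketch below is an assumption-laden stand-in: Latin1Lookup
does not extend the real eight-bit lookup decoder, and its decode method
is invented for the example. Only the @code{lookup_table} name is taken
from the text.

```java
// Hypothetical sketch; it does not extend the real Classpath decoder class.
final class Latin1Lookup
{
  // Maps each byte value (0-255) to a Unicode character.
  static final char[] lookup_table = new char[256];

  static
  {
    // ISO Latin-1 maps every byte to the code point of the same value.
    for (int i = 0; i < lookup_table.length; i++)
      lookup_table[i] = (char) i;
  }

  static String decode (byte[] data)
  {
    char[] chars = new char[data.length];
    for (int i = 0; i < data.length; i++)
      chars[i] = lookup_table[data[i] & 0xff];   // treat the byte as unsigned
    return new String (chars);
  }

  public static void main (String[] args)
  {
    System.out.println (decode (new byte[] { 0x41, 0x42, 0x43 }));
  }
}
```

For another eight-bit scheme only the contents of the table would change.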
1504 Writing multi-byte or variable-byte encodings is more difficult, but
1505 often not especially challenging. The Classpath distribution ships with
1506 translators for the UTF8 encoding scheme which uses from one to three
1507 bytes to encode Unicode characters. This can serve as an example of
1508 how to write such a translator.
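To show where the ``one to three bytes'' come from, here is a minimal
sketch of encoding a single sixteen bit Java @code{char}. The class
Utf8Sketch is hypothetical and is not the translator shipped with
Classpath. It encodes each @code{char} on its own, so surrogate pairs are
not combined into the four byte form that standard UTF-8 uses.

```java
// Hypothetical sketch of UTF-8 encoding for one 16-bit char at a time.
final class Utf8Sketch
{
  // Number of bytes needed for one 16-bit char.
  static int encodedLength (char c)
  {
    if (c < 0x80)
      return 1;
    if (c < 0x800)
      return 2;
    return 3;
  }

  static byte[] encode (char c)
  {
    if (c < 0x80)     // 0xxxxxxx
      return new byte[] { (byte) c };
    if (c < 0x800)    // 110xxxxx 10xxxxxx
      return new byte[] { (byte) (0xc0 | (c >> 6)),
                          (byte) (0x80 | (c & 0x3f)) };
    // 1110xxxx 10xxxxxx 10xxxxxx
    return new byte[] { (byte) (0xe0 | (c >> 12)),
                        (byte) (0x80 | ((c >> 6) & 0x3f)),
                        (byte) (0x80 | (c & 0x3f)) };
  }

  public static void main (String[] args)
  {
    System.out.println (encodedLength ('A'));
    System.out.println (encodedLength ('\u00e9'));
    System.out.println (encodedLength ('\u20ac'));
  }
}
```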
1510 Many more translators are needed. All major character encodings should
1511 eventually be supported.
1513 @node Localization, , Character Conversions, Top
1514 @comment node-name, next, previous, up
1515 @chapter Localization
1517 There are many parts of the Java standard runtime library that must
1518 be customized to the particular locale the program is being run in.
1519 These include the parsing and display of dates, times, and numbers;
1520 sorting words alphabetically; breaking sentences into words, etc.
1521 In general, Classpath uses general classes for performing these tasks,
1522 and customizes their behavior with configuration data specific to a
1526 * String Collation:: Sorting strings in different locales
1527 * Break Iteration:: Breaking up text into words, sentences, and lines
1528 * Date Formatting and Parsing:: Locale specific date handling
* Decimal/Currency Formatting and Parsing:: Locale specific number handling
1532 In Classpath, all locale specific data is stored in a
1533 @code{ListResourceBundle} class in the package @code{gnu/java/locale}.
1534 The basename of the bundle is @code{LocaleInformation}. See the
1535 documentation for the @code{java.util.ResourceBundle} class for details
1536 on how the specific locale classes should be named.
@code{ListResourceBundle}s are used instead of
@code{PropertyResourceBundle}s because data more complex than simple
strings need to be provided to configure certain Classpath components.
Because @code{ListResourceBundle} allows an arbitrary Java object to
be associated with a given configuration option, it provides the
needed flexibility to accommodate Classpath's needs.
1545 Each Java library component that can be localized requires that certain
1546 configuration options be specified in the resource bundle for it. It is
1547 important that each and every option be supplied for a specific
1548 component or a critical runtime error will most likely result.
As a standard, each option should be assigned a name that is a string.
If the value is stored in a class or instance variable, then the option
should have the same name as the variable. Also, the value
associated with each option should be a Java object with the same name
as the option name (unless a simple scalar value is used). Here is an
example:
A class loads a value for the @code{format_string} variable from the
resource bundle in the specified locale. Here is the code in the
class that loads it:

@example
  ResourceBundle rb =
    ResourceBundle.getBundle ("gnu.java.locale.LocaleInformation", locale);
  String format_string = rb.getString ("format_string");
@end example

In the actual resource bundle class, here is how the configuration option
is defined:

@example
  /**
   * This is the format string used for displaying values
   */
  private static final String format_string = "%s %d %i";

  private static final Object[][] contents =
  @{
    @{ "format_string", format_string @}
  @};
@end example
1582 Note that each variable should be @code{private}, @code{final}, and
1583 @code{static}. Each variable should also have a description of what it
1584 does as a documentation comment. The @code{getContents()} method returns
1585 the @code{contents} array.
1587 There are many functional areas of the standard class library that are
1588 configured using this mechanism. A given locale does not need to support
1589 each functional area. But if a functional area is supported, then all
1590 of the specified entries for that area must be supplied. In order to
1591 determine which functional areas are supported, there is a special key
1592 that is queried by the affected class or classes. If this key exists,
1593 and has a value that is a @code{Boolean} object wrapping the
1594 @code{true} value, then full support is assumed. Otherwise it is
1595 assumed that no support exists for this functional area. Every class
1596 using resources for configuration must use this scheme and define a special
1597 key that indicates the functional area is supported. Simply checking
1598 for the resource bundle's existence is not sufficient to ensure that a
1599 given functional area is supported.
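The support check described above might be sketched as follows. The
class @code{FunctionalAreaCheck}, its nested @code{DemoBundle}, and the
helper @code{isSupported} are hypothetical names used only for this
illustration; the actual library classes may structure the check
differently.

```java
import java.util.ListResourceBundle;
import java.util.MissingResourceException;
import java.util.ResourceBundle;

class FunctionalAreaCheck
{
  // Hypothetical bundle that supports only the BreakIterator area.
  static class DemoBundle extends ListResourceBundle
  {
    private static final Object[][] contents =
    {
      { "BreakIterator", Boolean.TRUE },
      { "RuleBasedCollator", Boolean.FALSE }
    };

    protected Object[][] getContents ()
    {
      return contents;
    }
  }

  // Returns true only if the key exists and holds Boolean.TRUE.
  static boolean isSupported (ResourceBundle rb, String areaKey)
  {
    try
      {
        return Boolean.TRUE.equals (rb.getObject (areaKey));
      }
    catch (MissingResourceException e)
      {
        // A missing key means the area is not supported.
        return false;
      }
  }

  public static void main (String[] args)
  {
    ResourceBundle rb = new DemoBundle ();
    System.out.println (isSupported (rb, "BreakIterator"));
    System.out.println (isSupported (rb, "DateFormatSymbols"));
  }
}
```

Treating a missing key and a non-@code{true} value the same way keeps
the caller's logic to a single test per functional area.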
1601 The following sections define the functional areas that use resources
1602 for locale-specific configuration in GNU Classpath. Please refer to the
1603 documentation for the classes mentioned for details on how these values
1604 are used. You may also wish to look at the source file for
1605 @file{gnu/java/locale/LocaleInformation_en} as an example.
1607 @node String Collation, Break Iteration, Localization, Localization
1608 @comment node-name, next, previous, up
1609 @section String Collation
1611 Collation involves the sorting of strings. The Java class library provides
1612 a public class called @code{java.text.RuleBasedCollator} that performs
1613 sorting based on a set of sorting rules.
1615 @itemize @bullet
1616 @item RuleBasedCollator - A @code{Boolean} wrapping @code{true} to indicate
1617 that this functional area is supported.
1618 @item collation_rules - The rules that specify how string collation is to
1619 be performed.
1620 @end itemize
1622 Note that some languages might be too complex for @code{RuleBasedCollator}
1623 to handle. In this case an entirely new class might need to be written in
1624 lieu of defining this rule string.
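To show how a rule string drives @code{java.text.RuleBasedCollator},
here is a minimal sketch. The class name @code{CollationRulesDemo} and
the rule string are assumptions made up for this example; real locales
use much longer rule strings.

```java
import java.text.ParseException;
import java.text.RuleBasedCollator;
import java.util.Arrays;

class CollationRulesDemo
{
  // Hypothetical rule string: sorts "c" before "b" before "a".
  static final String collation_rules = "< c < b < a";

  // Returns a sorted copy of the given words, ordered by the rules.
  static String[] sort (String[] words)
  {
    RuleBasedCollator collator;
    try
      {
        collator = new RuleBasedCollator (collation_rules);
      }
    catch (ParseException e)
      {
        throw new IllegalArgumentException ("bad collation rules", e);
      }
    String[] sorted = words.clone ();
    Arrays.sort (sorted, collator);
    return sorted;
  }

  public static void main (String[] args)
  {
    System.out.println (Arrays.toString (sort (new String[] { "a", "b", "c" })));
  }
}
```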
1626 @node Break Iteration, Date Formatting and Parsing, String Collation, Localization
1627 @comment node-name, next, previous, up
1628 @section Break Iteration
1630 The class @code{java.text.BreakIterator} breaks text into words, sentences,
1631 and lines. It is configured with the following resource bundle entries:
1633 @itemize @bullet
1634 @item BreakIterator - A @code{Boolean} wrapping @code{true} to indicate
1635 that this functional area is supported.
1636 @item word_breaks - A @code{String} array of word break character sequences.
1637 @item sentence_breaks - A @code{String} array of sentence break character
1638 sequences.
1639 @item line_breaks - A @code{String} array of line break character sequences.
1640 @end itemize
1642 @node Date Formatting and Parsing, Decimal/Currency Formatting and Parsing, Break Iteration, Localization
1643 @comment node-name, next, previous, up
1644 @section Date Formatting and Parsing
1646 Date formatting and parsing is handled by the
1647 @code{java.text.SimpleDateFormat} class in most locales. This class is
1648 configured by attaching an instance of the @code{java.text.DateFormatSymbols}
1649 class. That class simply reads properties from our locale-specific
1650 resource bundle. The following items are required (refer to the
1651 documentation of the @code{java.text.DateFormatSymbols} class for details
1652 on what the actual values should be):
1654 @itemize @bullet
1655 @item DateFormatSymbols - A @code{Boolean} wrapping @code{true} to indicate
1656 that this functional area is supported.
1657 @item months - A @code{String} array of month names.
1658 @item shortMonths - A @code{String} array of abbreviated month names.
1659 @item weekdays - A @code{String} array of weekday names.
1660 @item shortWeekdays - A @code{String} array of abbreviated weekday names.
1661 @item ampms - A @code{String} array containing AM/PM names.
1662 @item eras - A @code{String} array containing era (i.e., BC/AD) names.
1663 @item zoneStrings - An array of information about valid timezones for this
1664 locale.
1665 @item localPatternChars - A @code{String} defining date/time pattern symbols.
1666 @item shortDateFormat - The format string for dates used by
1667 @code{DateFormat.SHORT}.
1668 @item mediumDateFormat - The format string for dates used by
1669 @code{DateFormat.MEDIUM}.
1670 @item longDateFormat - The format string for dates used by
1671 @code{DateFormat.LONG}.
1672 @item fullDateFormat - The format string for dates used by
1673 @code{DateFormat.FULL}.
1674 @item shortTimeFormat - The format string for times used by
1675 @code{DateFormat.SHORT}.
1676 @item mediumTimeFormat - The format string for times used by
1677 @code{DateFormat.MEDIUM}.
1678 @item longTimeFormat - The format string for times used by
1679 @code{DateFormat.LONG}.
1680 @item fullTimeFormat - The format string for times used by
1681 @code{DateFormat.FULL}.
1682 @end itemize
1684 Note that it may not be possible to use this mechanism for all locales.
1685 In those cases a special purpose class may need to be written to handle
1686 date/time processing.
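As a rough illustration of how a @code{DateFormatSymbols} instance
configures @code{SimpleDateFormat}, consider the sketch below. The class
name @code{DateSymbolsDemo}, the month names, and the pattern are all
hypothetical; in the library the values would come from the resource
bundle entries listed above rather than from a literal array.

```java
import java.text.DateFormatSymbols;
import java.text.SimpleDateFormat;
import java.util.Calendar;
import java.util.GregorianCalendar;
import java.util.Locale;
import java.util.TimeZone;

class DateSymbolsDemo
{
  // Hypothetical month names standing in for the "months" entry.
  static final String[] months =
  {
    "Jan-demo", "Feb-demo", "Mar-demo", "Apr-demo", "May-demo", "Jun-demo",
    "Jul-demo", "Aug-demo", "Sep-demo", "Oct-demo", "Nov-demo", "Dec-demo"
  };

  // Formats 15 January 2005 (UTC) using the custom month names.
  static String formatMidJanuary ()
  {
    DateFormatSymbols symbols = new DateFormatSymbols (Locale.ROOT);
    symbols.setMonths (months);

    TimeZone utc = TimeZone.getTimeZone ("UTC");
    SimpleDateFormat format = new SimpleDateFormat ("d MMMM yyyy", symbols);
    format.setTimeZone (utc);

    Calendar cal = new GregorianCalendar (utc);
    cal.clear ();
    cal.set (2005, Calendar.JANUARY, 15, 12, 0, 0);
    return format.format (cal.getTime ());
  }

  public static void main (String[] args)
  {
    System.out.println (formatMidJanuary ());
  }
}
```

The formatter takes the full month name from the attached symbols
object, so swapping the array changes the output without touching the
pattern.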
1688 @node Decimal/Currency Formatting and Parsing, , Date Formatting and Parsing, Localization
1689 @comment node-name, next, previous, up
1690 @section Decimal/Currency Formatting and Parsing
1692 @code{NumberFormat} is an abstract class for formatting and parsing numbers.
1693 The class @code{DecimalFormat} provides a concrete subclass that handles
1694 this in a locale-independent manner. As with @code{SimpleDateFormat},
1695 this class gets information on how to format numbers from a class that
1696 wraps a collection of locale-specific formatting values. In this case,
1698 values for a locale from the resource bundle. The required entries are:
1700 @itemize @bullet
1701 @item DecimalFormatSymbols - A @code{Boolean} wrapping @code{true} to
1702 indicate that this functional area is supported.
1703 @item currencySymbol - The string representing the local currency.
1704 @item intlCurrencySymbol - The string representing the local currency in an
1705 international context.
1706 @item decimalSeparator - The character to use as the decimal point, as a
1707 @code{String}.
1708 @item digit - The character used to represent digits in a format string,
1709 as a @code{String}.
1710 @item exponential - The character used to represent the exponent separator
1711 of a number written in scientific notation, as a @code{String}.
1712 @item groupingSeparator - The character used to separate groups of digits
1713 in a large number, such as the ``,'' separator for thousands in the US, as
1714 a @code{String}.
1715 @item infinity - The string representing infinity.
1716 @item NaN - The string representing the Java not-a-number value.
1717 @item minusSign - The character representing the negative sign, as a
1718 @code{String}.
1719 @item monetarySeparator - The decimal point used in currency values, as a
1720 @code{String}.
1721 @item patternSeparator - The character used to separate positive and
1722 negative format patterns, as a @code{String}.
1723 @item percent - The percent sign, as a @code{String}.
1724 @item perMill - The per mille sign, as a @code{String}.
1725 @item zeroDigit - The character representing the digit zero, as a @code{String}.
1726 @end itemize
1728 Note that several of these values are individual characters. Each should
1729 be wrapped in a @code{String}, with the character at position 0, rather
1730 than in a @code{Character} object.
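The following sketch shows one way single-character entries stored as
@code{String} values could be applied to a @code{DecimalFormatSymbols}
object. The class name @code{DecimalSymbolsDemo}, the two separator
values, and the pattern are assumptions for illustration and are not
taken from any Classpath locale.

```java
import java.text.DecimalFormat;
import java.text.DecimalFormatSymbols;
import java.util.Locale;

class DecimalSymbolsDemo
{
  // Hypothetical bundle values: single characters stored as Strings.
  static final String decimalSeparator = ",";
  static final String groupingSeparator = ".";

  // Formats the value with the separators taken from position 0.
  static String formatSample (double value)
  {
    DecimalFormatSymbols symbols = new DecimalFormatSymbols (Locale.ROOT);
    symbols.setDecimalSeparator (decimalSeparator.charAt (0));
    symbols.setGroupingSeparator (groupingSeparator.charAt (0));

    DecimalFormat format = new DecimalFormat ("#,##0.00", symbols);
    return format.format (value);
  }

  public static void main (String[] args)
  {
    System.out.println (formatSample (1234567.891));
  }
}
```

Reading the character with @code{charAt (0)} matches the convention
that the character sits at position 0 of the stored string.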