Updated URLs for Java I/O

One of the problems with writing about Internet topics like Java is that most of the good references and resources you cite are on the Web. Unfortunately, this means that the references tend to go stale faster than a traditional footnote. This page provides updated links to the significant URLs cited in each chapter of Java I/O. (URLs used only as examples of URLs, and not as references, are generally omitted.) URLs that have changed since the first printing are listed in bold.

Almost 20% of the links on this page changed in just the couple of months between final copy edit and actual publication. If you should find any others that have changed, please drop me a line.

Preface

Chapter 1: Introducing I/O

Chapter 2: Output Streams

Chapter 3: Input Streams

Chapter 4: File Streams

No URLs in this chapter

Chapter 5: Network Streams

Chapter 6: Filter Streams

Chapter 7: Data Streams

No URLs in this chapter

Chapter 8: Streams in Memory

Chapter 9: Compressing Streams

Chapter 10: Encrypting Streams

Chapter 11: Object Serialization

No URLs

Chapter 12: Working with Files

Chapter 13: File Dialogs and Choosers

No URLs in Chapter 13

Chapter 14: Multilingual Character Sets and Unicode

Chapter 15: Readers and Writers

No URLs in Chapter 15

Chapter 16: Formatted I/O

Chapter 17: The Java Communications API

Appendix A: Additional Resources

When I began work on this book, I thought it would take me about 200 pages and about two months. Now, more than a year and five hundred pages later, I can see that I/O is a far larger, more important, and more encompassing topic than I originally guessed. Many chapters could easily lead to books of their own. Indeed, several (Chapter 5, Network Programming; Chapter 10, Cryptography) already are other books.

Since I can't possibly say everything there is to say about all the fascinating topics I've touched on in one page or another in this tome, I'd like to point you to several books, mailing lists, and web sites that explore some of the issues raised in this book in greater detail. Some of these are I/O specific; some are mostly tangential. However, they're all interesting and worthy of further study and thought.

Digital Think

Digital Think (http://www.digitalthink.com/) offers Web-based training courses for programmers, developers, system administrators and end users in C, C++, Java, Windows, Web development, object oriented programming, and more. This book grew out of two Web-based courses I wrote for Digital Think, Java Streams (http://www.digitalthink.com/catalog/cs/cs108/) and Java Readers and Writers (http://www.digitalthink.com/catalog/cs/cs208/). Although this book is far more comprehensive than those two courses, they're a good way to get started with this material, especially if you think you need a personal helping hand or a leg up. Each course includes graded exercises, a hands-on course project, and tutors to answer your questions and assist you with the difficult parts.

Design Patterns

At the time I was writing the first draft of this book, I also happened to be learning about design patterns. Gradually it became obvious that much of the AWT was written by programmers who had patterns on the brain. The java.awt.Toolkit class is a textbook example of the Abstract Factory pattern. The URL class's openConnection() method is a factory method. The Reader and Writer classes are Decorators on top of InputStream and OutputStream. The engine classes in the JCE are proxies, and I could cite many more examples. Much of the class library--including the java.io package--has been designed with design patterns, and it will all make a lot more sense if you're familiar with the standard patterns.
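The layering described above is easiest to see in a few lines of code. What follows is only a minimal sketch, assuming Java 7 or later (for try-with-resources and StandardCharsets); the class name StreamLayersDemo, the method firstLine(), and the sample text are invented for illustration and do not come from the book.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; the name and the sample text are illustrative only.
class StreamLayersDemo {

    // Wraps a raw byte stream in successive layers, each adding one capability:
    // InputStreamReader turns bytes into chars, BufferedReader adds readLine().
    static String firstLine(InputStream raw) {
        Reader chars = new InputStreamReader(raw, StandardCharsets.UTF_8);
        try (BufferedReader lines = new BufferedReader(chars)) {
            return lines.readLine();
        } catch (IOException e) {
            // Not expected for in-memory data; rethrown unchecked to keep the sketch short.
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] data = "first line\nsecond line\n".getBytes(StandardCharsets.UTF_8);
        System.out.println(firstLine(new ByteArrayInputStream(data)));
    }
}
```

The caller of firstLine() never needs to know how many layers sit between it and the bytes, which is the point of the pattern.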

The seminal text on the subject is Design Patterns by Erich Gamma, Richard Helm, Ralph Johnson, and John Vlissides (Addison Wesley, 1995). The four authors are colloquially known as the "Gang of Four", and the book is often cited informally as "GoF". The 23 patterns covered in GoF are rapidly becoming part of the vocabulary of the object oriented programming community. Design patterns are also beginning to be covered in many more introductory books about object oriented programming and Java.

There are also several extremely active mailing lists and web sites devoted to Design Patterns. To subscribe to the patterns@cs.uiuc.edu list send email to patterns-request@cs.uiuc.edu with the word subscribe in the Subject: field. Archives of this and several related lists may be perused at http://www.DistributedObjects.com/portfolio/archives/patterns/index.html.

The java.io package

The original source for much of the information contained herein about I/O is the javadoc documentation for the java.io package. You should have downloaded this with the JDK, but it's also available online at

The class library documentation is, however, woefully incomplete. While it explains what each method does, it often fails to explain how, why, or when you should use those methods. Furthermore, it only occasionally discusses assumptions about the behavior of those methods, assumptions that are crucial for anyone not merely using but also subclassing particular classes. There are many implicit assumptions about what particular methods should do (for instance, that the close() method of a filter input stream also closes any other streams it's connected to), and these are generally not documented anywhere (or at least they weren't until I wrote this book).
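The close() assumption mentioned above can be checked with a small experiment. This is a sketch, not an excerpt from the book: CloseChainDemo and CloseTrackingStream are hypothetical names, and BufferedInputStream stands in for filter input streams in general.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

// Hypothetical demo; CloseTrackingStream is an illustrative helper, not a JDK class.
class CloseChainDemo {

    // An in-memory stream that remembers whether close() was called on it.
    static class CloseTrackingStream extends ByteArrayInputStream {
        boolean closed = false;

        CloseTrackingStream(byte[] data) {
            super(data);
        }

        @Override
        public void close() throws IOException {
            closed = true;
            super.close();
        }
    }

    // Closes only the outer filter, then reports whether the inner stream was closed too.
    static boolean closingFilterClosesUnderlying() {
        CloseTrackingStream inner = new CloseTrackingStream(new byte[] {1, 2, 3});
        InputStream filter = new BufferedInputStream(inner);
        try {
            filter.close();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return inner.closed;
    }

    public static void main(String[] args) {
        System.out.println("underlying stream closed: " + closingFilterClosesUnderlying());
    }
}
```

A subclass that overrides close() without calling through to the underlying stream would silently break this expectation, which is why the assumption matters to anyone writing new filters.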

I've tried to document all of these assumptions in this book, but if you're faced with a new class not covered here, the canonical reference is the source code itself. The JDK includes Java source code for the java packages. You'll find it in a file called src.zip in your JDK distribution. Sometimes the only way to figure out exactly what Sun intended particular classes to do or how they expected them to do it is to read the source code for those classes.

Network Programming

In many ways this book is a prequel to my previous book with O'Reilly, Java Network Programming. Although written first, Java Network Programming presumes a solid familiarity with input and output, streams, and readers and writers as discussed in this book. Java Network Programming explains the fundamental protocols and technology that underlie the Internet, shows you how to communicate with sockets, provides detailed examples of working network clients and servers, and even develops content and protocol handlers. If you want to learn more about TCP/IP, HTTP, URLs, sockets and server sockets, and other elements of Internet programming in Java, you should definitely pick up Java Network Programming. (There's probably an ad for it in the back of this very book.)

The Centre for Distance-spanning Technology, CDT, runs the unmoderated java-networking@cdt.luth.se list, in which I participate, for informal discussion of Java network programming. To subscribe, send an email containing the word subscribe in the body of the message to java-networking-request@cdt.luth.se. An archive of the list and complete instructions are available from

http://www.cdt.luth.se/~peppar/java/java-networking-list/

Data Compression

Java supports several related compression formats including zlib, deflate, and gzip. These formats are documented in RFCs 1950, 1951, and 1952, and are available wherever RFCs are found including http://www.faqs.org/rfcs/. The master site for these particular RFCs is

Java's compression classes are native wrappers around the ZLIB compression library written by Jean-Loup Gailly and Mark Adler. You can learn about this library at http://www.cdrom.com/pub/infozip/zlib/
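As a reminder of what these formats look like from Java, here is a minimal round trip through the gzip format using the java.util.zip stream classes. It is a sketch under assumptions: the class name GzipRoundTripDemo and the helpers compress() and decompress() are invented for illustration, and the code assumes Java 7 or later.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical demo class; names are illustrative only.
class GzipRoundTripDemo {

    // Compresses a byte array into the gzip format (RFC 1952).
    static byte[] compress(byte[] plain) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (GZIPOutputStream gzip = new GZIPOutputStream(sink)) {
            gzip.write(plain);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // The gzip stream is closed by now, so the trailer has been written.
        return sink.toByteArray();
    }

    // Reverses compress() by reading through a GZIPInputStream until end of stream.
    static byte[] decompress(byte[] compressed) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (GZIPInputStream gzip =
                 new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            byte[] buffer = new byte[4096];
            int n;
            while ((n = gzip.read(buffer)) != -1) {
                sink.write(buffer, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sink.toByteArray();
    }

    public static void main(String[] args) {
        byte[] original = "compress me, compress me, compress me"
            .getBytes(StandardCharsets.UTF_8);
        byte[] restored = decompress(compress(original));
        System.out.println(new String(restored, StandardCharsets.UTF_8));
    }
}
```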

For more general information about compression and archiving algorithms and formats, the comp.compression FAQ is a good place to start. See http://www.faqs.org/faqs/compression-faq/part1/preamble.html. More technical details and sample code in C for a variety of algorithms are available in The Data Compression Book by Mark Nelson and Jean-Loup Gailly (M&T Books, 1996, ISBN 1-55851-434-1).

The JAR file format was developed by Sun for Java. The full specification can be found at http://java.sun.com/products/jdk/1.2/docs/guide/jar/jarGuide.html (Java 1.2) or http://java.sun.com/products/jdk/1.2/docs/guide/jar/jarGuide.html (Java 1.1). Aside from the name, the only thing that really distinguishes a JAR file from a ZIP file is the optional manifest of the contents. The manifest format specification can be found at

http://java.sun.com/products/jdk/1.2/docs/guide/jar/manifest.html.
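The relationship between JAR files, ZIP files, and the manifest can be sketched in code. The example below is illustrative only: JarManifestDemo, its helper methods, the entry name hello.txt, and the main class name are all hypothetical, and the code assumes Java 7 or later. It writes a JAR with a manifest, then reads the same bytes back once as a JAR and once as a plain ZIP.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.jar.Attributes;
import java.util.jar.JarEntry;
import java.util.jar.JarInputStream;
import java.util.jar.JarOutputStream;
import java.util.jar.Manifest;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

// Hypothetical demo class; entry and class names are illustrative only.
class JarManifestDemo {

    // Builds an in-memory JAR holding a manifest and one small text entry.
    static byte[] buildJar(String mainClass) {
        Manifest manifest = new Manifest();
        Attributes attributes = manifest.getMainAttributes();
        // Set Manifest-Version before anything else; a manifest needs it.
        attributes.put(Attributes.Name.MANIFEST_VERSION, "1.0");
        attributes.put(Attributes.Name.MAIN_CLASS, mainClass);

        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (JarOutputStream jar = new JarOutputStream(sink, manifest)) {
            jar.putNextEntry(new JarEntry("hello.txt"));
            jar.write("hello".getBytes(StandardCharsets.UTF_8));
            jar.closeEntry();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sink.toByteArray();
    }

    // Reads the Main-Class attribute back out of the manifest.
    static String readMainClass(byte[] jarBytes) {
        try (JarInputStream jar = new JarInputStream(new ByteArrayInputStream(jarBytes))) {
            Manifest manifest = jar.getManifest();
            return manifest == null
                ? null
                : manifest.getMainAttributes().getValue(Attributes.Name.MAIN_CLASS);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Treats the same bytes as an ordinary ZIP archive and lists every entry.
    static List<String> entryNames(byte[] jarBytes) {
        List<String> names = new ArrayList<>();
        try (ZipInputStream zip = new ZipInputStream(new ByteArrayInputStream(jarBytes))) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                names.add(entry.getName());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return names;
    }

    public static void main(String[] args) {
        byte[] jar = buildJar("com.example.Main");
        System.out.println(readMainClass(jar));
        System.out.println(entryNames(jar));
    }
}
```

Read as a ZIP, the manifest shows up as just another entry, which is why ordinary ZIP tools can open JAR files.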

Encryption and Related Technology

Chapter 10 only began to explore the fascinating subject of cryptography. The JCE is explicated in much more detail by Jonathan Knudsen in Java Cryptography (O'Reilly, 1998). Java Cryptography expands on the coverage of the Cipher and MessageDigest classes you'll find in this book. It also includes thorough discussions of the java.security package and the Java Cryptography Extension, showing you how to use security providers and even implement your own provider. It discusses authentication, key management, public and private key encryption, and includes a secure talk application that encrypts all data sent over the network. If you write Java programs that communicate sensitive data, you'll find this book indispensable.

For a more in-depth look at the mathematics and protocols that underlie the JCE, you'll want to check out Bruce Schneier's Applied Cryptography (John Wiley & Sons, 1995). This is the standard practical text on cryptographic protocols and algorithms, and the attacks on them. Schneier discusses a wide range of cryptographic algorithms, key management and exchange schemes, one way hash functions, signature algorithms, and many other problems in sufficient detail to allow a competent programmer to implement them. Although Schneier's language of choice is C, the techniques discussed are applicable in any language. The formal specification of the Java Cryptography API is available from Sun at http://java.sun.com/products/jdk/1.2/docs/guide/security/CryptoSpec.html. The actual implementation is in beta at the time of this writing, and can be downloaded from http://java.sun.com/products/jce/index.html.
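For orientation, here is a small sketch of the MessageDigest class mentioned above, attached to a stream through java.security.DigestInputStream. Several things here are assumptions rather than anything taken from the book: the class name StreamDigestDemo, the helper sha256Hex(), and the choice of SHA-256, an algorithm that postdates the JDK releases discussed in this book.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.security.DigestInputStream;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical demo class; SHA-256 is an assumed choice of algorithm.
class StreamDigestDemo {

    // Reads the stream to its end and returns the digest as lowercase hex.
    static String sha256Hex(InputStream source) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            try (DigestInputStream in = new DigestInputStream(source, digest)) {
                byte[] buffer = new byte[4096];
                while (in.read(buffer) != -1) {
                    // Reading is enough; the filter feeds every byte to the digest.
                }
            }
            StringBuilder hex = new StringBuilder();
            for (byte b : digest.digest()) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 is not available", e);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] data = "abc".getBytes(StandardCharsets.US_ASCII);
        System.out.println(sha256Hex(new ByteArrayInputStream(data)));
    }
}
```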

Object Serialization

Sun's serialization web page at http://java.sun.com/products/jdk/1.2/docs/guide/serialization/ includes a FAQ list, sample code, and the complete object serialization specification. The specification covers serialization as implemented in Java 1.2, which is mostly upwards compatible with the Java 1.1 serialization discussed in Chapter 11. An earlier pre-beta specification that covers Java 1.0.2 serialization is posted at http://java.sun.com/products/jdk/rmi/doc/serial-spec/serialTOC.doc.html. A formal specification of Java 1.1 serialization was never published. However, the Java 1.2 spec is mostly the same with the addition of a few extra features like the readResolve() method.
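The readResolve() method mentioned above is easiest to understand with a singleton. The following is a sketch only: ReadResolveDemo, Registry, and roundTrip() are hypothetical names, and the code assumes Java 7 or later.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical demo class; Registry is an invented singleton.
class ReadResolveDemo {

    static final class Registry implements Serializable {
        private static final long serialVersionUID = 1L;
        static final Registry INSTANCE = new Registry();

        private Registry() {
        }

        // Invoked by ObjectInputStream after the object has been read;
        // whatever it returns replaces the freshly deserialized copy.
        private Object readResolve() {
            return INSTANCE;
        }
    }

    // Serializes an object to memory and immediately reads it back.
    static Object roundTrip(Object original) {
        try {
            ByteArrayOutputStream sink = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(sink)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                     new ByteArrayInputStream(sink.toByteArray()))) {
                return in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Object copy = roundTrip(Registry.INSTANCE);
        System.out.println("same instance: " + (copy == Registry.INSTANCE));
    }
}
```

Without readResolve(), the round trip would produce a second Registry object and the class would no longer be a true singleton.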

Sun's formal specification for object serialization is not always clear, especially when it comes to motivating the more esoteric areas of serialization like ObjectInputValidation. However, it is complete and does add to what I discussed in Chapter 11, including the binary protocol for serialized objects and .ser files.

Object serialization was originally developed to support Remote Method Invocation (RMI), an architecture that allows Java objects in one virtual machine to invoke methods on objects in another virtual machine, possibly running on a different computer somewhere else on the Internet. RMI is discussed briefly in Chapter 14 of my Java Network Programming and at great length in Jim Farley's Java Distributed Computing (O'Reilly, 1998, ISBN 1-56592-206-9).

Object serialization is also used extensively as part of the JavaBeans component software architecture, a standard part of Java 1.1 and later. To learn more about this, I recommend you pick up Robert Englander's Developing JavaBeans (O'Reilly, 1997, ISBN 1-56592-289-1) or my own JavaBeans: Developing Component Software in Java (IDG Books, 1997, ISBN 0-76458-052-3).

International Character Sets and Unicode

The canonical reference to Unicode is The Unicode Standard, Version 2.0 (Addison Wesley, 1996, ISBN 0-201-48345-9). This book features detailed analysis of the Unicode Standard as well as discussion of the difficulties of defining character sets for all the world's different languages. It's also got tables of almost all the defined characters in Unicode, including about 20,000 Han ideographs. The size of the book and the large number of interesting tables of different scripts from around the world make it a good choice for a techie coffee table book that can even amuse your liberal arts friends. Updates, corrections, and errata to that volume are available on the Web at http://www.unicode.org/.

There's no single source of information for all the different non-Unicode character sets Java readers and writers can translate. However most of the Windows character sets are enumerated in Developing International Software for Windows 95 and NT by Nadine Kano (Microsoft Press, 1995, ISBN 1-55615-840-8). Kano ignores non-Windows platforms, and she does occasionally sound too much like a Microsoft press release. Nonetheless, this book contains a lot of useful details about how various localized versions of Windows operate. This book is also available on the MSDN Online Library web site at http://premium.microsoft.com/msdn/library/. Registration is required, but otherwise it's free. Assuming Microsoft hasn't added an actually navigable interface to MSDN by the time you read this, you'll find it by clicking on "Books" in the left-hand frame, then clicking on "Developing International Software". (I normally wouldn't bother you with such details, but the interface really is painfully obscure.)

Roman Czyborra maintains a lot of useful information about various ISO 8859 and Cyrillic character sets on his web site at http://czyborra.com/ including charts of a wide range of character sets and code pages.
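To make the idea of readers and writers translating between character sets concrete, here is a minimal sketch that converts ISO 8859-1 bytes to UTF-8 by passing them through an InputStreamReader and an OutputStreamWriter. The class name TranscodeDemo, the helper transcode(), and the sample word are invented for illustration, and the code assumes Java 7 or later.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; names and sample data are illustrative only.
class TranscodeDemo {

    // Decodes the input bytes with one character set and re-encodes them with another.
    static byte[] transcode(byte[] input, Charset from, Charset to) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (Reader reader = new InputStreamReader(new ByteArrayInputStream(input), from);
             Writer writer = new OutputStreamWriter(sink, to)) {
            char[] buffer = new char[1024];
            int n;
            while ((n = reader.read(buffer)) != -1) {
                writer.write(buffer, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // The writer is closed by now, so every encoded byte has reached the sink.
        return sink.toByteArray();
    }

    public static void main(String[] args) {
        // The word "cafe" with an acute accent on the e, encoded in ISO 8859-1.
        byte[] latin1 = {0x63, 0x61, 0x66, (byte) 0xE9};
        byte[] utf8 = transcode(latin1, StandardCharsets.ISO_8859_1, StandardCharsets.UTF_8);
        System.out.println(latin1.length + " bytes in, " + utf8.length + " bytes out");
    }
}
```

The accented letter occupies one byte in ISO 8859-1 and two in UTF-8, so the output is one byte longer than the input even though the text is the same.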

Ken Lunde's CJKV Information Processing: Chinese, Japanese, Korean & Vietnamese Computing (O'Reilly & Associates, 1999, ISBN 1-56592-224-7) is the most comprehensive English language reference to developing code for ideographic and other Far Eastern languages and scripts. To some extent this book is based on his free CJK.INF file available from ftp://ftp.ora.com/pub/examples/nutshell/ujip/doc/cjk.inf.

Finally, for a fascinating look at the world's languages and the scripts they use, check out Kenneth Katzner's Languages of the World (Routledge, 1995). This small paperback describes and provides samples of about 500 languages, from the extremely popular (English and Chinese) to the painfully obscure (Romansch, Komi, Ostyak).

Java Communications API

This may well be the first book to cover the Java Communications API. Sun includes a limited amount of documentation with the Java Communications API itself, mostly javadoc class library documentation. The latter is also available from Sun's web site at http://java.sun.com/products/javacomm/javadocs/packages.html.

The RS232 serial port and IEEE1284 parallel port standards predate the Web and widespread use of the Internet. Thus these standards are still available only on dead trees for the moment. A number of books do cover them in reasonable detail including Scott Mueller's Upgrading and Repairing PCs, 10th edition (Que, 1998).

Several books discuss writing port-aware programs in a variety of languages. Although none yet use Java, it's generally not hard to translate from the low-level C or Basic code to the equivalent code that uses the Java Communications API. The best book I've found for parallel ports is Jan Axelson's Parallel Port Complete (Lakeview Research, 1996, ISBN 096508191-5).

There are more choices for serial port books, but the most comprehensive one is certainly Joe Campbell's C Programmer's Guide to Serial Communications (Sams, 1993, ISBN 0-672-30286-1). Despite the title, the first half of this 900 page tome is an exhaustive treatment of more or less language-independent serial communication hardware and protocols from 19th century telegraphy to the present day.

Updates and Breaking News

In the fast-moving world of Java it's an effort to publish a book that isn't out of date by the time it reaches store shelves. Most of what I've written about in this book seems fairly stable. However, there will undoubtedly be many new developments after publication. The following three web sites can help you stay abreast of new technologies and strategies for Java I/O.

Cafe au Lait

My Cafe au Lait site at http://metalab.unc.edu/javafaq/ features almost daily news updates about Java topics. I pay special attention to new material that's closely related to my books like I/O and networking libraries. Cafe au Lait also features many resources to help you develop your Java programming skills including FAQ lists, tutorials, course notes, examples, exercises, book reviews, and more. Of particular interest will be the Java I/O page at http://metalab.unc.edu/javafaq/books/javaio/. I'll post corrections and updates to this book there as necessary.

java.oreilly.com

O'Reilly's official Java site at http://java.oreilly.com/ contains feature articles and links to the official O'Reilly sites for all our Java books. You can peruse the rather impressive O'Reilly Java catalog (18 books and counting) and view descriptions, author bios, tables of contents, indexes, reviews, exercises, examples, errata, and reader comments for all the books (including this one).

JavaWorld

I/O isn't the sexiest topic in the programming community but it is one of the most important. IDG's JavaWorld (http://www.javaworld.com/) is to be commended for treating I/O on an equal footing with sexier topics like JavaBeans and the Java Media APIs. JavaWorld publishes monthly how-to articles, book reviews, news, and more. They're particularly notable for providing short, technical articles that show you how to do things Sun's only hinted at and how to work around common problems programmers face.

Appendix B: Character Sets

The Unicode Standard Version 2.0, by the Unicode Consortium, ISBN 0-201-48345-9. Updates to that book can be found at http://www.unicode.org/. Table C-4 lists the encodings that Java, javac, and native2ascii understand. Detailed information about how these character sets map to Unicode can be found in the various files at ftp://ftp.unicode.org/Public/MAPPINGS/.



Copyright 1999 Elliotte Rusty Harold
elharo@metalab.unc.edu
Last Modified December 16, 1999