Questions in 1999

July 6, 1999: Open Source Development Tools

I'm currently thinking about revising my Intro to Java Programming course to split it across two semesters, one focusing on basic object-oriented principles and techniques and the other focusing on GUI design and programming. Right now I'm thinking about the first semester, and part of what I'd like to do is use better tools such as debuggers, profilers, black-box and white-box testers, and so forth. This is in an academic setting where I can't require students to purchase anything, and where I don't want them to depend on proprietary systems. Students are about evenly split between Unix and Windows with an occasional Mac thrown in for spice. Consequently, I am looking for free, preferably open source, and preferably cross-platform tools including:

Profilers: The JDK 1.2 includes basic profiling information as an undocumented option to java. Is there anything better?
Testing Tools: Right now I'm working with Kent Beck and Erich Gamma's JUnit. What else is out there? Are there any automated testing tools? Either white box or black box? Source code or byte code? Is there such a thing as Lint for Java? I'm aware of JProbe but it's way too expensive for use in a class.
Debuggers: All I know of so far is jdb. Are there any others? Has anyone written any good documentation or tutorials for jdb?
IDEs: One of the hardest things for novices to grasp is the CLASSPATH and associated connections between .class and .java files in different packages and directories. I'd say half of my students spend hours banging their heads against really trivial problems when I first introduce packages. I'd like a simple IDE that isolates programmers from CLASSPATH details. Otherwise it should have pluggable features that allow the easy integration of different editors, compilers, runtimes, profilers, debuggers, and other tools. I do not specifically need visual RAD (rapid application development) tools at this time.

Before responding, please note what I am not looking for. I am aware of the various payware solutions like Visual Cafe, JBuilder, Visual Age for Java, NetBeans, and so forth (including their watered-down academic and evaluation versions). These most definitely do not suit my needs for this project. Beyond that, any suggestions you have are greatly appreciated. Please email them to elharo@ibiblio.org. No free books this week, but I hope to eventually incorporate the responses into an improved course for everyone to learn from.

Responses

June 18, 1999: Strange behavior in java.io.InputStream

The multi-byte read() method in java.io.InputStream and its subclasses has this signature:

public int read(byte[] input, int offset, int length) throws IOException

This method attempts to read up to length bytes of data from the input stream into the array input, beginning at the index offset, and returns the number of bytes actually read. Now consider this code fragment. Pay special attention to array indexes. The in variable is an InputStream from the standard class library.

    byte[] input = new byte[5];
    
    try { // read five bytes from in into input, filling the whole array
      in.read(input, 0, 5);
      System.out.println("in.read(input, 0, 5) succeeded");
    }
    catch (Exception e) {
      System.out.println("in.read(input, 0, 5) failed");
    }


    try { // zero-length read at index 5, one past the end of input
      in.read(input, 5, 0);
      System.out.println("in.read(input, 5, 0) succeeded");
    }
    catch (Exception e) {
      System.out.println("in.read(input, 5, 0) failed");
    }

    try { // zero-length read at index 6, two past the end of input
      in.read(input, 6, 0);
      System.out.println("in.read(input, 6, 0) succeeded");
    }
    catch (Exception e) {
      System.out.println("in.read(input, 6, 0) failed");
    }

There are three reads here: one read of five bytes starting at 0, one read of 0 bytes starting at 5, and one read of 0 bytes starting at 6. The input array has five bytes with indices 0 through 4. Thus both 5 and 6 are out of bounds for this array. Assuming in can provide at least five bytes of data, which of these reads succeed and which fail?

Most people's first reaction is that the first read succeeds and the second two fail with ArrayIndexOutOfBoundsExceptions. Most people's second reaction, on further reflection, is that maybe all three reads succeed because the second two don't actually read any bytes and don't need to store anything in the input array. In fact, the truth is stranger still. The first two reads succeed; the third fails. If you don't believe me, run the code and see. All multi-byte read() methods in the Sun-supplied input stream classes behave like this.
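For anyone who wants to try it, here is a minimal self-contained sketch of the fragment above. It assumes in is a ByteArrayInputStream over ten bytes of dummy data, which is my choice for illustration only; the class name ReadBoundsDemo and the helper attempt() are likewise hypothetical. It catches IndexOutOfBoundsException, the superclass of ArrayIndexOutOfBoundsException, since the exact exception class may vary between library versions.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

class ReadBoundsDemo {

  // Attempts one read; returns false if it threw an index exception.
  static boolean attempt(InputStream in, byte[] input, int offset, int length)
   throws IOException {
    try {
      in.read(input, offset, length);
      return true;
    }
    catch (IndexOutOfBoundsException e) {
      return false;
    }
  }

  public static void main(String[] args) throws IOException {
    // assumption: ten bytes of dummy data, more than enough for a five-byte read
    InputStream in = new ByteArrayInputStream(new byte[10]);
    byte[] input = new byte[5];

    // the same three (offset, length) pairs as in the fragment above
    int[][] calls = { {0, 5}, {5, 0}, {6, 0} };
    for (int[] call : calls) {
      boolean ok = attempt(in, input, call[0], call[1]);
      System.out.println("in.read(input, " + call[0] + ", " + call[1] + ") "
       + (ok ? "succeeded" : "failed"));
    }
  }
}
```

Run against the JDK's ByteArrayInputStream, this should report that the first two calls succeeded and the third failed.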

How this happens is not today's question. That's easy to answer through a quick peek at the source code, which looks a lot like this. (This next fragment is actually taken from Example 6-9 in Java I/O, DumpFilter, since I don't want to reveal Sun's source code and contaminate anyone doing a clean-room implementation, but all the Sun classes behave the same.)

  public int read(byte[] data, int offset, int length) throws IOException {
  
    if (data == null) {
      throw new NullPointerException();
    } 
    else if ((offset < 0) || (offset > data.length) || (length < 0) 
     || ((offset + length) > data.length) || ((offset + length) < 0)) {
      throw new ArrayIndexOutOfBoundsException();
    } 
    else if (length == 0) {
      return 0;
    }

    // check for end of stream
    int datum = this.read();
    if (datum == -1) {
      return -1;
    }
    
    data[offset] = (byte) datum;

    int bytesRead = 1;
    try {
      for (; bytesRead < length ; bytesRead++) {
      
        datum = this.read();
        
        // in case of end of stream, return as much as we've got,
        //  then wait for the next call to read to return -1
        if (datum == -1) break;
        data[offset + bytesRead] = (byte) datum;
      }
    }
    catch (IOException e) {
      // return what's already in the data array
    }
    
    return bytesRead;   
    
  }

The strange behavior is all a result of this if-else if-else construct:

    if (data == null) {
      throw new NullPointerException();
    } 
    else if ((offset < 0) || (offset > data.length) || (length < 0) 
     || ((offset + length) > data.length) || ((offset + length) < 0)) {
      throw new ArrayIndexOutOfBoundsException();
    } 
    else if (length == 0) {
      return 0;
    }
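To trace the three calls through this test, the conditions can be pulled into a standalone helper. This is only a sketch: the class BoundsTrace and the method rejects() are hypothetical names, not part of any library, and the array length of 5 matches the earlier fragment.

```java
class BoundsTrace {

  // Returns true if the argument check would throw for these values.
  // Mirrors the else-if condition term by term; dataLength stands in
  // for data.length.
  static boolean rejects(int dataLength, int offset, int length) {
    return (offset < 0) || (offset > dataLength) || (length < 0)
     || ((offset + length) > dataLength) || ((offset + length) < 0);
  }

  public static void main(String[] args) {
    int dataLength = 5; // valid indices are 0 through 4

    // 0 > 5 is false and 0 + 5 > 5 is false, so the call is accepted
    System.out.println("read(input, 0, 5) rejected: " + rejects(dataLength, 0, 5));
    // 5 > 5 is false and 5 + 0 > 5 is false, so the call is accepted
    System.out.println("read(input, 5, 0) rejected: " + rejects(dataLength, 5, 0));
    // 6 > 5 is true, so the call is rejected
    System.out.println("read(input, 6, 0) rejected: " + rejects(dataLength, 6, 0));
  }
}
```

An accepted zero-length call then falls through to the length == 0 branch and returns 0 without ever touching the array.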

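In particular, the strange behavior is a result of the tests (offset > data.length) and ((offset + length) > data.length). For a zero-length read, both reduce to offset > data.length, so an offset of 5 passes and an offset of 6 does not. If the first test were (offset >= data.length) instead, the ArrayIndexOutOfBoundsException would be thrown whenever the offset was out of bounds for the array. (The second test can't simply be changed to >=, because that would also reject legitimate reads that fill the array to its end, such as read(input, 0, 5).) However, as matters stand now it's only thrown if the offset exceeds the last valid index by at least 2. Now after all that setup, here's the question of the week: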

Why is read(byte[] data, int offset, int length) implemented in this fashion? What was going through Sun's heads when they designed this? Is this a bug? An oversight? Or is there a deliberate reason to allow zero-byte reads into the last-plus-one element of the array, but not into subsequent non-existent elements? What do you think?

A free signed copy of Java I/O goes to the best answer to this question. Thanks once again to David Vriend for suggesting this question.

Responses

June 3, 1999

David Vriend noted a problem in Example 3-3 of Java I/O, StreamCopier, as well as several similar examples from that book. The copy() method attempts to synchronize on the input and output streams to "not allow other threads to read from the input or write to the output while copying is taking place". Here's the relevant method:

  public static void copy(InputStream in, OutputStream out) 
   throws IOException {

    // do not allow other threads to read from the
    // input or write to the output while copying is
    // taking place
    
    synchronized (in) {
      synchronized (out) {

        byte[] buffer = new byte[256];
        while (true) {
          int bytesRead = in.read(buffer);
          if (bytesRead == -1) break;
          out.write(buffer, 0, bytesRead);
        }
      }
    }
  }

However, this only helps if the other threads using those streams are also kind enough to synchronize on them. In the general case, that seems unlikely. The question is this: is there any way to guarantee thread safety in a method like this when:

You're trying to write a library routine to be used by many different programmers in their own programs, so you can't count on the rest of the program outside this utility class being written in a thread-safe fashion.
You have not written the underlying classes that need to be thread safe (InputStream and OutputStream in this example) so you can't add synchronization directly to them.
Wrapping the unsynchronized classes in a synchronized class is insufficient because the underlying unsynchronized class may still be exposed to other classes and threads.
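The limitation behind this question, that locking a stream only excludes threads that take the same lock, can be seen in a small sketch. Everything in it is hypothetical and for illustration only: the class LockDemo, the unsynchronized CountingStream, and the helper finishesWhileLocked() are names I made up, and the half-second timeout is an arbitrary choice. It also assumes a Java version with lambda expressions.

```java
import java.io.IOException;
import java.io.InputStream;

class LockDemo {

  // Hypothetical stream with an unsynchronized read(), standing in for
  // any InputStream whose methods take no locks of their own.
  static class CountingStream extends InputStream {
    private int next = 0;

    @Override
    public int read() {
      return next < 100 ? next++ : -1;
    }
  }

  // Holds the lock on in, starts a worker thread running task, and
  // reports whether the worker finished while the lock was still held.
  static boolean finishesWhileLocked(InputStream in, Runnable task)
   throws InterruptedException {
    Thread worker = new Thread(task);
    boolean finished;
    synchronized (in) {
      worker.start();
      worker.join(500); // wait up to half a second (arbitrary)
      finished = !worker.isAlive();
    }
    worker.join(); // lock released; let the worker run to completion
    return finished;
  }

  public static void main(String[] args) throws InterruptedException {
    InputStream in = new CountingStream();

    // an uncooperative thread: reads without synchronizing on in
    Runnable rude = () -> {
      try { in.read(); } catch (IOException e) { }
    };
    // a cooperative thread: takes the same lock before reading
    Runnable polite = () -> {
      synchronized (in) {
        try { in.read(); } catch (IOException e) { }
      }
    };

    System.out.println("unsynchronized reader got through: "
     + finishesWhileLocked(in, rude));
    System.out.println("synchronized reader got through: "
     + finishesWhileLocked(in, polite));
  }
}
```

The unsynchronized reader should complete its read even though another thread holds the lock on the stream, while the synchronized reader has to wait until the lock is released. That is exactly the gap a library routine like copy() cannot close on its own.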

Note that although the specific instance of this question deals with streams, the actual question is really more about threading. Since anyone answering this question probably already has a copy of Java I/O, I'll send out a free copy of XML: Extensible Markup Language for the best answer.

Responses