Skip to Main Content

Java APIs

Announcement

For appeals, questions and feedback about Oracle Forums, please email oracle-forums-moderators_us@oracle.com. Technical questions should be asked in the appropriate category. Thank you!

Selector.select(timeout) seems to sometimes "timeout" way too soon

942985Jun 11 2012 — edited Jun 11 2012
Hi,

We're using NIO for reading and writing from Telnet sockets, and occasionally we're finding that java.nio.channels.Selector.select(timeout) seems to return prematurely. It's supposed to return "only after at least one channel is selected, this selector's wakeup method is invoked, the current thread is interrupted, or the given timeout period expires, whichever comes first" but we're seeing it return when none of those conditions are true. With a 15-second timeout it's returning after about 3 seconds.

I'm posting here before making a bug report to see if any of you all can catch something we're doing wrong. Also because we frustratingly do not have a repro in house - we are only seeing this on a customer's site. Distressingly, the bug only happens if we use a newer jre. With version 1.6.0_07 there is no repro; with 1.6.0_26 and 1.7.0_04 there is.

I see a similar bug here which was resolved as not a bug: http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=5035864
However we're not using the return value the same way.

We're creating the selector like this:
    public void connect(String hostname, int port)  {
       
        // hook up the underlying socket
        InetSocketAddress addr;
        try {
            // open socket
            addr = new InetSocketAddress(InetAddress.getByName(hostname), port);
            channel = SocketChannel.open();
            channel.configureBlocking(true); // while connecting
            channel.connect(addr);

            // switch to non-blocking mode now that we've connected okay
            channel.configureBlocking(false);

            // open selector
            selector = Selector.open();
            channel.register(selector, SelectionKey.OP_READ);
     
        } catch (Exception  e) {
            //handle
        } 
    }
And using it thus:
    private void selectRead(Selector selector, long requestStartMilli, long timeoutMilli) throws IOException {
         long remainingMilli;
        if (timeoutMilli == 0) {  // timeout of 0 also means block indefinitely 
            remainingMilli = 0;
        } else {
            remainingMilli = timeoutMilli - (System.currentTimeMillis() - requestStartMilli) ;
            if (remainingMilli < 0) {
                // timeout elapsed since the last time we blocked, so short circuit
                log.debug("[TSC] Timeout elapsed since last read");
                throw new IOTimeoutException("Timeout elapsed since last read");
            }
        }


        // block for an event
           log.debug("requestStartMilli " + requestStartMilli + " timeoutMilli " + timeoutMilli + " remainingMilli " + remainingMilli);
           log.debug("current nanos before calling connect " + System.nanoTime() );
	Log.debug(" total number of keys "+  selector.keys().size() + " selected keys " + selector.selectedKeys());
	
           int numkeys = selector.select(remainingMilli);

          log.debug("current nanos after calling connect " + System.nanoTime() + " got num keys " + numkeys );

            //  has something bad happened?
           if (!selector.isOpen() || !channel.isOpen())     throw new IOException("Channel or selector not open");     
            if (Thread.interrupted())                                     throw new InterruptedIOException(); 

        Set<SelectionKey> selectedKeys = selector.selectedKeys();
        Iterator<SelectionKey> selectedKeysItr = selectedKeys.iterator();
        if (selectedKeysItr.hasNext()) {
            SelectionKey selected = selectedKeysItr.next();
            selectedKeysItr.remove();

	 // do stuff           
	
        } else {
            // the timeout elapsed with no data from the server
            log.debug("[TSC] Timeout elapsed while blocking");
            throw new IOTimeoutException("Timeout elapsed blocking");
        }

    }
About once every 20th time we connect and read, we see the inappropriately fast return.
2012-06-08 13:37:40,703 DEBUG  requestStartMilli 1339177060687 timeoutMilli 15000 remainingMilli 14984 additionalTimeoutMilli 0
2012-06-08 13:37:40,703 DEBUG  current nanos before calling connect 341011346120112
2012-06-08 13:37:40,703 DEBUG  total number of keys 1 selected keys []
2012-06-08 13:37:40,718 DEBUG  current nanos after calling connect  341011349665502 got num keys 0
2012-06-08 13:37:40,718 DEBUG  [TSC] Timeout elapsed while blocking
Are we doing something wrong? Can anybody guess why we'd be unable to repro in house? Any other theories?

Thanks very much.
Comments
Locked Post
New comments cannot be posted to this locked post.
Post Details
Locked on Jul 9 2012
Added on Jun 11 2012
7 comments
2,049 views