GnuParser flatten case study

From Ssdlpedia

Jump to: navigation, search

This example demonstrates the use of spartan programming for the simplification of part of the Apache implementation of a parser for command line options, as found in class GnuParser in the org.apache.commons.cli package. The particular version used was 1.1, as downloaded on July 19th, 2008 from [1].

In doing these simplifications, no real attempt was made at deep understanding of the semantics of the code. Most transformations could have been applied by an automatic tool. Deeper understanding could have lead to a more major rewrite, resulting in simpler code, or at fixing the bugs found in it in a different manner.

Contents

[edit] The Input

Our input consists of a single class, with three members:

  • A single data data member, defined by
private ArrayList tokens = new ArrayList();
  • An initialized method named init:
private void init()
{
    tokens.clear();
}
  • A 116-lines, 532-tokens line function named flatten.

We will be interested mostly in function flatten which makes the bulk of the input. Initially, the class reads

/**
 * Licensed to the Apache Software Foundation (ASF) under one or more
 * contributor license agreements.  See the NOTICE file distributed with
 * this work for additional information regarding copyright ownership.
 * The ASF licenses this file to You under the Apache License, Version 2.0
 * (the "License"); you may not use this file except in compliance with
 * the License.  You may obtain a copy of the License at
 *
 *     http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */
package org.apache.commons.cli;
 
import java.util.ArrayList;
 
/**
 * The class GnuParser provides an implementation of the 
 * {@link Parser#flatten(Options,String[],boolean) flatten} method.
 *
 * @author John Keyes (john at integralsource.com)
 * @see Parser
 * @version $Revision: 542151 $
 */
public class GnuParser extends Parser {
 
    /** holder for flattened tokens */
    private ArrayList tokens = new ArrayList();
 
    /**
     * <p>Resets the members to their original state i.e. remove
     * all of <tt>tokens</tt> entries.
     */
    private void init()
    {
        tokens.clear();
    }
 
    /**
     * <p>This flatten method does so using the following rules:
     * <ol>
     *  <li>If an {@link Option} exists for the first character of 
     *  the <tt>arguments</tt> entry <b>AND</b> an {@link Option} 
     *  does not exist for the whole <tt>argument</tt> then
     *  add the first character as an option to the processed tokens
     *  list e.g. "-D" and add the rest of the entry to the also.</li>
     *  <li>Otherwise just add the token to the processed tokens list.
     *  </li>
     * </ol>
     * </p>
     *
     * @param options The Options to parse the arguments by.
     * @param arguments The arguments that have to be flattened.
     * @param stopAtNonOption specifies whether to stop 
     * flattening when a non option has been encountered
     * @return a String array of the flattened arguments
     */
    protected String[] flatten(Options options, String[] arguments, 
                               boolean stopAtNonOption)
    {
        init();
 
        boolean eatTheRest = false;
        Option currentOption = null;
 
        for (int i = 0; i < arguments.length; i++)
        {
            if ("--".equals(arguments[i]))
            {
                eatTheRest = true;
                tokens.add("--");
            }
            else if ("-".equals(arguments[i]))
            {
                tokens.add("-");
            }
            else if (arguments[i].startsWith("-"))
            {
                Option option = options.getOption(arguments[i]);
 
                // this is not an Option
                if (option == null)
                {
                    // handle special properties Option
                    Option specialOption = 
                            options.getOption(arguments[i].substring(0, 2));
 
                    if (specialOption != null)
                    {
                        tokens.add(arguments[i].substring(0, 2));
                        tokens.add(arguments[i].substring(2));
                    }
                    else if (stopAtNonOption)
                    {
                        eatTheRest = true;
                        tokens.add(arguments[i]);
                    }
                    else
                    {
                        tokens.add(arguments[i]);
                    }
                }
                else
                {
                    // WARNING: Findbugs reports major problems with the following code. 
                    //          As option cannot be null, currentOption cannot and 
                    //          much of the code below is never going to be run.
 
                    currentOption = option;
 
                    // special option
                    Option specialOption = 
                            options.getOption(arguments[i].substring(0, 2));
 
                    if ((specialOption != null) && (option == null))
                    {
                        tokens.add(arguments[i].substring(0, 2));
                        tokens.add(arguments[i].substring(2));
                    }
                    else if ((currentOption != null) && currentOption.hasArg())
                    {
                        if (currentOption.hasArg())
                        {
                            tokens.add(arguments[i]);
                            currentOption = null;
                        }
                        else if (currentOption.hasArgs())
                        {
                            tokens.add(arguments[i]);
                        }
                        else if (stopAtNonOption)
                        {
                            eatTheRest = true;
                            tokens.add("--");
                            tokens.add(arguments[i]);
                        }
                        else
                        {
                            tokens.add(arguments[i]);
                        }
                    }
                    else if (currentOption != null)
                    {
                        tokens.add(arguments[i]);
                    }
                    else if (stopAtNonOption)
                    {
                        eatTheRest = true;
                        tokens.add("--");
                        tokens.add(arguments[i]);
                    }
                    else
                    {
                        tokens.add(arguments[i]);
                    }
                }
            }
            else
            {
                tokens.add(arguments[i]);
            }
 
            if (eatTheRest)
            {
                for (i++; i < arguments.length; i++)
                {
                    tokens.add(arguments[i]);
                }
            }
        }
 
        return (String[]) tokens.toArray(new String[tokens.size()]);
    }
}

[edit] Almost Automatic Simplification

Applying braces elimination to this function reduces the number of tokens complexity metric by 5% to 503. Laying out the function in K&R formatting style reduces its line number by 44% to 63. The resulting function is

protected String[] flatten(Options options, String[] arguments, boolean stopAtNonOption) {
    init();
    boolean eatTheRest = false;
    Option currentOption = null;
    for (int i = 0; i < arguments.length; i++) {
        if ("--".equals(arguments[i])) {
            eatTheRest = true;
            tokens.add("--");
        } else if ("-".equals(arguments[i]))
            tokens.add("-");
        else if (arguments[i].startsWith("-")) {
             Option option = options.getOption(arguments[i]);
            // this is not an Option
            if (option == null) {
                // handle special properties Option
                 Option specialOption = options.getOption(arguments[i].substring(0, 2));
                if (specialOption != null) {
                    tokens.add(arguments[i].substring(0, 2));
                    tokens.add(arguments[i].substring(2));
                } else if (stopAtNonOption) {
                    eatTheRest = true;
                    tokens.add(arguments[i]);
                } else
                    tokens.add(arguments[i]);
            } else {
                // WARNING: Findbugs reports major problems with the following code.
                // As option cannot be null, currentOption cannot and
                // much of the code below is never going to be run.
                currentOption = option;
                // special option
                 Option specialOption = options.getOption(arguments[i].substring(0, 2));
                if (specialOption != null && option == null) {
                    tokens.add(arguments[i].substring(0, 2));
                    tokens.add(arguments[i].substring(2));
                } else if (currentOption != null && currentOption.hasArg()) {
                    if (currentOption.hasArg()) {
                        tokens.add(arguments[i]);
                        currentOption = null;
                    } else if (currentOption.hasArgs())
                        tokens.add(arguments[i]);
                    else if (stopAtNonOption) {
                        eatTheRest = true;
                        tokens.add("--");
                        tokens.add(arguments[i]);
                    } else
                        tokens.add(arguments[i]);
                } else if (currentOption != null)
                    tokens.add(arguments[i]);
                else if (stopAtNonOption) {
                    eatTheRest = true;
                    tokens.add("--");
                    tokens.add(arguments[i]);
                } else
                    tokens.add(arguments[i]);
            }
        } else
            tokens.add(arguments[i]);
        if (eatTheRest)
            for (i++; i < arguments.length; i++)
                tokens.add(arguments[i]);
    }
    return tokens.toArray(new String[tokens.size()]);
}

Reformatting highlights a comment in the code which indicates a potential problem.

// WARNING: Findbugs reports major problems with the following code.
// As option cannot be null, currentOption cannot and
// much of the code below is never going to be run.

We will ignore this problem for now.

[edit] Early Exits

Examining the body of the main loop, we see that it has two parts:

  • A set of complex conditionals, determining what to do with the currently processed argument.
  • A loop for consuming the rest of the arguments, in case processing of the current argument reveals that this is the appropriate action.

Since the different cases of processing the argument in the first part are mutually exclusive, and the processing of each of these is separate than the others, it makes sense to apply the early exits technique. The only concern is that in some of the cases, marked by eatTheRest = true the function needs to consume the rest of the arguments. The current code uses complex control to achieve this goal: an internal, second loop, advances the index variable of the outer main loop. When the internal loop terminates, the outer loop discovers that it has run out of legal values for the index variable, and terminates accordingly.

The following transformation unnests the second loop and eliminates the eatTheRes flag variable. If in the course of the first loop it is discovered that the rest of the arguments are to be consumed, then iteration of this loop is explicitly stopped with the use of a break statement. The result is

protected String[] flatten( Options options,  String[] arguments,  boolean stopAtNonOption) {
    init();
    Option currentOption = null;
    int i;
    for (i = 0; i < arguments.length; i++)
        if ("--".equals(arguments[i])) {
            tokens.add("--");
            break;
        } else if ("-".equals(arguments[i]))
            tokens.add("-");
        else if (arguments[i].startsWith("-")) {
             Option option = options.getOption(arguments[i]);
            // this is not an Option
            if (option == null) {
                // handle special properties Option
                 Option specialOption = options.getOption(arguments[i].substring(0, 2));
                if (specialOption != null) {
                    tokens.add(arguments[i].substring(0, 2));
                    tokens.add(arguments[i].substring(2));
                } else if (stopAtNonOption) {
                    tokens.add(arguments[i]);
                    break;
                } else
                    tokens.add(arguments[i]);
            } else {
                // WARNING: Findbugs reports major problems with the following code.
                // As option cannot be null, currentOption cannot and
                // much of the code below is never going to be run.
                currentOption = option;
                // special option
                 Option specialOption = options.getOption(arguments[i].substring(0, 2));
                if (specialOption != null && option == null) {
                    tokens.add(arguments[i].substring(0, 2));
                    tokens.add(arguments[i].substring(2));
                } else if (currentOption != null && currentOption.hasArg()) {
                    if (currentOption.hasArg()) {
                        tokens.add(arguments[i]);
                        currentOption = null;
                    } else if (currentOption.hasArgs())
                        tokens.add(arguments[i]);
                    else if (stopAtNonOption) {
                        tokens.add("--");
                        tokens.add(arguments[i]);
                        break;
                    } else
                        tokens.add(arguments[i]);
                } else if (currentOption != null)
                    tokens.add(arguments[i]);
                else if (stopAtNonOption) {
                    tokens.add("--");
                    tokens.add(arguments[i]);
                    break;
                } else
                    tokens.add(arguments[i]);
            }
        } else
            tokens.add(arguments[i]);
    for (; i < arguments.length; i++)
        tokens.add(arguments[i]);
    return tokens.toArray(new String[tokens.size()]);
}

This time, the number of lines is reduced by 3% to 61 and the number of tokens drops by 4%.

[edit] Remvoing Redundant Else

The first loop body is still a complex tree of conditionals. By removing all else statements which follow a break we obtain:

protected String[] flatten( Options options,  String[] arguments,  boolean stopAtNonOption) {
    init();
    Option currentOption = null;
    int i;
    for (i = 0; i < arguments.length; i++) {
        if ("--".equals(arguments[i])) {
            tokens.add("--");
            break;
        }
        if ("-".equals(arguments[i]))
            tokens.add("-");
        else if (arguments[i].startsWith("-")) {
             Option option = options.getOption(arguments[i]);
            // this is not an Option
            if (option == null) {
                // handle special properties Option
                 Option specialOption = options.getOption(arguments[i].substring(0, 2));
                if (specialOption != null) {
                    tokens.add(arguments[i].substring(0, 2));
                    tokens.add(arguments[i].substring(2));
                } else if (stopAtNonOption) {
                    tokens.add(arguments[i]);
                    break;
                }
                tokens.add(arguments[i]);
            } else {
                // WARNING: Findbugs reports major problems with the following code.
                // As option cannot be null, currentOption cannot and
                // much of the code below is never going to be run.
                currentOption = option;
                // special option
                 Option specialOption = options.getOption(arguments[i].substring(0, 2));
                if (specialOption != null && option == null) {
                    tokens.add(arguments[i].substring(0, 2));
                    tokens.add(arguments[i].substring(2));
                } else if (currentOption != null && currentOption.hasArg()) {
                    if (currentOption.hasArg()) {
                        tokens.add(arguments[i]);
                        currentOption = null;
                    } else if (currentOption.hasArgs())
                        tokens.add(arguments[i]);
                    else {
                        if (stopAtNonOption) {
                            tokens.add("--");
                            tokens.add(arguments[i]);
                            break;
                        }
                        tokens.add(arguments[i]);
                    }
                } else {
                    if (currentOption != null)
                        tokens.add(arguments[i]);
                    else if (stopAtNonOption) {
                        tokens.add("--");
                        tokens.add(arguments[i]);
                        break;
                    }
                    tokens.add(arguments[i]);
                }
            }
        } else
            tokens.add(arguments[i]);
    }
    for (; i < arguments.length; i++)
        tokens.add(arguments[i]);
    return tokens.toArray(new String[tokens.size()]);
}

The main loop now has two nested statements. The first is a rather simple conditional, while the second is structured as follows:

if ("-".equals(arguments[i]))
    tokens.add("-");
else if (arguments[i].startsWith("-")) {
     // ... 47 lines
| else
    tokens.add(arguments[i]);

We rewrite this statement into:

if ("-".equals(arguments[i])) {
    tokens.add("-");
    continue;
} 
if (!arguments[i].startsWith("-")) {
    tokens.add(arguments[i]);
    continue;
}
// ... 47 lines

Several things happened here:

  • Since the true branch of the conditional if (arguments[i].startsWith("-")) is much longer than the false branch of this conditional (47 lines vs. only 1 line), the condition was negated to make the shorter branch appear first.
  • To reduce horizontal complexity, we added an early exit at this short branch with a continue statement, and eliminated the redundant else.
  • A similar early continue technique was applied to the conditional if ("-".equals(arguments[i]))

Overall the code now looks:

protected String[] flatten( Options options,  String[] arguments,  boolean stopAtNonOption) {
    init();
    Option currentOption = null;
    int i;
    for (i = 0; i < arguments.length; i++) {
        if ("--".equals(arguments[i])) {
            tokens.add("--");
            break;
        }
        if ("-".equals(arguments[i])) {
            tokens.add("-");
            continue;
        }
        if (!arguments[i].startsWith("-")) {
            tokens.add(arguments[i]);
            continue;
        }
         Option option = options.getOption(arguments[i]);
        // this is not an Option
        if (option == null) {
            // handle special properties Option
             Option specialOption = options.getOption(arguments[i].substring(0, 2));
            if (specialOption != null) {
                tokens.add(arguments[i].substring(0, 2));
                tokens.add(arguments[i].substring(2));
            } else if (stopAtNonOption) {
                tokens.add(arguments[i]);
                break;
            }
            tokens.add(arguments[i]);
        } else {
            // WARNING: Findbugs reports major problems with the following code.
            // As option cannot be null, currentOption cannot and
            // much of the code below is never going to be run.
            currentOption = option;
            // special option
             Option specialOption = options.getOption(arguments[i].substring(0, 2));
            if (specialOption != null && option == null) {
                tokens.add(arguments[i].substring(0, 2));
                tokens.add(arguments[i].substring(2));
            } else if (currentOption != null && currentOption.hasArg()) {
                if (currentOption.hasArg()) {
                    tokens.add(arguments[i]);
                    currentOption = null;
                } else if (currentOption.hasArgs())
                    tokens.add(arguments[i]);
                else {
                    if (stopAtNonOption) {
                        tokens.add("--");
                        tokens.add(arguments[i]);
                        break;
                    }
                    tokens.add(arguments[i]);
                }
            } else {
                if (currentOption != null)
                    tokens.add(arguments[i]);
                else if (stopAtNonOption) {
                    tokens.add("--");
                    tokens.add(arguments[i]);
                    break;
                }
                tokens.add(arguments[i]);
            }
        }
    }
    for (; i < arguments.length; i++)
        tokens.add(arguments[i]);
    return tokens.toArray(new String[tokens.size()]);
}

The result is a bit longer, but we have reduced the nesting level of 47 lines of code, the bulk of the loop body, is reduced by 1. We repeat this process further, by inserting early exits with continue at the end of the body of all branches of all conditionals, and removing all redundant continue statements.

protected String[] flatten(Options options, String[] arguments, boolean stopAtNonOption) {
    init();
    Option currentOption = null;
    int i;
    for (i = 0; i < arguments.length; i++) {
        if ("--".equals(arguments[i])) {
            tokens.add("--");
            break;
        }
        if ("-".equals(arguments[i])) {
            tokens.add("-");
            continue;
        }
        if (!arguments[i].startsWith("-")) {
            tokens.add(arguments[i]);
            continue;
        }
        Option option = options.getOption(arguments[i]);
        // this is not an Option
        if (option == null) {
            // handle special properties Option
            Option specialOption = options.getOption(arguments[i].substring(0, 2));
            if (specialOption != null) {
                tokens.add(arguments[i].substring(0, 2));
                tokens.add(arguments[i].substring(2));
                continue;
            }
            if (stopAtNonOption) {
                tokens.add(arguments[i]);
                break;
            }
            tokens.add(arguments[i]);
            continue;
        }
        // WARNING: Findbugs reports major problems with the following code.
        // As option cannot be null, currentOption cannot and
        // much of the code below is never going to be run.
        currentOption = option;
        // special option
        Option specialOption = options.getOption(arguments[i].substring(0, 2));
        if (specialOption != null && option == null) {
            tokens.add(arguments[i].substring(0, 2));
            tokens.add(arguments[i].substring(2));
            continue;
        }
        if (currentOption != null && currentOption.hasArg()) {
            if (currentOption.hasArg()) {
                tokens.add(arguments[i]);
                currentOption = null;
                continue;
            }
            if (currentOption.hasArgs()) {
                tokens.add(arguments[i]);
                continue;
            }
            if (stopAtNonOption) {
                tokens.add("--");
                tokens.add(arguments[i]);
                break;
            }
            tokens.add(arguments[i]);
            continue;
        }
        if (currentOption != null) {
            tokens.add(arguments[i]);
            continue;
        }
        if (stopAtNonOption) {
            tokens.add("--");
            tokens.add(arguments[i]);
            break;
        }
        tokens.add(arguments[i]);
        continue;
    }
    for (; i < arguments.length; i++)
        tokens.add(arguments[i]);
    return tokens.toArray(new String[tokens.size()]);
}

[edit] Eliminating Variable currentOption

The new flattened version of the code reveals that variable currentOption is just another name for variable option. To see that, notice that:

  • variable currentOption is assigned at about the middle of the main loop, where it receives the value of option;
  • variable currentOption is not used in the loop prior to this assignment.
  • the other assignment to currentOption in the loop, setting it to null is a dead assignment this assigned value can never be used.
  • variable option is not used in the loop after it was assigned to currentOption

We can therefore eliminate currentOption, using option wherever it was used.

@Override protected String[] flatten(Options options, String[] arguments, boolean stopAtNonOption) {
    init();
    int i;
    for (i = 0; i < arguments.length; i++) {
        if ("--".equals(arguments[i])) {
            tokens.add("--");
            break;
        }
        if ("-".equals(arguments[i])) {
            tokens.add("-");
            continue;
        }
        if (!arguments[i].startsWith("-")) {
            tokens.add(arguments[i]);
            continue;
        }
        Option option = options.getOption(arguments[i]);
        // this is not an Option
        if (option == null) {
            // handle special properties Option
            Option specialOption = options.getOption(arguments[i].substring(0, 2));
            if (specialOption != null) {
                tokens.add(arguments[i].substring(0, 2));
                tokens.add(arguments[i].substring(2));
                continue;
            }
            if (stopAtNonOption) {
                tokens.add(arguments[i]);
                break;
            }
            tokens.add(arguments[i]);
            continue;
        }
        // WARNING: Findbugs reports major problems with the following code.
        // As option cannot be null, currentOption cannot and
        // much of the code below is never going to be run.
        // -------------
        // special option
        Option specialOption = options.getOption(arguments[i].substring(0, 2));
        if (specialOption != null && option == null) {
            tokens.add(arguments[i].substring(0, 2));
            tokens.add(arguments[i].substring(2));
            continue;
        }
        if (option != null && option.hasArg()) {
            if (option.hasArg()) {
                tokens.add(arguments[i]);
                continue;
            }
            if (option.hasArgs()) {
                tokens.add(arguments[i]);
                continue;
            }
            if (stopAtNonOption) {
                tokens.add("--");
                tokens.add(arguments[i]);
                break;
            }
            tokens.add(arguments[i]);
            continue;
        }
        if (option != null) {
            tokens.add(arguments[i]);
            continue;
        }
        if (stopAtNonOption) {
            tokens.add("--");
            tokens.add(arguments[i]);
            break;
        }
        tokens.add(arguments[i]);
        continue;
    }
    for (; i < arguments.length; i++)
        tokens.add(arguments[i]);
    return tokens.toArray(new String[tokens.size()]);
}

[edit] Remving Dead Code

Recall now the comment stating

   WARNING: Findbugs reports major problems with the following code. As option cannot be null, currentOption cannot and much of the code below is never going to be run. 

The flattened control structure highlights the fact that option indeed cannot be null after the statement

if (option == null) {
    ....
    continue;
}

Using this fact, the code turns into:

protected String[] flatten(Options options, String[] arguments, boolean stopAtNonOption) {
    init();
    int i;
    for (i = 0; i < arguments.length; i++) {
        if ("--".equals(arguments[i])) {
            tokens.add("--");
            break;
        }
        if ("-".equals(arguments[i])) {
            tokens.add("-");
            continue;
        }
        if (!arguments[i].startsWith("-")) {
            tokens.add(arguments[i]);
            continue;
        }
        Option option = options.getOption(arguments[i]);
        // this is not an Option
        if (option == null) {
            // handle special properties Option
            Option specialOption = options.getOption(arguments[i].substring(0, 2));
            if (specialOption != null) {
                tokens.add(arguments[i].substring(0, 2));
                tokens.add(arguments[i].substring(2));
                continue;
            }
            if (stopAtNonOption) {
                tokens.add(arguments[i]);
                break;
            }
            tokens.add(arguments[i]);
            continue;
        }
        // special option
        Option specialOption = options.getOption(arguments[i].substring(0, 2));
        if (option.hasArg()) {
            if (option.hasArg()) {
                tokens.add(arguments[i]);
                continue;
            }
            if (option.hasArgs()) {
                tokens.add(arguments[i]);
                continue;
            }
            if (stopAtNonOption) {
                tokens.add("--");
                tokens.add(arguments[i]);
                break;
            }
            tokens.add(arguments[i]);
            continue;
        }
        tokens.add(arguments[i]);
        continue;
    }
    for (; i < arguments.length; i++)
        tokens.add(arguments[i]);
    return tokens.toArray(new String[tokens.size()]);
}

(In the above, the comment with nullity warning is removed) The number of lines was reduced to 59 (after it increased to 77 in the intermediate steps), and the number of tokens was reduced to 390 (after it increased to 498). These changes reflect a 3% decrease in the number of lines and a 19% decrease in the number of tokens with respect to the previous minimum of 61 lines and 484 tokens.

[edit] More Dead Code

We now notice that variable specialOption is never used, and that there are multiple checks for the same condition in

if (option.hasArg()) {
    if (option.hasArg()) {
        tokens.add(arguments[i]);
        continue;
    }
    ....
}

Obviously, the code after if is never going to execute. We can therefore write function flatten as follows:

protected String[] flatten(Options options, String[] arguments, boolean stopAtNonOption) {
    init();
    int i;
    for (i = 0; i < arguments.length; i++) {
        if ("--".equals(arguments[i])) {
            tokens.add("--");
            break;
        }
        if ("-".equals(arguments[i])) {
            tokens.add("-");
            continue;
        }
        if (!arguments[i].startsWith("-")) {
            tokens.add(arguments[i]);
            continue;
        }
        Option option = options.getOption(arguments[i]);
        // this is not an Option
        if (option == null) {
            // handle special properties Option
            Option specialOption = options.getOption(arguments[i].substring(0, 2));
            if (specialOption != null) {
                tokens.add(arguments[i].substring(0, 2));
                tokens.add(arguments[i].substring(2));
                continue;
            }
            if (stopAtNonOption) {
                tokens.add(arguments[i]);
                break;
            }
            tokens.add(arguments[i]);
            continue;
        }
        if (option.hasArg()) {
            tokens.add(arguments[i]);
            continue;
        }
        tokens.add(arguments[i]);
        continue;
    }
    for (; i < arguments.length; i++)
        tokens.add(arguments[i]);
    return tokens.toArray(new String[tokens.size()]);
}

which brings down the number of lines to 44 (a decrease of 25%) and the number of tokens to 301 (a 23% decrease).

[edit] Further Simplifications

We are now ready to make four more simplifications:

First, observe that the body of the first two if statements is the same

if ("-".equals(arguments[i])) {
    tokens.add("-");
    continue;
}
if (!arguments[i].startsWith("-")) {
    tokens.add(arguments[i]);
    continue;
}

A single test can be used for both.

Second, the last if statement

if (option.hasArg()) {
    tokens.add(arguments[i]);
    continue;
}
tokens.add(arguments[i]);
continue;

is redundant, since the same is executed in whether it is true or false, and the continue at the end is also not necessary.

Third, examining the code at

if (stopAtNonOption) {
    tokens.add(arguments[i]);
    break;
}
tokens.add(arguments[i]);
continue;

we see that it can be rewritten as

tokens.add(arguments[i]);
if (stopAtNonOption) 
    break;
continue;

Finally, we note that variable specialOption is used only once, and can be eliminated by inlining

The resulting code is

protected String[] flatten(Options options, String[] arguments, boolean stopAtNonOption) {
    init();
    int i;
    for (i = 0; i < arguments.length; i++) {
        if ("--".equals(arguments[i])) {
            tokens.add("--");
            break;
        }
        if ("-".equals(arguments[i]) || !arguments[i].startsWith("-")) {
            tokens.add(arguments[i]);
            continue;
        }
        Option option = options.getOption(arguments[i]);
        // this is not an Option
        if (option == null) {
            if (options.getOption(arguments[i].substring(0, 2)) != null) {
                tokens.add(arguments[i].substring(0, 2));
                tokens.add(arguments[i].substring(2));
                continue;
            }
            tokens.add(arguments[i]);
            if (stopAtNonOption)
                break;
            continue;
        }
        tokens.add(arguments[i]);
    }
    for (; i < arguments.length; i++)
        tokens.add(arguments[i]);
    return tokens.toArray(new String[tokens.size()]);
}

Here we have 31 lines (30% reduction) and 247 tokens (18% reduction).

[edit] Reducing Horizontal Complexity

Notice now the two nested if statement in

if (option == null) {
    if (options.getOption(arguments[i].substring(0, 2)) != null) {
        tokens.add(arguments[i].substring(0, 2));
        tokens.add(arguments[i].substring(2));
        continue;
    }
    tokens.add(arguments[i]);
    if (stopAtNonOption)
        break;
        ontinue;
}
tokens.add(arguments[i]);

We can unnest these two conditions, and reduce horizontal complexity by inverting the conditional, by writing

if (option != null) {
    tokens.add(arguments[i]);
    continue; 
}
if (options.getOption(arguments[i].substring(0, 2)) != null) {
    tokens.add(arguments[i].substring(0, 2));
    tokens.add(arguments[i].substring(2));
    continue;
}
tokens.add(arguments[i]);
if (stopAtNonOption)
    break;

This transformation also reveals that variable option is only used once and can be inlined.

The flatter version thus obtained has 3 less tokens, and the same number of lines.

protected String[] flatten(Options options, String[] arguments, boolean stopAtNonOption) {
    init();
    int i;
    for (i = 0; i < arguments.length; i++) {
        if ("--".equals(arguments[i])) {
            tokens.add("--");
            break;
        }
        if ("-".equals(arguments[i]) || !arguments[i].startsWith("-")) {
            tokens.add(arguments[i]);
            continue;
        }
        if (options.getOption(arguments[i]) != null) {
            tokens.add(arguments[i]);
            continue;
        }
        if (options.getOption(arguments[i].substring(0, 2)) != null) {
            tokens.add(arguments[i].substring(0, 2));
            tokens.add(arguments[i].substring(2));
            continue;
        }
        // No option found
        tokens.add(arguments[i]);
        if (stopAtNonOption)
            break;
        continue;
    }
    for (; i < arguments.length; i++)
        tokens.add(arguments[i]);
    return tokens.toArray(new String[tokens.size()]);
}

[edit] Inlining

Observe now that the function member init is private and is only used in function flatten. The function is so short that it merits inlining.

Similarly, the field tokens is private and is only used in flatten. In the sake of minimizing variables scope, we should turn it into into a local variable. In doing that, we find that function init is not necessary any more. Also, since tokens is the returned value we name it $.

Applying these changes reduces the total token count of the class from 298 to 280. However, in function flatten the number of tokens increases from 295 to 277 (the number of lines does not change).

Function flatten has 10 more tokens (254 instead of 244), but the same number of lines

protected String[] flatten(Options options, String[] arguments, boolean stopAtNonOption) {
    ArrayList<String> tokens = new ArrayList<String>();
    int i;
    for (i = 0; i < arguments.length; i++) {
        if ("--".equals(arguments[i])) {
            tokens.add("--");
            break;
        }
        if ("-".equals(arguments[i]) || !arguments[i].startsWith("-")) {
            tokens.add(arguments[i]);
            continue;
        }
        if (options.getOption(arguments[i]) != null) {
            tokens.add(arguments[i]);
            continue;
        }
        // No option found
        if (options.getOption(arguments[i].substring(0, 2)) != null) {
            tokens.add(arguments[i].substring(0, 2));
            tokens.add(arguments[i].substring(2));
            continue;
        }
        tokens.add(arguments[i]);
        if (stopAtNonOption)
            break;
        continue;
    }
    for (; i < arguments.length; i++)
        tokens.add(arguments[i]);
    return tokens.toArray(new String[tokens.size()]);
}

[edit] Modular Decomposition

This last version of the code makes it easy to decompose the function into two. The auxiliary function, flattenArgument marked as private handles a single argument.

Function flattenArgument returns true if and only if the rest of the arguments are to be eaten. Therefore, every break; statement translates into return true;, and every continue; statement translates into return false;. In particular

if (stopAtNonOption)
    break;
continue;

translates to

if (stopAtNonOption)
    return true;
return false;

to which the application of boolean condensation achieves

return stopAtNonOption;

The result is

protected String[] flatten(Options options, String[] arguments, boolean stopAtNonOption) {
    ArrayList<String> $ = new ArrayList<String>();
    for (int i = 0; i < arguments.length; i++)
        if (flattenArgument(options, arguments[i], stopAtNonOption, $))
            for (++i; i < arguments.length; i++)
                $.add(arguments[i]);
    return $.toArray(new String[$.size()]);
}
 
private boolean flattenArgument(Options options, String argument, boolean stopAtNonOption, ArrayList<String> tokens) {
    if ("--".equals(argument)) {
        tokens.add("--");
        return true;
    }
    if ("-".equals(argument) || !argument.startsWith("-")) {
        tokens.add(argument);
        return false;
    }
    if (options.getOption(argument) != null) {
        tokens.add(argument);
        return false;
    }
    // this is not an Option
    if (os.getOption(argument.substring(0, 2)) != null) {
        tokens.add(argument.substring(0, 2));
        tokens.add(argument.substring(2));
        return false;
    }
    tokens.add(argument);
    return stopAtNonOption;
}


This version has in total 30 lines of code (26% of the original 116) and 258 tokens (48% of the original 532). Also the Semicolons count was reduced by half, from 36 to 18.

Note also the reduction in the number of variables. The original version of flatten used a total of 6 variables: 1 field (tokens) and 5 local variables (eatTheRest, currentOption, i, option, and specialOption). This last version uses only 2: i and $.

A couple of touch up rewrites remain:

  • Add the @Override annotation to reduce the number of compiler warnings.
  • Reducing the variability of variables by adding final modifiers wherever possible (as it turns out, all variables and all arguments can be declared final here)
  • Use of Generic names wherever possible
@Override protected String[] flatten(final Options os, String[] arguments, boolean stopAtNonOption) {
    final ArrayList<String> $ = new ArrayList<String>();
    for (int i = 0; i < arguments.length; i++)
        if (flattenArgument(os, arguments[i], stopAtNonOption, $))
            for (++i; i < arguments.length; i++)
                $.add(arguments[i]);
    return $.toArray(new String[$.size()]);
}
 
private boolean flattenArgument(final Options os, final String argument, final boolean stopAtNonOption, final ArrayList<String> tokens) {
    if ("--".equals(argument)) {
        tokens.add("--");
        return true;
    }
    if ("-".equals(argument) || !argument.startsWith("-")) {
        tokens.add(argument);
        return false;
    }
    if (os.getOption(argument) != null) {
        tokens.add(argument);
        return false;
    }
    // this is not an Option
    if (os.getOption(argument.substring(0, 2)) != null) {
        tokens.add(argument.substring(0, 2));
        tokens.add(argument.substring(2));
        return false;
    }
    tokens.add(argument);
    return stopAtNonOption;
}
Personal tools