Statistics

Problem Statement for "Indentation"

Problem Statement

Some programming languages are beginning to insist on proper indentation to indicate block structure. Examples include Python and Haskell. Such languages can dispense with traditional block delimiters such as curly braces or explicit begin/end keywords. In this problem, we will consider a hypothetical language in this family.

Parsing occurs in two phases. In the first phase, we remove comments and convert physical lines of text (ie, the lines as written by the programmer) into logical lines of text (ie, the lines that reflect the intended program structure). In the absence of comments, physical lines and logical lines are identical. Ordinary comments begin with a single '#' character and extend to the end of the line. The '#' character and all the characters that follow are deleted from the physical line, and the remaining characters become a logical line. Special comments, called line-joining comments, begin with two or more consecutive '#' characters. Line-joining comments indicate that multiple physical lines are to be combined into a single logical line. When a line-joining comment is encountered, the '#' characters and all the characters that follow are deleted from the physical line, just like an ordinary comment. However, the remaining characters do not form a logical line by themselves. Instead, they are prepended to the beginning of the next physical line, which is then processed as usual. A line-joining comment may not appear in the last physical line of the program. For example, the five physical lines

    "a commented line # an ordinary comment"
    " beginning## a line-joining comment"
    " middle #### another line-joining comment"
    "end # this is not ## a line-joining comment"
    "  an uncommented line"
would be converted into the three logical lines
    "a commented line "
    " beginning middle end "
    "  an uncommented line"
Notice in the above example that line-joining comments may begin with more than two '#' characters (see the third physical line) and that, once a comment has begun, further '#' characters are ignored (see the fourth physical line).

In the second phase of parsing, we recover the block structure of the program by analyzing the indentation of the logical lines. The indentation of a logical line is measured as the number of spaces that precede the first non-space character. A logical line that contains no non-space characters is called a blank line and is ignored when determining block structure. Indentation provides the clues we need to decide where each block opens and closes.

  • The first non-blank logical line opens a new block.
  • If a non-blank logical line is indented more than the previous non-blank logical line, it opens a new block.
  • If a non-blank logical line is indented the same as the previous non-blank logical line, it continues the same block as that line.
  • If a non-blank logical line is indented less than the previous non-blank logical line, it closes any open blocks that were indented more than the current line, and continues whichever open block was indented the same as the current line. If no such block exists, it is a syntax error (see Example 2).
  • After the last logical line, all outstanding blocks are closed.
For example, in the program

  "function fact n is"
  "  if n equals 0 then"
  "    return 0"
  "  else"
  "    return n times fact n minus 1"
there are four blocks:
  • The first begins on line 1 and ends just after line 5.
  • The second begins on line 2 and ends just after line 5.
  • The third begins on line 3 and ends just before line 4.
  • The fourth begins on line 5 and ends just after line 5.

Create a class Indentation containing a method countBlocks that takes a program and returns the number of distinct blocks occurring in that program, or -1 if the program contains a syntax error. A program will be represented as a String[], where each element represents a single physical line of text. Allowable characters include letters, digits, spaces, and '#' characters.

Definition

Class:
Indentation
Method:
countBlocks
Parameters:
String[]
Returns:
int
Method signature:
int countBlocks(String[] lines)
(be sure your method is public)

Constraints

  • lines contains between 1 and 50 elements, inclusive.
  • Each element of lines contains between 0 and 50 characters, inclusive.
  • Allowable characters are letters ('a'-'z' and 'A'-'Z'), digits ('0'-'9'), spaces (' '), and '#' characters.
  • The last line does not end in a line-joining comment.

Examples

  1. { "a", "b" }

    Returns: 1

  2. { "a", " b", " c", "d", " e"}

    Returns: 4

  3. { "function fact n is", " if n equals 0 then", " return 0", " else", " return n times fact n minus 1" }

    Returns: 4

    The example above.

  4. { "function gcd of a and b is", " # greatest common divisor", " if b is 0 then ## base case", " return a", " else return gcd b and a mod b" }

    Returns: 2

    Line 2 is blank, and lines 3 and 4 are joined.

  5. { " while neither current nor parent is null", " grandparent is parent of parent", " set parent of current to grandparent", " set current to grandparent" }

    Returns: -1

    A syntax error. The last line is indented improperly.

  6. { " # comments cannot be nested ##", "a", "b" }

    Returns: 1

  7. { " test multiline comments", " ## space 1", " ## space 2", "second line in block" }

    Returns: 1

  8. { "# this program", "## is completely", "# empty" }

    Returns: 0

  9. { " first line", "", " second line" }

    Returns: 1

  10. { "when you outdent", " be sure to", " outdent far enough" }

    Returns: -1

  11. { " make sure", " to check for", "an empty stack" }

    Returns: -1

  12. { "a b c d e f g h i j k l m n o p q r s t u v w x y ", " a b c d e f g h i j k l m n o p q r s t u v w x y", " b c d e f g h i j k l m n o p q r s t u v w x y ", " b c d e f g h i j k l m n o p q r s t u v w x y", " c d e f g h i j k l m n o p q r s t u v w x y ", " c d e f g h i j k l m n o p q r s t u v w x y", " d e f g h i j k l m n o p q r s t u v w x y ", " d e f g h i j k l m n o p q r s t u v w x y", " e f g h i j k l m n o p q r s t u v w x y ", " e f g h i j k l m n o p q r s t u v w x y", " f g h i j k l m n o p q r s t u v w x y ", " f g h i j k l m n o p q r s t u v w x y", " g h i j k l m n o p q r s t u v w x y ", " g h i j k l m n o p q r s t u v w x y", " h i j k l m n o p q r s t u v w x y ", " h i j k l m n o p q r s t u v w x y", " i j k l m n o p q r s t u v w x y ", " i j k l m n o p q r s t u v w x y", " j k l m n o p q r s t u v w x y ", " j k l m n o p q r s t u v w x y", " k l m n o p q r s t u v w x y ", " k l m n o p q r s t u v w x y", " l m n o p q r s t u v w x y ", " l m n o p q r s t u v w x y", " m n o p q r s t u v w x y ", " m n o p q r s t u v w x y", " n o p q r s t u v w x y ", " n o p q r s t u v w x y", " o p q r s t u v w x y ", " o p q r s t u v w x y", " p q r s t u v w x y ", " p q r s t u v w x y", " q r s t u v w x y ", " q r s t u v w x y", " r s t u v w x y ", " r s t u v w x y", " s t u v w x y ", " s t u v w x y", " t u v w x y ", " t u v w x y", " u v w x y ", " u v w x y", " v w x y ", " v w x y", " w x y ", " w x y", " x y ", " x y", " y ", " y" }

    Returns: 50

    Maximum number of blocks.

  13. { "abcdefghijklmnopqrstuvwxabcdefghijklmnopqrstuvwx##", "abcdefghijklmnopqrstuvwxabcdefghijklmnopqrstuvwx##", "abcdefghijklmnopqrstuvwxabcdefghijklmnopqrstuvwx##", "abcdefghijklmnopqrstuvwxabcdefghijklmnopqrstuvwx##", "abcdefghijklmnopqrstuvwxabcdefghijklmnopqrstuvwx##", "abcdefghijklmnopqrstuvwxabcdefghijklmnopqrstuvwx##", "abcdefghijklmnopqrstuvwxabcdefghijklmnopqrstuvwx##", "abcdefghijklmnopqrstuvwxabcdefghijklmnopqrstuvwx##", "abcdefghijklmnopqrstuvwxabcdefghijklmnopqrstuvwx##", "abcdefghijklmnopqrstuvwxabcdefghijklmnopqrstuvwx##", "abcdefghijklmnopqrstuvwxabcdefghijklmnopqrstuvwx##", "abcdefghijklmnopqrstuvwxabcdefghijklmnopqrstuvwx##", "abcdefghijklmnopqrstuvwxabcdefghijklmnopqrstuvwx##", "abcdefghijklmnopqrstuvwxabcdefghijklmnopqrstuvwx##", "abcdefghijklmnopqrstuvwxabcdefghijklmnopqrstuvwx##", "abcdefghijklmnopqrstuvwxabcdefghijklmnopqrstuvwx##", "abcdefghijklmnopqrstuvwxabcdefghijklmnopqrstuvwx##", "abcdefghijklmnopqrstuvwxabcdefghijklmnopqrstuvwx##", "abcdefghijklmnopqrstuvwxabcdefghijklmnopqrstuvwx##", "abcdefghijklmnopqrstuvwxabcdefghijklmnopqrstuvwx##", "abcdefghijklmnopqrstuvwxabcdefghijklmnopqrstuvwx##", "abcdefghijklmnopqrstuvwxabcdefghijklmnopqrstuvwx##", "abcdefghijklmnopqrstuvwxabcdefghijklmnopqrstuvwx##", "abcdefghijklmnopqrstuvwxabcdefghijklmnopqrstuvwx##", "abcdefghijklmnopqrstuvwxabcdefghijklmnopqrstuvwx##", "abcdefghijklmnopqrstuvwxabcdefghijklmnopqrstuvwx##", "abcdefghijklmnopqrstuvwxabcdefghijklmnopqrstuvwx##", "abcdefghijklmnopqrstuvwxabcdefghijklmnopqrstuvwx##", "abcdefghijklmnopqrstuvwxabcdefghijklmnopqrstuvwx##", "abcdefghijklmnopqrstuvwxabcdefghijklmnopqrstuvwx##", "abcdefghijklmnopqrstuvwxabcdefghijklmnopqrstuvwx##", "abcdefghijklmnopqrstuvwxabcdefghijklmnopqrstuvwx##", "abcdefghijklmnopqrstuvwxabcdefghijklmnopqrstuvwx##", "abcdefghijklmnopqrstuvwxabcdefghijklmnopqrstuvwx##", "abcdefghijklmnopqrstuvwxabcdefghijklmnopqrstuvwx##", "abcdefghijklmnopqrstuvwxabcdefghijklmnopqrstuvwx##", "abcdefghijklmnopqrstuvwxabcdefghijklmnopqrstuvwx##", "abcdefghijklmnopqrstuvwxabcdefghijklmnopqrstuvwx##", "abcdefghijklmnopqrstuvwxabcdefghijklmnopqrstuvwx##", "abcdefghijklmnopqrstuvwxabcdefghijklmnopqrstuvwx##", "abcdefghijklmnopqrstuvwxabcdefghijklmnopqrstuvwx##", "abcdefghijklmnopqrstuvwxabcdefghijklmnopqrstuvwx##", "abcdefghijklmnopqrstuvwxabcdefghijklmnopqrstuvwx##", "abcdefghijklmnopqrstuvwxabcdefghijklmnopqrstuvwx##", "abcdefghijklmnopqrstuvwxabcdefghijklmnopqrstuvwx##", "abcdefghijklmnopqrstuvwxabcdefghijklmnopqrstuvwx##", "abcdefghijklmnopqrstuvwxabcdefghijklmnopqrstuvwx##", "abcdefghijklmnopqrstuvwxabcdefghijklmnopqrstuvwx##", "abcdefghijklmnopqrstuvwxabcdefghijklmnopqrstuvwx##", "abcdefghijklmnopqrstuvwxabcdefghijklmnopqrstuvwxyz" }

    Returns: 1

    50 lines all joined.

  14. { " ##", " ##", " ##", " ##", " ##", " ##", " ##", " ##", " ##", " ##", " ##", " ##", " ##", " ##", " ##", " ##", " ##", " ##", " ##", " ##", " ##", " ##", " ##", " ##", " ##", " ##", " ##", " ##", " ##", " ##", " ##", " ##", " ##", " ##", " ##", " ##", " ##", " ##", " ##", " ##", " ##", " ##", " ##", " ##", " ##", " ##", " ##", " ##", " ##", " A" }

    Returns: 1

    Maximum possible indentation.

  15. { "a", " b", "c ", " d", "e" }

    Returns: 3

  16. { " a ## ", "", " b #", " b c", " b d", " e", " a ## b", " a", " a", " f", " a", " f" }

    Returns: 5

  17. { "a ######## comment with lots of marks", " b", " c ### a single comment better not # oining off", " d" }

    Returns: 2

  18. { "two # single # comments", " do not make a linejoining comment" }

    Returns: 2

  19. { "another", " ## ", " ", " empty line test" }

    Returns: 2

  20. { " ## ", " a", " b", "## ", " ## ", "##", " ##", " x" }

    Returns: 1

  21. { "any extra spaces inside the line", " do not make a difference" }

    Returns: 2

  22. { "a b c", " b c", "a b c" }

    Returns: 2

  23. { " while neither current nor parent is null", " grandparent is parent of parent", " set parent of current to grandparent", " set current to grandparent" }

    Returns: -1

  24. { "this", " is a # test exploting the ## order", " of precedence # for ## quite a while", " as you shall see", "test over" }

    Returns: 4

  25. { "a # ##a", " b" }

    Returns: 2

  26. { "function gcd of a and b is", " # greatest common divisor", " if b is 0 then ## base case", " return a", " else return gcd b and a mod b" }

    Returns: 2

  27. { "a", " a", " a", "a", " a" }

    Returns: 4

  28. { "a # a", " b## bbb", " c #### cccc", "end # this is your ## error", " ddd" }

    Returns: 3

  29. { "a", " b", "c", " d", " e" }

    Returns: -1

  30. { "#" }

    Returns: 0

  31. { "abc#" }

    Returns: 1

  32. { " #no code here" }

    Returns: 0

  33. { " fdgdfgdfgdfg", " dfgdfgdfgdfg", " dfgdfgdfgdfgdfg", " dhgdfh" }

    Returns: -1

  34. { " a", " b", " c", " d" }

    Returns: 3

  35. { "##abcdefgh", "##abbcdfhjls", "abc#####uihhh##", " ##fghdijk", " abcd#jdkls", " #djkss", " s", " s", " r", " h", " f", " #gggggggg", " ##jkljklj", " hjkjhklj#jkljlkkj##lkjljlj#kj", " #jklj", " j", " k", " s", " d", " d", " d", " d", " f", " g", " d", " g", " d", " d" }

    Returns: -1

  36. { " this is a test", " of nothing", " ##", " ##", " ##", "hello dolly", " ##", " # ##", " ##", " another test" }

    Returns: 2

  37. { "a", " # ##abc", " a" }

    Returns: 2

  38. { "this is # another ## challenge", " exploiting # the use ## of", " precedence # that does not ## seem to be", " implemented correctly" }

    Returns: 4

  39. { " blah blah blah blah # blah blah", " blah blah blah", " bling bloh blong", " dobby dubby" }

    Returns: -1

  40. { " kdjfakl", " 4kljlk##", "jklafd", " kjkldf", " 5klnlanfl", " dfadfa" }

    Returns: 3

  41. { "a", " b# #", " f" }

    Returns: 3

  42. { " buy", " here", " pay", "here" }

    Returns: -1

  43. { "hi#" }

    Returns: 1

  44. { " test", "should fail" }

    Returns: -1

  45. { " a", "b" }

    Returns: -1

  46. { " #", " #" }

    Returns: 0

  47. { "a", " b", " c", " b", " d", " b", "a" }

    Returns: 4


This problem statement is the exclusive and proprietary property of TopCoder, Inc. Any unauthorized use or reproduction of this information without the prior written consent of TopCoder, Inc. is strictly prohibited. (c)2024, TopCoder, Inc. All rights reserved.
This problem was used for: