Not My Problem. Nevermind, My Bad.

I have lately been working with GraalVM to use Polyglot for embedding Python code in Java. In doing so I sometimes require a translation of Java objects to Python objects and vice versa, while passing them between the two languages. My research advisor recently pointed out that my translation will fall apart if he passed a self-referential object - like a list that contains itself. This is because I translate objects like lists by first creating a fresh list in Python and then recursively translating the elements of the list and appending them to the Python list. If encountered with a self-referential list, this process would lead to a RecursionError. However this is a classic problem and has a simple fix - we keep a track of objects that have already been translated and reuse translations if the object is encountered again.

For the specific case of translating lists, the original program would look something like this (note that this is not actual code).

def translate(java_object):
    # ...handle other kind of objects

    # handle lists
    if java_object.is_java_list():
        new_list = []
        for element in java_object:
            new_list.append(translate(element))
        return new_list

The fix would modify the code in the following way.

def translate(java_object, translated_objects=None):
    if translated_objects is None:
        translated_objects = {}
        
    if id(java_object) in translated_objects:
        return translated_objects[id(java_object)]

    # ...handle other kind of objects

    # handle lists
    if java_object.is_java_list():
        new_list = []
        translated_objects[id(java_object)] = new_list
        for element in java_object:
            new_list.append(translate(element, translated_objects))
        return new_list

Simple, right? Note that java_object is of foreign type when passed to Python. Since I was in a “debug mode” my actual function printed java_object before proceeding to the rest of the function body.

def translate(java_object, translated_objects=None):
    print(f"translating {java_object} with id {id(java_object)}")

    # ...rest of the function

Everything was good, until I passed a recursive list to the function - and all I saw was a RecursionError. Polyglot doesn’t provide a very detailed stacktrace for errors arising from Python, and all I saw was that the error came from translate. But print was not executed, like it usually would while working with other objects. I confirmed that the statement before invoking translate was being executed, and inferred that the problem must be in the internals of Polyglot - maybe Polyglot did not know how to pass recursive structures to Python (possibly because it could not “wrap” it up in the foreign type). GraalVM has limited documentation, and a PhD student who shares the office with me suggested that I check GraalPython’s source code to find some implementation details. The code was understandably (small team of researchers, new project, etc.) not very well documented, and I could not find anything useful. I told my advisor that this is not my problem and maybe we should raise a bug report with the GraalVM team (which I already did informally).

Satisfied, I went back to my next task. But while scrolling through the translate function, I realized to my horror that my code actually looks like this.

def translate(java_object, translated_objects=None):
    if translated_objects is None:
        translated_objects = {}
        
    if id(java_object) in translated_objects:
        return translated_objects[id(java_object)]

    # ...handle other kind of objects

    # handle lists
    if java_object.is_java_list():
        new_list = []
        for element in java_object:
            new_list.append(translate(element, translated_objects))

        # NOTICE THIS LINE ---------------------------v
        translated_objects[id(java_object)] = new_list
        # ---------------------------------------------
        
        return new_list

Yes, this would lead to a RecursionError because my inner objects would never see the translation of the outer list. I knew that even if this was the error, my print statement should have executed. But I nonetheless fixed the code and ran it again with mixed feelings of satisfaction and embarrassment. However the RecursionError persisted - and with the same observation. I was partly relieved that nobody would know about my bug, and that the actual error is still in the internals of Polyglot.

I called it a day and before making a commit, removed the print that I put when I started working in the morning. I ran all tests again (because why not?) and to my absolute surprise, everything compiled and passed! It turned out that the RecursonError was due to the print call itself! Java can print recursive lists by replacing self-references with a placeholder (this Collection), while Python does the same with [...]. However the __format__ (and even __str__ and __repr__ – although I don’t think all three of these would even have been implemented) method on foreign does not properly handle recursive structures and that is where the RecursionError was coming from (evaluation of the f-string invokes the __format__ method).

So I did find a bug in Polyglot, but not one that had anything to do (I am not going to print foreign Java lists in Python) with what I was doing. The bug in what I was doing was really mine, but I never really hit it (and nobody else would have known about it) - so was it really a bug? Let’s call it a minor typo. And of course I have fixed my bug report to the GraalVM team.