I spent some time this week playing with GitHub Copilot. GitHub Copilot is “your AI pair programmer”: it provides autocomplete suggestions for your code as you’re writing it. It’s built on OpenAI Codex, a descendant of the very impressive GPT-3 language model.
It has improved my productivity significantly. I used it while writing a client library for the ListenBrainz API. The ease with which it understands what you want to do and does it for you is impressive. Writing boilerplate code is a thing of the past now.
Here’s an example: the ListenBrainz API returns playlist objects in JSPF format as JSON. I wanted to parse the JSON and create wrapper classes for playlists and tracks. I put the JSON in a comment, wrote “class Track” and “class Playlist” in VSCode, and Copilot did the rest. It understood the structure of the JSON from the comment and created the wrapper classes out of thin air. There were a few things I would have done differently, but it was perfectly reasonable code. I won’t include the code in this post for brevity, but here’s a gist if you want to see what it came up with.
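To give a rough idea of the shape of that code, though, here’s a minimal hand-written sketch. It is not Copilot’s actual output (that’s in the gist), and the field names are my assumptions based on the JSPF format, so the real classes differ in the details.

class Track:
    def __init__(self, data):
        # a JSPF track object carries the title, creator (artist), identifier, etc.
        self.title = data.get("title")
        self.creator = data.get("creator")
        self.identifier = data.get("identifier")


class Playlist:
    def __init__(self, data):
        # JSPF wraps everything in a top-level "playlist" object
        playlist = data.get("playlist", {})
        self.title = playlist.get("title")
        self.creator = playlist.get("creator")
        self.tracks = [Track(track) for track in playlist.get("track", [])]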
I wanted to see if it would be able to write more complicated code. I gave it some LeetCode problems, starting with the classic “Invert a Binary tree”. It suggested a correct solution, also noticeably more succinct than what I would have written.
def invertTree(self, root):
    """
    :type root: TreeNode
    :rtype: TreeNode
    """
    if root:
        root.left, root.right = self.invertTree(root.right), self.invertTree(root.left)
    return root
It was also able to solve more complicated problems on topics like linked lists (problem, solution) and dynamic programming (problem, solution).
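To give a flavor of what those solutions look like, here’s a stand-in I wrote myself for a classic dynamic-programming problem (climbing stairs), in the same LeetCode method style as above; it is not Copilot’s actual output, which is in the solution links.

def climbStairs(self, n):
    """
    :type n: int
    :rtype: int
    """
    # number of distinct ways to climb n steps taking 1 or 2 at a time,
    # computed bottom-up with two rolling values instead of a full table
    a, b = 1, 1
    for _ in range(n - 1):
        a, b = b, a + b
    return b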
This makes me think about the point of LeetCode interviews. They were already non-ideal because of the low probability that most engineers would ever solve these problems in real life, and GitHub Copilot makes them even more pointless. Any engineer with the right tooling can now solve these problems. What’s the point of asking people to do stuff that they can just get an AI to do?
The caveat is that you shouldn’t trust code that someone else (especially an AI) wrote and you don’t understand. But it shouldn’t be hard to understand the solution once it’s written in front of you.
There is also the argument that LeetCode interviews aren’t about engineering or algorithmic knowledge, they’re more about how much time the candidate is willing to dedicate to the company before getting selected. The interviews will favor the people who want it the most. If that’s the case, then LeetCode interviews are evergreen. But any organization trying to optimize for engineering skill should get rid of them.
(Thanks to Matt Stuchlik, Rohit Dandamudi and Mukesh Kharita for reviewing drafts of this post.)
I tried GitHub Copilot this weekend and then came to see if you'd written about it!
That's so true about interviews; I hadn't really thought about it, since in a LeetCode interview you're fairly unsupervised.
Copilot "makes the easy things quick" in terms of the "got something wrong, google it, check Stack Overflow, read the answers, try it out, debug it" loop that software work ends up in a lot of the time. It short-circuits the simplest ones to the point where I don't even need to leave the editor, and the examples it produces are generally internally consistent (e.g. it hasn't thrown in jQuery or lodash in places where I'm not using those yet, while Stack Overflow is often "use this library!"). Then when I get to the harder problems where I might need to do some deeper thinking or research, I still have the time and energy.
That's what happened to me yesterday. I was refactoring my answer to Cryptopals 2.12 & 2.13, and it let me breeze through writing unit tests, some array equality checking, and decoding URI params, so when I hit questions like "is it a good idea to `extend Array`?" and "wtf should 0x80 look like when you render it as a character?", I wasn't already sick of googling around.
What's interesting is that what I really liked about Stripe's interviews when I interviewed (and CoderPad interviews in general), contrasted with whiteboard interviews, is that you actually had to write and execute real code. Having been on both sides of those interviews and whiteboard interviews, it's been surprisingly informative to see how someone works through a problem when they need the code to execute. Loads of small things will come up: a missing semicolon, a misspelled variable name, String API misunderstandings. In interviewee mode, I was always embarrassed, like "oh jeez, they'll never hire someone who makes a mistake like this", but in interviewer mode it was actually really useful when something small came up, because there's a big split between calmly debugging and fixing it vs. it completely derailing the thought process.
If Copilot becomes ubiquitous, it's true that there will be less information to gain from watching someone do that, but it hopefully will also mean that that part of the job is less critical to evaluate.
As in, pre-Copilot, trip-ups over simple things happen frequently. I may need to remind myself about something small like a String API or how to make a POST request. If I'm not good at quickly unblocking myself, then it's reasonable to assume that I'm going to repeatedly trip and have difficulty making independent progress on a complex problem.
In a ubiquitous-Copilot world, being able to quickly unblock yourself on small things becomes both harder to measure and less important, since Copilot will do a lot of the simple things for you. So it would seem interview questions will need to evolve toward testing more complex problem solving (and probably also provide Copilot to interviewees so they can practice with it).
That became a real coffee-fueled wall of text 😅 How has your thinking evolved since you wrote this? Have you kept on using it beyond the free trial?