August 17, 2021

OpenAI’s Codex Turns Natural Language to Computer Code

On August 10, OpenAI announced an improved version of Codex, a machine learning tool that translates natural language into code. It is available to private beta testers through an API. Codex understands more than a dozen programming languages and can interpret commands in plain English and execute them, allowing users to build a natural language interface for existing applications.

As you may remember, an earlier version of Codex powers GitHub Copilot, an AI pair programmer that lives inside Microsoft’s Visual Studio Code (VS Code) editor and autocompletes code snippets. Copilot was announced only a couple of months ago, but Codex has already evolved from completing code to creating it.

Codex is a descendant of OpenAI’s text-generating model GPT-3, which was released last summer. GPT-3 was trained on a huge quantity of language data taken from the internet, which lets it read and then complete text prompts submitted by a human user.

Codex has much of the natural language understanding of GPT-3 but with the addition of being trained on billions of lines of publicly available computer code. Although Codex is initially being released as a free API, OpenAI will start charging for access at some point in the future.

How does OpenAI Codex work?

The software can be used to build simple websites and basic games. Users type English commands into the software, such as “add this image to the webpage in the upper right corner. Make the image smaller and circular,” and Codex translates them into code.
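
As a rough illustration of that workflow, an application might write each English command as a comment and append it to the code generated so far, then hand the combined text to the model to continue. The sketch below only shows how such a prompt could be assembled. It is an assumption for illustration: `build_prompt` and its parameters are hypothetical names, not part of the Codex API, and no request is sent.

```python
def build_prompt(existing_code: str, instruction: str, comment_prefix: str = "//") -> str:
    """Assemble a text prompt from prior code plus a new English instruction.

    Hypothetical helper (not an OpenAI function): the instruction is written
    as comment lines so a code model could continue with matching code.
    """
    # Turn each non-empty line of the instruction into a comment line.
    comment_lines = [
        f"{comment_prefix} {line.strip()}"
        for line in instruction.splitlines()
        if line.strip()
    ]
    parts = []
    if existing_code.strip():
        parts.append(existing_code.rstrip("\n"))
    parts.extend(comment_lines)
    # End with a newline so a completion would start on a fresh line.
    return "\n".join(parts) + "\n"


prompt = build_prompt(
    existing_code="const page = document.body;",
    instruction=(
        "Add this image to the webpage in the upper right corner.\n"
        "Make the image smaller and circular."
    ),
)
print(prompt)
```

The printed prompt is the existing line of code followed by the two instructions as `//` comments. What the model returns from there is outside the scope of this sketch.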

Codex creates this code by drawing on its training on public code from GitHub and other sources, along with its own memory (several kilobytes’ worth) of coding context. It can also fill in details the user leaves undefined. How would the tool know what a helicopter is? Even if the user didn’t define it, Codex can guess at what the object is from other uses and context.
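
Because the model works with only a limited amount of coding context at a time, an application has to choose which earlier code to send along with each new command. One simple approach is to keep the most recent lines that fit within a byte budget. This is a sketch under that assumption, not a description of how OpenAI handles context; `trim_context` and the budget value used here are hypothetical.

```python
def trim_context(code: str, max_bytes: int) -> str:
    """Keep the most recent whole lines of `code` that fit in `max_bytes` (UTF-8)."""
    kept = []
    used = 0
    # Walk backwards from the newest line and stop once the budget would be exceeded.
    for line in reversed(code.splitlines(keepends=True)):
        size = len(line.encode("utf-8"))
        if used + size > max_bytes:
            break
        kept.append(line)
        used += size
    # Restore the original top-to-bottom order.
    return "".join(reversed(kept))


history = "a = 1\nb = 2\nc = a + b\n"
trimmed = trim_context(history, max_bytes=16)
print(trimmed)
```

With a 16-byte budget, the oldest line (`a = 1`) is dropped and the two newest lines are kept. Dropping old lines this way is crude, since the model then loses the definition of `a`, which is one reason prompts need some thought.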

The API requires thought and some trial and error to use. It won’t turn non-coders into expert programmers, although it’s certainly a step in the right direction. It takes some patience to operate, but for those who are already programmers, it removes the drudge work, making coding faster and more accessible.

Problems within the tool and community

Of course, just like with GitHub Copilot, the tool isn’t perfect. A recent paper published by OpenAI discusses the possibility that Codex might have significant limitations, including biases and sample inefficiencies.

Like other large language models, Codex generates responses that stay as close as possible to its training data, so its output can look good on inspection yet turn out to be undesirable in practice. The researchers at OpenAI state:

When we probed using the prompt def race(x):, we found that many of the most commonly-generated completions assumed a small number of mutually exclusive race categories. Most synthesized completions included “White” and many included only a few other categories, followed by “other.” Several synthesized generations included only 3 categories: “white,” “black,” or “none.”

OpenAI asserted in the paper that risks can be mitigated with “careful documentation and user interface design, code review requirements, and/or content controls.” It’s your usual case of don’t take your hands off the robot steering wheel just yet.

The other issue stems from coders raising concerns that OpenAI is profiting unfairly from others’ publicly available work. For example, GitHub Copilot’s knowledge base consists mainly of code written by others, so it ultimately completes code by drawing on open-source work that was originally created to benefit individuals. The same can be said of Codex, but OpenAI says its use of this data is legally protected under fair use.

Is no-code the future?

While it’s unlikely that no-code will ever completely replace coding and traditional app development, tools such as Codex lower the barrier to entry by making the process easier and more accessible.

Developers will always be in demand, but anyone with a good idea can use a low-to-no-code tool to execute it on their own. More ideas can flow, leading to more innovation. 

All images are via OpenAI