AI creates software with dangerous shortcomings

Developers are using AI more and more to do their programming work. Doubts about the quality of the code AI magically presents on your screen are rising. Those doubts are justified, because there's no such thing as wizardry and fairy tales. AI is more of a code-breathing dragon. An increasing number of research papers point to all sorts of flaws in the code that AI 'creates'.

The consequences

It seems wonderful, performing tasks in seconds that used to take days or even weeks. I don't want to claim that AI brings only misery, but forewarned is forearmed. I've been experimenting with AI and it's basically useful. But ... I see a lot of things go wrong. In the meantime a lot of research has been done to uncover all the flaws in the generated code.

The shortcomings in AI-generated code fall mostly into four categories:

Efficiency and performance
Security
Maintainability
Functional correctness

And those just happen to be the four most important qualities of software. Ouch!

Inefficient software is expensive

Generating code with AI appears to be cheap. And let's be honest, it is. But everyone knows you can't have a champagne lifestyle on beer money. If that code then requires far more resources to run, your profit evaporates quickly.

Some AI-generated code is many times less efficient than the code of a good programmer. AI-generated software can need 2, 3 or sometimes even 100 times as much hardware to run. Especially in the cloud, that burns through your budget quickly.

AI knows nothing about scalability: the number of users, database size and database growth. Or concurrency. Or network latency. These all factor into the efficiency of your final production environment and its cost.
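To make the cost of inefficiency concrete, here is a minimal sketch in Python. It is an illustration I made up for this post, not output from any AI tool and not a benchmark: both functions (the names are hypothetical) find the duplicates in a list, but the first compares every item with every other item, while the second remembers what it has seen in a set.

```python
def find_duplicates_quadratic(items):
    """Compare every item with every later item: about n*n/2 comparisons."""
    duplicates = []
    comparisons = 0
    for i in range(len(items)):
        for j in range(i + 1, len(items)):
            comparisons += 1
            if items[i] == items[j] and items[i] not in duplicates:
                duplicates.append(items[i])
    return sorted(duplicates), comparisons


def find_duplicates_linear(items):
    """Remember what was seen in a set: one lookup per item."""
    seen = set()
    duplicates = set()
    lookups = 0
    for item in items:
        lookups += 1
        if item in seen:
            duplicates.add(item)
        else:
            seen.add(item)
    return sorted(duplicates), lookups


if __name__ == "__main__":
    # Illustrative data: 2000 unique numbers plus two repeats.
    data = list(range(2000)) + [5, 42]
    slow_result, slow_work = find_duplicates_quadratic(data)
    fast_result, fast_work = find_duplicates_linear(data)
    print("same answer:", slow_result == fast_result)
    print(slow_work, "comparisons versus", fast_work, "lookups")
```

Both versions give the same answer, so a functional test will not tell them apart. The difference only shows up in the amount of work, and that gap widens as the data grows.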

Security

The next problem is ... unsafe code. Did the AI, or better yet your programmer, stop to think about that? Possibly not. And we're talking major risks here! Insecure code allows hackers with bad intentions to penetrate your systems, resulting in leaked data, data loss, ransomware and a damaged reputation. They're all expensive, and they sometimes make you go bankrupt. You want to avoid that at all costs.
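A classic example of unsafe code is SQL injection. The sketch below is my own illustration, assuming a tiny in-memory SQLite database with an invented users table; the function names are hypothetical. The first lookup pastes user input straight into the query text, the second passes it as a parameter.

```python
import sqlite3


def make_demo_db():
    """Build a throwaway in-memory database with made-up example rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT, secret TEXT)")
    conn.executemany(
        "INSERT INTO users VALUES (?, ?)",
        [("alice", "alice-secret"), ("bob", "bob-secret")],
    )
    return conn


def lookup_unsafe(conn, name):
    # Vulnerable: the input becomes part of the SQL statement itself.
    query = "SELECT secret FROM users WHERE name = '" + name + "'"
    return [row[0] for row in conn.execute(query)]


def lookup_safe(conn, name):
    # Parameterized: the input is treated as data, never as SQL.
    query = "SELECT secret FROM users WHERE name = ?"
    return [row[0] for row in conn.execute(query, (name,))]


if __name__ == "__main__":
    conn = make_demo_db()
    malicious = "nobody' OR '1'='1"
    print("unsafe:", lookup_unsafe(conn, malicious))  # leaks every secret
    print("safe:  ", lookup_safe(conn, malicious))    # finds nothing
```

With a normal name both functions behave identically, which is exactly why this kind of flaw slips through when nobody reviews the generated code with security in mind.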

Maintainability

Standardization is a sensitive subject at best ... and code standardization is almost nonexistent in AI-generated code. You want basic code standards for things like the naming of variables and functions, the placement of parentheses, and how and where you indent. Because AI has seen many examples that use different standards, the generated code will use a different standard in different pieces of the code. That reduces readability and thus maintainability.

Developers who stop writing code lose their proficiency. Writing code is not like riding a bicycle, where your subconscious does the riding. Writing software is an intellectual activity that requires skills you lose if you don't train them. Maintaining code is a different ballgame from generating code: for maintenance you need to understand what the code does. And why it's failing. And where the leak is. AI does not have those analytical skills, and developers (partly) lose theirs if they write little code themselves.

Functional correctness

AI generates code based on examples. Sure, a great many examples. But AI doesn't understand the problem you're trying to solve with a piece of software. AI associates your question with code from places where a similar question was asked and assumes that the code will solve the problem. Whether it actually does is anybody's guess.

For simple assignments it will mostly be correct. But if different parts of the code are related yet generated separately, who will check that they fit together and that there are no side effects? AI has been insufficiently trained for complex code or code for very specific purposes.

AI doesn't know its own limitations

I've never seen an AI answer with 'I don't know'. AI ALWAYS gives you an answer. How good or bad that answer is, is left up to you. When I interview a job candidate I consider it a sign of strength if they honestly tell me they don't know something. Know your limitations and ask for help in time. But AI would rather give you a wrong answer than no answer at all. And that shows in the quality of AI-generated code.

AI is a regular chatterbox

The current generation of AI is based on Large Language Models (LLMs). An LLM is fed an enormous amount of language fragments and statistically computes which fragment goes with which other fragment. Of course it's more complicated than that, but that's the basis. If someone asks you where you live, your subconscious automatically comes up with the name of your home town; you don't have to think about it. That's the way AI comes up with its answers. That makes it appear to be a smooth talker. The term Artificial Intelligence is actually wrong: it has no intelligence like we humans do. AI is incapable of original thought and has no consciousness, even if appearances suggest otherwise.
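The statistical idea can be sketched with a toy that is far simpler than a real LLM. The code below is my own illustration, assuming a made-up three-sentence corpus and hypothetical function names: it only counts which word follows which, then 'answers' with the most frequent follower. Real models use neural networks over tokens, not word counts, so treat this as the basic intuition only.

```python
from collections import Counter, defaultdict


def train_bigrams(text):
    """Count, for every word, which words follow it and how often."""
    follows = defaultdict(Counter)
    words = text.lower().split()
    for current, nxt in zip(words, words[1:]):
        follows[current][nxt] += 1
    return follows


def most_likely_next(follows, word):
    """Return the most frequent follower, or None if the word was never seen."""
    counts = follows.get(word.lower())
    if not counts:
        return None
    return counts.most_common(1)[0][0]


if __name__ == "__main__":
    # Invented training text for the illustration.
    corpus = (
        "i live in amsterdam . "
        "you live in amsterdam . "
        "we live in utrecht ."
    )
    model = train_bigrams(corpus)
    print(most_likely_next(model, "in"))      # the follower seen most often
    print(most_likely_next(model, "berlin"))  # never seen, so None
```

The toy picks 'amsterdam' after 'in' purely because it saw that pair more often, not because it knows where anyone lives.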

AI can make correlations lightning fast. But they are statistical correlations; there's no causality. There is a statistical correlation between shoe size and learning achievement in kids. But buying your kids bigger shoes will not improve their grades, it will just hamper their walking. The correlation arises because kids grow as they get older, so they wear bigger shoes, and because older kids have been in school longer, which increases their knowledge. But AI can't tell the difference.
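The shoe size example can be worked out with a few lines of Python. This is a sketch with invented numbers, not real data: every kid's shoe size and test score are both derived from age, plus an offset that has nothing to do with the other value. The function names are hypothetical.

```python
def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equally long lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def build_kids():
    """Invented data: age drives both shoe size and score, independently."""
    kids = []
    for age in range(6, 13):
        for shoe_offset in (-1, 1):        # small or large feet for that age
            for score_offset in (-5, 5):   # weaker or stronger pupil
                shoe = 20 + 1.5 * age + shoe_offset
                score = 10 * age + score_offset
                kids.append((age, shoe, score))
    return kids


if __name__ == "__main__":
    kids = build_kids()
    overall = pearson([k[1] for k in kids], [k[2] for k in kids])
    nine = [k for k in kids if k[0] == 9]
    same_age = pearson([k[1] for k in nine], [k[2] for k in nine])
    print("all kids:       ", round(overall, 2))
    print("nine-year-olds: ", round(same_age, 2))
```

Across all kids the correlation is strong (above 0.9 with these numbers), yet among kids of the same age it vanishes. Age explains both; shoes explain nothing.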

Garbage in, garbage out

AI only knows what it's been told. Literally: AI can only answer based on its training data. AI will believe everything you put in, just like our subconscious. Additionally, it believes what it hears often more than what it hears rarely.

Top programmers are few and far between. Only 2% of the population is highly gifted. Top programmers are probably overrepresented in those 2%, but the number of mediocre programmers is much higher than the number of top programmers. That doesn't need to be a problem as long as you're aware of it.

So the sites that code-generating AI is trained on contain much more code from mediocre programmers than code from super programmers. It's easy to guess which code AI will see as the truth. AI generates mediocre code because it has seen more mediocre examples than really good ones.

As more code gets generated by AI, there will be more AI-generated code on code websites like GitHub, Stack Overflow and The Code Project. That in turn will serve as input for those same AIs. AI will get additional training from its own output. That results in an information bubble for textual AIs; see also the blog I wrote earlier. (In Dutch, Google Translate is your friend.)

What AI is good at is nothing new

AI is good at performing repetitive tasks: maintenance forms and the boilerplate code you use often. Except you don't need AI for that. We've been doing that for decades with templates and code generators. And fourth-generation languages, low code, no code, the list is long. The advantage of those solutions is that they always give you the same result; AI tends to get 'creative' and give you different answers to the same question. Older tooling may get you better results than AI.
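As a reminder of how little is needed for that, here is a minimal template-based generator using Python's standard string.Template. It is a sketch of the general technique, not any specific product; the template text, class name and field names are invented for the example.

```python
from string import Template

# Hypothetical boilerplate template for a simple record class.
CLASS_TEMPLATE = Template('''\
class ${name}:
    """Maintenance record for ${name} (generated, do not edit by hand)."""

    def __init__(self, ${args}):
${assignments}
''')


def generate_class(name, fields):
    """Fill the template; the same input always gives the same source text."""
    args = ", ".join(fields)
    assignments = "\n".join(
        f"        self.{field} = {field}" for field in fields
    )
    return CLASS_TEMPLATE.substitute(
        name=name, args=args, assignments=assignments
    )


if __name__ == "__main__":
    first = generate_class("Customer", ["name", "city"])
    second = generate_class("Customer", ["name", "city"])
    print(first)
    print("identical on every run:", first == second)
```

Ask it the same question a thousand times and you get the same code a thousand times, in the same style. That predictability is what makes the result easy to review and maintain.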

AI can deliver higher productivity when used correctly. Just as the tractor and the combine harvester made farmers much more productive, AI can let programmers carry a much heavier load. But if you drive a tractor into a wall at full speed, it ends badly. For you, the tractor and the wall. AI is also a complicated piece of tooling that needs to be handled with knowledge and skill. In untrained hands it's just as dangerous as a tractor.

Not when lives are at stake

For software where lives are at stake we should avoid code-generating AI. Think of heart monitors, aviation in general and nuclear power plants. We don't want airplanes dropping from the sky because the code in the avionics isn't good enough. We have too little experience with the results of code generation to trust it with our lives.

And how good is your AI generated software?

Of course you're wondering how good and dependable your AI-generated software is. To find out, make a 15-minute appointment with me. Then you'll know where you stand and what needs fixing.

Make your appointment here

A lot has been written about this subject recently. New research is published weekly. Here are some places to start if you want the substantiation for this blog:

https://arc.dev/talent-blog/impact-of-ai-on-code/

https://blog.codacy.com/we-analyzed-ai-generated-code

https://www.wired.com/story/fast-forward-power-danger-ai-generated-code/