"There is always speculation about how generative AI could influence the future of various professions. Software engineering is no exception.
On February 21, 2024, we read the headline in this newspaper that "programmers have had their day," although, to be honest, the article had a completely different thrust. On April 5, we read about "app developers without programming knowledge": how a tech-loving father with no special prior knowledge developed an app for his iWatch that he uses to record the score when refereeing.
What was not mentioned was that similar apps and their source code are already available on the Internet, so the generalizability of the creative performance of the AI used is at least questionable. Such headlines and articles give the impression that software engineering can be done by machines in the medium term and that the growing shortage of skilled workers in the key discipline of digital transformation may resolve itself.
The opposite is the case. In this article, we reflect on the essence of software engineering in order to work out the need for a human "in the driver's seat". We do not want merely to repeat the truism, often proclaimed for other professions as well, that AI will assist but not replace them. Instead, we want to answer specifically why this is the case. We suspect that the central ideas of this article can also be transferred to other professions.
Software engineering
Software engineers build software systems. The intellectual and creative achievement consists in recognizing and understanding needs and converting them into a software system that addresses this need. This involves translating phenomena, structures and challenges of the real physical world into virtual technical software structures.
This process requires the ability to recognize the need, an understanding of the technical domain and the specific application context, and knowledge of the software engineering discipline. Software engineers must also be able to grasp complex relationships and articulate them precisely enough to make them accessible to software engineering treatment. This is commonly referred to as the ability to model or abstract in the technical and business domains.
The step from need to IT system is usually too big to be taken directly. This is why software engineering has established intermediate steps, at least conceptually. These intermediate steps can be understood as transitions from actual needs to written user requirements, as transitions from user to system requirements, as transitions from system requirements to system designs, and finally as transitions from designs to executable code. Experience has shown that all of these transitions are highly error-prone.
A misunderstood need quickly leads to inadequate user requirements and ultimately to a software system that solves the wrong problem. It makes a big difference whether the actual need is to get to the supermarket without a car (one solution: bicycle) or whether the need is to get groceries from the supermarket to your home (another solution: delivery service). In a similar way, user requirements can easily be translated into inappropriate system requirements. For example, the requirement to allow thousands of students to submit their assignments electronically might lead to the assumption that a system with redundant servers is necessary. However, a simpler solution is to set the deadline for Sunday morning at 3 a.m. to avoid overload. And so on: system requirements can result in designs that are not scalable later or cause security problems. Implementing these designs in code can also be error-prone.
It depends on the domain and the software development process chosen whether the artifacts mentioned are even explicitly created. In agile software development, for example, many of the activities mentioned are merged. Regardless of the development process, software development consists of countless decisions and choices between possible options, which also includes identifying and resolving trade-offs.
This process is made more difficult by the fact that it takes place in a dynamic environment. Needs, market conditions, the competitive environment, regulations and technical infrastructure are constantly changing. In contrast to hardware, software enables such constant adjustments due to its immaterial nature, a property that makes it the key driver of innovation. Software development is therefore not a linear process, which explains, among other things, the attractiveness of agile development methods.
There are recurring elements in the development process. Many sub-problems and their solutions are already known in both the business and technical domains. This knowledge manifests itself in empirical knowledge, structural patterns for software, specialized programming languages or libraries that enable the reuse of functionality. However, there is always a real risk of confusing the current problem with a similar but not identical problem, which can lead to an unsuitable solution. The development of software is highly context-dependent, which is why the reuse of problem descriptions and solutions is not always obvious.
In addition to the ability to create models, other skills are crucial in software engineering: communication, recognizing conflicting goals and resolving them, knowledge of recurring problem and solution building blocks and the ability to sense when these building blocks do not fit the current context. In addition, there is the implementation of the abstract models in technical code. This implementation still requires algorithmic thinking, i.e. the ability to break down a desired functionality into a series of individual steps that can be efficiently executed on a machine. Finally, software engineers must be able to understand and assess existing requirements documents, architectures or code if their job is to maintain and further develop existing code, some of which is decades old.
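The "algorithmic thinking" mentioned above can be made concrete with a small sketch. Taking up the referee-app anecdote from the introduction, the following hypothetical Python code (all names are illustrative, not from the article) decomposes the need "record the score of a match" into explicit, machine-executable steps, including a requirement that is easy to overlook until the app is used in practice: correcting a mistaken entry.

```python
# Hypothetical sketch: decomposing the need "record the score of a match"
# (as in the referee-app anecdote) into explicit, machine-executable steps.
# All names are illustrative.

class ScoreKeeper:
    """Keeps a running score for two teams and an undo history."""

    def __init__(self, home: str, away: str) -> None:
        self.score = {home: 0, away: 0}
        self.history: list[str] = []  # which team scored, in order

    def record_goal(self, team: str) -> None:
        if team not in self.score:
            raise ValueError(f"unknown team: {team}")
        self.score[team] += 1
        self.history.append(team)

    def undo(self) -> None:
        # Correcting a mistaken entry: a requirement a referee only
        # discovers mid-match, illustrating how needs emerge iteratively.
        if self.history:
            self.score[self.history.pop()] -= 1


keeper = ScoreKeeper("Home", "Away")
keeper.record_goal("Home")
keeper.record_goal("Away")
keeper.undo()  # the second entry was a mistake
print(keeper.score)  # {'Home': 1, 'Away': 0}
```

The interesting design work here is not the arithmetic but the decomposition itself: deciding that an undo history is part of the need at all is exactly the kind of judgment the text describes.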
Generative AI
Put this way, it seems obvious that many of the tasks described cannot in principle be taken over by an AI. The reason is that these tasks require decisions that can only be made on the basis of a deep understanding of the specific needs and the differences between possible solutions. Our central argument is that these needs cannot be "guessed" by an AI. Instead, it is essential that a human gradually learns more about the actually desired functionality of the system to be developed in an iterative process and comes to understand the conflicting goals, as is practiced in agile software development. Ultimately, someone has to be able to express exactly what the users want in a form that is usable by software. How could an AI do that ex nihilo?
Many of us have difficulty dictating emails. Instead, we start writing, correct, take notes, identify flaws in our argumentation and change our tone. We think as we write, as Kleist described in his essay on the "gradual formation of thoughts while speaking". The development of code is subject to a similar process, and so is the development of other software engineering artifacts. This iterative process works precisely because it is slow and because we can directly check what has just been written/said/coded/specified and correct it if necessary: we feel our way towards an acceptable state.
The idea that we could state completely, consistently and directly what the desired functionality of even a simple system should be has proven unrealistic. One-shot generation of code therefore seems impossible in most cases. This is not so much because the translation into code might be technically impossible, but because the input given to the generative AI is usually too poor.
This inadequacy results from incompleteness, vagueness, inconsistencies and misunderstandings of the actual need. Incompleteness can be repaired statistically, both at the level of the problem description (i.e. the specification or, in the case of generative AI, the prompt) and at the level of the generated code: an AI fills the gaps with the most probable, i.e. most frequently occurring, information based on the training data. This may be appropriate if the problem is in fact a standard problem in the domain. But it can also be completely inappropriate.
One argument in this context is that software engineers in the role of programmers today often do not act any differently even without AI: solutions offered on the Internet in forums such as Stack Overflow are often incorporated into their own code. However, we know that in the area of safety-critical software components, for example, this often leads to insecure code because the context of use is slightly different and because the subtleties of the copied code are not understood.
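A classic illustration of this point (our example, not the article's) is SQL built by string formatting. A snippet copied from a forum may be harmless in its original context, where the inserted value is a trusted constant, and become an injection hole in a slightly different context, where the value comes from user input. The sketch below uses Python's standard `sqlite3` module; parameterized queries are the contextually correct variant.

```python
import sqlite3

# Illustration (not from the article): code that is safe in its original
# context can be unsafe in a slightly different one.

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, role TEXT)")
conn.execute("INSERT INTO users VALUES ('alice', 'admin')")

def find_user_unsafe(name: str):
    # Copied-from-a-forum style: fine if `name` is a trusted constant,
    # an SQL-injection hole as soon as `name` comes from user input.
    query = f"SELECT role FROM users WHERE name = '{name}'"
    return conn.execute(query).fetchall()

def find_user_safe(name: str):
    # Parameterized query: the driver handles the value safely.
    return conn.execute(
        "SELECT role FROM users WHERE name = ?", (name,)
    ).fetchall()

# The "slightly different context": attacker-controlled input.
payload = "x' OR '1'='1"
print(find_user_unsafe(payload))  # leaks every row: [('admin',)]
print(find_user_safe(payload))    # matches nothing: []
```

The subtlety the text refers to is invisible in the copied line itself; only knowledge of the new context reveals the problem.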
Vagueness is often linguistic in nature, but sometimes also conceptual. This is often the case when knowledge of the technical domain is left implicit or incorrectly formulated. One of the great strengths of generative AI is that it makes this context knowledge available implicitly. But whether this context knowledge is always correct, complete and adequate for the application in question can only be answered statistically: in the most typical case, perhaps!
The resolution of incompleteness and vagueness can therefore take place directly when an artifact is created if the generative AI simply selects the "more typical" case. But the resolution can also take place as part of another workflow: the AI points out suspected vagueness and incompleteness and suggests improvements based on the training data. This is where we see the real potential of using generative AI in software engineering.
In any case, a software engineer will have to check and assess whether the artifact created by the generative AI is technically and professionally appropriate. As a rule, this will lead to an iterative process of creating and checking, just as it would be without AI. Then the question arises whether it is not more cost-effective to write the code or specification directly yourself, without the help of AI. It depends: For complex, specific requirements, manual creation may be more efficient, while AI can generate standard code with high accuracy. The same applies to simple and more complex emails.
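One concrete form this check can take, sketched here as our own hypothetical example, is that the engineer encodes the actual requirement as tests, regardless of whether a human or an AI wrote the code. A plausible-looking generated implementation can pass the typical case and still violate a context-specific requirement that only the checking human knows about.

```python
# Hypothetical sketch of the "check" step: the engineer encodes the
# actual requirement as tests, whoever (or whatever) wrote the code.

def median(values):
    # A plausible, "most typical" implementation of the median.
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2

# The typical cases pass ...
assert median([3, 1, 2]) == 2
assert median([1, 2, 3, 4]) == 2.5

# ... but only the checking engineer knows the real requirement, e.g.
# that an empty input should raise a meaningful ValueError. This
# implementation instead crashes with an IndexError:
try:
    median([])
except IndexError:
    print("plausible code, wrong behavior for this context")
```

The point is not that the code is wrong in general, but that "correct" depends on requirements that never appeared in the prompt.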
This could be a crucial difference between software engineering and other professions or areas of life. In co-creation with generative AI, we obviously need a complex oracle that can assess the correctness, completeness, and adequacy of a generated artifact: the human. For love letters, greetings, texts on social media, images of catalog models, and advertising videos, the oracle's task is comparatively simple: you can often see immediately whether the artifact meets the usually implicit requirements. For more complex artifacts, such as code for problems that have not already been solved hundreds of times, it seems possible and even likely that the effort of iterative creation and, above all, checking becomes prohibitive or simply not enjoyable enough.
Conclusion
Finally, three considerations.
First, our perspective should not give the impression that generative AI is not useful in software engineering. On the contrary, we see great potential for certain programming languages and tasks such as explaining code, creating comments, finding suitable library functions and creating standard code. We continue to assume that generative AI will significantly improve the process of requirements engineering, i.e. those activities that deal with needs and requirements. However, we have tried to clarify where the fundamental limits of generative AI in software engineering in general lie - which does not mean that there cannot be excellent opportunities for very specific, well-understood and ultimately uncritical contexts to have software created by generative AI.
Second, we should keep expectations of generative AI in software engineering realistic. Anyone who expects AI always to create correct, complete and adequate software will be disappointed. As in other professions, software engineers differ in skill. In this article, we have not differentiated between experienced software engineers and beginners. Even today, less capable or less experienced engineers sometimes produce software systems that do not offer the functionality or the reliability that they should. A logical follow-up question is whether generative AI can perhaps be as good as bad software engineers. Maybe it can.
Our central argument about the difficulty of formulating a prompt that contains an adequate specification remains unaffected.
Third, we need to consider whether and how we should adapt the training of software engineers. We are of the opinion that, perhaps surprisingly, no fundamental changes to basic training are needed: in order to assess software engineering artifacts, you first have to be able to develop them yourself. Perhaps this is also a difference from other disciplines. So if we strengthen the ability to evaluate code (testing, reading techniques) somewhat more in basic training and continue to train the ability to build models discussed above, the software engineers of tomorrow will be able to meet the challenges of tomorrow with the support of AI. Once the foundations have been laid, software engineering prompting can and should be taught. But it seems to us that the premature use of AI assistants for code generation is detrimental to the development of judgment." [1]
1. Pretschner, Alexander: "KI wird Softwareingenieure nicht ersetzen!" ("AI will not replace software engineers!"). Frankfurter Allgemeine Zeitung (online), Frankfurter Allgemeine Zeitung GmbH, September 11, 2024.