TLDR: This research paper by Wolfgang Eppler and Reinhard Heil from ITAS, KIT, examines the role of generative AI in Technology Assessment (TA). It highlights that generative AI is both a tool used in TA and a subject of TA research. The paper describes what generative AI can do and where it falls short, including misinformation, bias, and a lack of transparency. It then examines eight structural causes of problems with current AI, including data quality issues, misalignment challenges, context-content conflicts, non-continuous learning, absence of social perspectives, inadequate world models, and poor reasoning abilities. Finally, the paper suggests that generative AI should be used cautiously in TA, primarily as an idea generator and support tool, with all outputs requiring thorough human verification. It emphasizes that human understanding remains irreplaceable and warns against over-reliance on AI.
Generative AI has rapidly become an integral part of scientific work, and the field of Technology Assessment (TA) is no exception. This paper explores the dual relationship TA has with generative AI: both as a tool to aid in TA work and as a subject of TA research itself. While generative AI offers impressive capabilities, it also presents significant challenges that stem from its fundamental design.
Understanding Generative AI
At its core, generative AI, most visibly in the form of chatbots built on large language models (LLMs), allows us to communicate with machines using natural language. Tools like ChatGPT, Gemini, Llama, or Claude are easy to use and can perform various tasks, from summarizing texts and answering questions to generating new content such as text, images, and even program code from simple prompts. More advanced versions, known as Large Action Models, can also act as agents, performing tasks like internet searches or processing company data.
However, it’s crucial to understand that LLMs are primarily language models, not knowledge models. They don’t recall learned facts but generate new texts based on statistical relationships identified during their training. This means their outputs, while often plausible, are not always factually correct. They can produce misinformation, disinformation, and what are known as ‘hallucinations,’ which undermines their trustworthiness. Furthermore, because training data often contains biased or toxic content, generative AI outputs can reflect and even amplify these biases. Efforts to correct these issues, known as ‘alignment,’ can sometimes introduce new distortions or lead to ‘catastrophic forgetting’ of previously learned information. The internal workings of these models also remain largely opaque, making it difficult to understand how specific outputs are generated.
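The difference between a language model and a knowledge model can be illustrated with a toy example. The sketch below is not how production LLMs are built (they use neural networks, not word-pair counts); it is a minimal bigram model, with the hypothetical helper names `train_bigram` and `generate`, that only records which word followed which in a two-sentence corpus. Because it samples from those statistics, it can produce a fluent sentence that is factually wrong.

```python
import random
from collections import defaultdict


def train_bigram(corpus):
    """Map each word to the list of words that followed it in the corpus."""
    followers = defaultdict(list)
    for sentence in corpus:
        words = sentence.split()
        for current, nxt in zip(words, words[1:]):
            followers[current].append(nxt)
    return followers


def generate(followers, start, max_words=8, seed=0):
    """Build a word sequence by repeatedly sampling an observed follower."""
    rng = random.Random(seed)
    words = [start]
    # Stop at the length limit or when the last word was never followed by anything.
    while len(words) < max_words and followers.get(words[-1]):
        words.append(rng.choice(followers[words[-1]]))
    return " ".join(words)


# Toy corpus, chosen only for illustration.
corpus = [
    "Paris is the capital of France",
    "Rome is the capital of Italy",
]
model = train_bigram(corpus)
for seed in range(5):
    print(generate(model, "Paris", seed=seed))
```

Each printed line is either "Paris is the capital of France" or "Paris is the capital of Italy". Both are equally plausible to the model, since it has stored word statistics rather than facts.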
Structural Challenges of Current AI
For TA, which often involves advising policymakers, reliable and understandable solutions are paramount. This requires AI outputs to be comprehensible, verifiable, controllable, coherent, non-discriminatory, and explainable, with clear disclosure of sources and data. The paper identifies eight structural causes for the problems observed in current AI, including:
- Data Quality: A vast amount of training data comes from the internet, often containing synthetic data, duplicates, and contradictions. Verifying the factual accuracy of such massive datasets is practically impossible. Data distribution is also uneven, leading to models generalizing incorrectly in new situations. The methods used by providers to clean and annotate data often remain undisclosed.
- Misalignment: Despite impressive language skills, chatbots can produce ethically questionable responses. Alignment techniques like fine-tuning, instruction tuning, and prompt extensions (e.g., Chain of Thought) aim to mitigate this. However, these methods don’t fully solve reliability issues and can even worsen them by overwriting learned content or introducing new biases. This is partly due to the ‘proxy effect,’ where AI forms categories using features that differ from, and are often irrelevant to, those humans rely on.
- Context-Content Challenge: There’s a tension between user input (context) and the model’s pre-trained knowledge. It’s often unpredictable which will dominate. While ‘in-context learning’ (providing examples in the prompt) can significantly improve output quality, especially for structured tasks, models can also uncritically adopt information from the context, even if incorrect (sycophancy).
- Reproduction: Current LLMs do not learn continuously during interactions. Their training is a one-time process that takes weeks and uses massive datasets. As a result, they cannot gain experiences of their own or adapt to societal changes in real time, which creates a time lag. They depend on human-generated texts to reproduce knowledge and face a ‘grounding problem’: they lack a direct connection to reality.
- Lack of Social Perspectives: Unlike humans, AI cannot engage in symmetrical discourse to clarify truth or take different social perspectives. Its perspective is functional, lacking the normative component essential for human interaction and responsible assertions.
- World Model: While AI can generate videos (like OpenAI’s Sora) that simulate movements, it lacks an understanding of the underlying physical laws, which leads to inconsistencies. Simulation models in games offer interaction but lack the unpredictability of the real world. Current generative AI training algorithms cannot provide the continuous learning needed for a realistic world model, resulting in incoherent outputs and difficulties in placing objects and events correctly in space and time.
- Reasoning: Generative AI performs poorly in common-sense reasoning. The paper attributes this to the lack of a physical world model and of continuous, embodied experience. While techniques like Chain of Thought prompting aim to improve logical reasoning, even advanced models like GPT-4o have shown significant gaps in targeted reasoning, often misapplying learned patterns to slightly altered problems.
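The ‘in-context learning’ mentioned under the Context-Content Challenge amounts to placing worked examples in the prompt before the actual question. The following is a minimal sketch of how such a few-shot prompt could be assembled. The function name `build_few_shot_prompt`, the "Text:/Category:" layout, and the sample texts are assumptions made for illustration, not something prescribed by the paper or by any particular chatbot.

```python
def build_few_shot_prompt(instruction, examples, query):
    """Assemble a prompt that shows worked examples before the real query."""
    parts = [instruction.strip(), ""]
    for source_text, label in examples:
        parts.append(f"Text: {source_text}")
        parts.append(f"Category: {label}")
        parts.append("")  # blank line between examples
    # The query uses the same layout, leaving the answer slot open for the model.
    parts.append(f"Text: {query}")
    parts.append("Category:")
    return "\n".join(parts)


# Hypothetical examples for a horizon-scanning style sorting task.
examples = [
    ("New battery chemistry doubles storage density.", "energy"),
    ("Draft regulation proposes labels for synthetic media.", "policy"),
]
prompt = build_few_shot_prompt(
    "Assign each text to one category.",
    examples,
    "Municipality pilots autonomous shuttle buses.",
)
print(prompt)
```

The same mechanism explains the risk described above: if an example in the prompt carries a wrong label, the model may adopt it uncritically, so the examples themselves need to be checked by a human.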
Applications in Technology Assessment
Given these limitations, the paper advises using generative AI in TA cautiously and only when results can be thoroughly verified. It should primarily serve as an idea generator and support tool. For tasks like Horizon Scanning, which involves extensive information gathering and evaluation, AI tools can assist in summarizing sources, clustering information, and generating ideas. However, automatically created literature reviews, interview transcripts, or overviews must be checked for accuracy, completeness, and the presence of ‘hallucinations’ or invented sources.
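One small part of this checking can be supported mechanically: comparing the sources cited in an automatically created overview against a list that a human has already verified. The sketch below assumes that such a verified list exists and that titles are compared after simple normalization. The function name `flag_unverified_sources` and the placeholder titles are hypothetical. A title that passes this check is not thereby confirmed as correctly summarized, so the check narrows down the manual work rather than replacing it.

```python
def flag_unverified_sources(cited_titles, verified_titles):
    """Return cited titles that do not appear in the human-verified list."""

    def normalize(title):
        # Ignore case and repeated whitespace when comparing titles.
        return " ".join(title.lower().split())

    verified = {normalize(title) for title in verified_titles}
    return [title for title in cited_titles if normalize(title) not in verified]


# Placeholder titles, not real publications.
verified = ["Report A", "Report B"]
cited_by_ai = ["Report A", "report  b", "Report C"]
print(flag_unverified_sources(cited_by_ai, verified))
```

Here only "Report C" is flagged and would need to be looked up by hand, since it may be an invented source.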
Generative AI truly shines in the linguistic and graphical preparation of results: transforming bullet points into flowing text, creating slides and graphics, text correction, and translation support. Yet, even in these areas, human oversight and verification of all automatically generated content are indispensable.
Conclusion
In essence, generative AI in TA should be limited to supporting roles and generating ideas, with all outputs subject to rigorous human review. The authors caution against the ‘fear of missing out’ (FOMO) driven by the information deluge, arguing that more information doesn’t always lead to better understanding. Intensive reading, summarizing, and writing are crucial for human comprehension, and machines cannot replicate the understanding these activities build. It is vital for humans to maintain and develop these abilities rather than letting them atrophy due to convenience or time pressure.


