Methodology
Published online on 18.12.2025. https://doi.org/10.34045/SEMS/2025/10

Gojanovic Boris
Médecine du sport, Swiss Olympic Medical Center, Hôpital de la Tour, Meyrin, Switzerland
Unité sport & santé des jeunes, Département femme-mère-enfant, Centre Hospitalier Universitaire Vaudois, Lausanne, Switzerland

Abstract

Artificial intelligence (AI) entered the public scene in November 2022, when the first version of ChatGPT was released. Quickly, multiple large language models (LLM) have emerged. The speed of adoption of these tools by the public has been unparalleled, whatever domain of society we look at. Unsurprisingly, many people will use this new technology to enquire about their health problems, and oftentimes will do so before consulting a medical practitioner. This article describes how LLMs can be used to generate scientific-sounding articles when prompted with a clinical case. Three steps are presented: creating a master prompt to guide the models in their text generation, obtaining a review from experts, and finally illustrating the article. Various AI models are used for the first and last steps. As the development and power of these computational technologies accelerates exponentially, the results produced by this methodology will have improved by the time you read this.

Zusammenfassung

Künstliche Intelligenz (KI) ist im November 2022 mit der Veröffentlichung der ersten Version von ChatGPT in die Öffentlichkeit getreten. Schnell sind mehrere grosse Sprachmodelle (LLM) entstanden. Die Geschwindigkeit, mit der diese Tools in der Öffentlichkeit angenommen wurden, ist beispiellos, egal welchen Bereich der Gesellschaft wir betrachten. Es überrascht nicht, dass viele Menschen diese neue Technologie nutzen werden, um sich über ihre Gesundheitsprobleme zu informieren und dies oft tun werden, bevor sie einen Arzt konsultieren. Dieser Artikel beschreibt, wie LLMs verwendet werden können, um wissenschaftlich klingende Artikel zu generieren, wenn sie mit einem klinischen Fall konfrontiert werden. Es werden drei Schritte vorgestellt: die Erstellung einer Master-Prompt, um die Modelle bei der Textgenerierung anzuleiten, die Einholung einer Bewertung durch Experten und schliesslich die Illustration des Artikels. Für den ersten und letzten Schritt werden verschiedene KI-Modelle verwendet. Da sich die Entwicklung und Leistungsfähigkeit dieser Computertechnologien exponentiell beschleunigt, werden sich die mit dieser Methodik erzielten Ergebnisse bis zum Zeitpunkt Ihrer Lektüre verbessert haben.

Introduction

In this issue, various case studies in sports and exercise medicine are presented. The papers have been generated by artificial intelligence (AI) with the help of large language models (LLM). This article presents how AI was used to get to the result you can read in the following pages.

Preamble and disclaimer

The author is not an AI expert. The methodology presented has been developed organically and was self-taught (a little bit like an AI model applies self-learning to refine its production). The aim is to see how some basic principles of AI use could be applied to the generation of scientific texts. There are obviously many ways to use AI, just like anyone will use a search engine in their preferred way.
We all engage with new technology at an individual pace. Usually, we tend to apply a trial-and-error approach to test it out. Medical doctors likely use AI first in the non-professional domain (plan a trip, think about a project, find out about a topic, etc.), before they engage with these tools to explore medical-related topics. The way we interact with AI can grow from there, and some may go on to learn better ways of using AI, including prompt engineering. Many online resources are available for those who wish to take a deep (and endless) dive into it.
It is perhaps worth first giving a few definitions, since some terms in this space are not always completely understood:
Artificial intelligence (AI): the broad field encompassing computational systems designed to perform tasks that typically require human intelligence (pattern recognition, decision-making, natural language processing, problem-solving, etc.).
Machine Learning (ML): subset of AI where algorithms improve performance on specific tasks through exposure to data, without being explicitly programmed for every scenario. The system identifies patterns and makes decisions based on statistical inference from training data.
Large language model (LLM): deep learning models trained on vast amounts of text to understand and generate human language. These models (like GPT, Claude, LLaMA) use transformer architecture and billions of parameters to predict text sequences and perform language tasks.
Prompt Engineering: the practice of crafting input text to elicit desired responses from LLMs, leveraging the model’s capabilities through strategic question formulation and context provision.
Token: the basic unit of text processing in LLMs, typically representing word fragments, whole words, or characters. Models have context windows measured in tokens (e.g., 100,000+ tokens for modern LLMs).
Hallucination: when AI models generate plausible-sounding but factually incorrect or fabricated information, a key limitation in current systems.
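To make the notion of tokens and context windows concrete, the back-of-the-envelope check below estimates whether a prompt fits in a model's context window. It relies on the common rule of thumb of roughly 4 characters per token for English text; real tokenizers split text into subword units, so this is only an approximation, and the window and output-reserve sizes are illustrative assumptions.

```python
# Rough token-budget check for a prompt, using the common heuristic of
# ~4 characters per token for English text. Real tokenizers (e.g. tiktoken)
# split text into subword units, so this is only an approximation.

def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Return a rough token count for `text` (at least 1)."""
    return max(1, round(len(text) / chars_per_token))

def fits_context(prompt: str, context_window: int = 100_000,
                 reserved_for_output: int = 2_000) -> bool:
    """Check whether a prompt leaves room for the model's reply."""
    return estimate_tokens(prompt) + reserved_for_output <= context_window

vignette = "A 16-year-old football player presents with knee pain. " * 50
print(estimate_tokens(vignette), fits_context(vignette))
```

A full clinical vignette plus a master prompt of a few thousand words sits comfortably within modern context windows; long reference lists or attached documents are what typically consume the budget.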

Objective for this AI-issue

The original idea for this issue came through the author’s repeated interaction with LLMs in the context of real (anonymized) clinical cases. The output, mostly from OpenAI’s ChatGPT models, seemed consistently good enough to be worth considering as an interesting source of scientific content. However, one can also observe some issues with the generated text: some references are incorrect, outdated or even invented, some facts can be overstated, and in general an expert would have a few things to object to or add.
Therefore, the aim was to provide clinical cases in the form of short vignettes, possibly accompanied by images (imaging studies, for example), written in a style that would correspond to that of an oral exam for the SEMS board examination. AI would be instructed to generate an article in a pre-specified format. Finally, an expert in the field would be asked to review and critique the article.

Methodology

1. Prompting the LLM (ChatGPT 5.0)

It was important to spend some time working on the prompt, so that the LLM could generate an article meeting the prespecified criteria. The assumption is the following: the better the prompt, the less need for iteration and modification. ChatGPT 5.0 was used to create the prompt on September 21st, 2025 [1]. Here is the first prompt:

The questions and answers given are shown in Table 1.

The next prompt asked the model to generate the final prompt based on the previous discussion:

The model had now generated a complete prompt that could be used with the clinical vignette. We proceeded with the addition of a case study in order to test the output of the model. In this step, it integrated the clinical vignette into the text, which was redundant, and it also made references to the role of AI in this context, which we did not wish for. The next iteration hence asked for these modifications:

The article showed some improvements, but there were still some relevant elements that were insufficiently addressed in the areas of care coordination, care management and various recovery domains. Next iteration:

The output kept improving and some details were added for refinement:

Once the article seemed good enough, we went back to the master prompt and asked the AI model to adapt it:

One final step was applied. After it provided the edited master prompt, we ran the model again with the final version of the article as input. This is an important step when using AI tools: providing an example of the output we are looking for, so that the model can understand what is desired. Here is the last prompt, which was followed by the full approved article:

We arrived at the final master prompt for generation of sports medicine articles based on a clinical vignette:

Final Master Prompt for SEMS-journal Case Study Articles

You are writing an article for Sport & Exercise Medicine Switzerland Journal (SEMS-journal). The journal publishes online (open access) and in print for about 1000 members of Sport & Exercise Medicine Switzerland, mostly sports physicians and physiotherapists.

Article requirements
– Length: 900–1100 words.
– Style: 70% scientific review style, 30% practical clinical narrative.
– Voice: Written in the first person, as a reflective clinician.
– Audience: Specialist clinicians (sports physicians, physiotherapists).
– Language: English only.
– Referencing: Include 10–15 key references, emphasizing consensus statements (latest consensus papers from the IOC, BJSM or other main scientific journals), systematic reviews, RCTs, and meta-analyses. References should be numbered in the text and listed at the end in Vancouver style.
– Visuals: Include 1 table (diagnostic/management pathway, RTP criteria, or key clinical insights). It must be referenced in the text (e.g., Table 1).
– Take-home messages: End with a bullet-point list of 4–6 key clinical pearls.

Integration of therapy and monitoring
– Describe the professionals involved in care (e.g., physician, physiotherapist, psychologist, nutritionist) where relevant.
– Provide specific steps in management, including what to monitor and how to intervene at different levels (home, school, training, competition).
– Refer to technology or tools only when relevant (e.g., symptom trackers, wearables). Do not systematically mention AI.

Structure template
1. Introduction – Overview of the clinical problem and why it matters in SEM.
   a. Set the stage with a clinician’s perspective.
   b. Highlight the evolution of understanding and cite key consensus statements.
   c. Include epidemiology/statistics relevant to the sport and condition.
2. Case reference – Situate the problem by referring to the provided case vignette (note: the case description is given separately and should not be restated). Emphasize pitfalls, cultural/environmental challenges, or unique features.
3. Clinical background – Etiology, presentation, differential diagnosis, relevant guidelines. Discuss differential diagnoses and new tools.
4. Management – Stepwise reasoning: investigations, treatment, rehabilitation strategies, multidisciplinary input, and interventions at school or work, training, and home. Specify interventions, monitoring parameters, and progression criteria. Detail the role of the multidisciplinary team.
5. Return to Play (RTP) – Discuss using international frameworks (BJSM, IOC consensus, expert and medical specialty societies, etc.). Highlight special considerations for the specific population in the case description (elite, adolescents, elderly, minorities, etc.) and sport-specific progression.
6. Prognosis and counseling of the athlete (or entourage where appropriate, for example in minors) – Discuss expected recovery timelines. Address predictors of prolonged recovery. Outline shared decision-making and counseling for return to sport.
7. Table – Summarize key steps/insights, reference them explicitly in the text, and provide a caption.
8. Take-home Messages – Bullet-point pearls for clinicians.
9. References – 10–15, numbered in-text, listed at the end.

Goal
Produce a text that is scientifically rigorous yet practical, blending evidence-based knowledge with clinical reasoning. The case should help readers improve diagnosis, management, and return-to-play decisions, while also addressing the broader environment (school, home, sport).

Instruction for use with each case
“Apply the above framework to the following case vignette (provided separately). Do not restate the vignette; instead, refer to its details as necessary in the text.”
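The workflow above, in which the master prompt frames every request and each vignette is supplied separately, can be sketched programmatically. The payload below mirrors the system/user message layout of common chat-completion APIs; the model name and dictionary keys are illustrative assumptions, not the exact calls used for this issue, which was run through the chat interface.

```python
# Minimal sketch: combine the master prompt and a case vignette into a single
# chat request. The system message carries the reusable framework; the user
# message carries the per-case instruction plus the vignette. Model name and
# payload keys are illustrative assumptions mirroring common chat APIs.

MASTER_PROMPT = "You are writing an article for SEMS-journal ..."  # full text above

def build_request(vignette: str, model: str = "gpt-5") -> dict:
    """Assemble a chat payload: framework as system turn, vignette as user turn."""
    return {
        "model": model,
        "messages": [
            {"role": "system", "content": MASTER_PROMPT},
            {"role": "user",
             "content": ("Apply the above framework to the following case "
                         "vignette. Do not restate the vignette; instead, "
                         "refer to its details as necessary.\n\n" + vignette)},
        ],
    }

request = build_request("A 14-year-old gymnast presents with low back pain ...")
print(request["messages"][0]["role"])  # system
```

Keeping the master prompt in the system role means each new vignette only needs a short user message, which makes the per-case step reproducible across articles.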

2. The expert reviewers (unique human models)

The second part of the project consisted of finding adequate experts in the field and convincing them to participate in this experiment. Most experts approached agreed to play their part, and all were selected in one of the following capacities:
– Member of a relevant SEMS clinical commission.
– Lecturer in the SEMS education courses and CME events.
– Expert at SEMS certification examination.

Their role was first to provide a short clinical vignette of their choice; an example was available for facilitation. Next, the project manager (BG) created the article and submitted it back to the expert with the following instructions:
“Discussion/critique and additions by the expert clinician (YOU), length minimum 300 words, can go up to 700 words. You are not allowed to use AI for your text.
Please also look through the references, as they have been integrated by AI and not checked by a human. You can add additional references if you wish”.

3. Illustration for the article

Our journal always uses an illustration as a thumbnail for the website, which is also printed with the article. We decided to use AI to generate an image based on the text.
A similar process was applied: a clinical context was given, and instructions were to generate a prompt that would help create a realistic image with the use of AI text-to-image models. We used ChatGPT to create the prompt for each article, whilst various image-generation models were used for the actual image (DALL-E 3, Gemini flash 2.5, Midjourney, Openart.ai’s Seedream 4.0, etc.). In each article, you will find such an image with the model and prompt used.
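The two-step illustration workflow described above can be sketched as follows: a text model first drafts an image prompt from the clinical context, and the drafted prompt is then passed to a text-to-image model. The payload keys and model names are illustrative assumptions modeled on common image-generation APIs, not the exact calls used for this issue.

```python
# Sketch of the two-step illustration workflow: (1) ask a text model to write
# an image prompt from the clinical context, (2) send that prompt to a
# text-to-image model. Keys and model names are illustrative assumptions.

def build_image_prompt_request(clinical_context: str) -> dict:
    """Step 1: ask a text model to draft a prompt for an image model."""
    return {
        "model": "gpt-5",
        "messages": [{
            "role": "user",
            "content": ("Write a detailed prompt for a text-to-image model to "
                        "create a realistic editorial illustration for this "
                        "clinical context. No logos, no legible text.\n\n"
                        + clinical_context),
        }],
    }

def build_image_request(image_prompt: str) -> dict:
    """Step 2: send the drafted prompt to an image-generation model."""
    return {"model": "dall-e-3", "prompt": image_prompt,
            "size": "1024x1024", "n": 1}

req = build_image_request("A sports medicine physician in a modern clinic ...")
print(req["model"])
```

Separating the prompt-drafting step from the rendering step allows the same drafted prompt to be reused across different image models, as was done here with DALL-E 3, Gemini, Midjourney and Seedream.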

Limitations

As stated in the introduction, there are most certainly more efficient and better ways to conduct this project and to utilize AI’s capacity to create content. We must acknowledge some of the most obvious limitations here.
– The articles were mostly generated on one platform (Claude) due to its reputation for high-quality text writing, which the author could verify after a few iterations with the first attempted article. As the models constantly evolve, it is probable that another model would be better suited today.
– AI models are known to carry internal biases from the datasets they were trained on. The generated texts certainly suffer from these, in addition to all the biases the author introduced in the development of the project and the selection of prompting criteria.
– Some articles have benefited from a few iterations with the model, which included the integration of one or more specific references to help the model better frame the topic. These are stated at the start of each article, in the space usually reserved for authorship details. The “human nudge” has artificially enhanced the output of AI even before it was submitted to the experts.
– The outputs of today are not the outputs of tomorrow. During the few weeks that the project was conducted, the two main AI platforms (OpenAI’s ChatGPT and Anthropic’s Claude) released new models with better performance in multiple domains. This means that if one were to replicate the experiment after reading this, the result would certainly be different and most likely better.

Image credits: OpenAI, DALL-E 3, November 22nd, 2025.
Prompt: “A sports medicine physician in a modern clinic, standing at a desk and reviewing a large semi-transparent holographic screen that displays a structured medical article layout with headings, tables and graphs, in the background subtle silhouettes of athletes running, jumping and cycling, a second translucent layer shows a neural-network brain and flowing text representing a large language model generating clinical case articles, warm but professional lighting, clean editorial illustration style, realistic but slightly stylized, high detail, focus on collaboration between human expert and AI, no logos, no text legible, subtle hint of Switzerland with a small red-and-white cross motif on a wall or badge, white and teal color palette with a touch of red”.

Corresponding author

Boris Gojanovic
Hôpital de La Tour
avenue J.-D. Maillard 3
1217 Meyrin
Tel: +41 22 719 6363
Email: boris.gojanovic@latour.ch

Reference

1. OpenAI. (2025). ChatGPT (model 5.0, September 21st, 2025 version). https://chatgpt.com/share/68d017ba-5054-800d-bb6f-7950d728dff7.
