Non-Deterministic Outputs and Academic Source Standards
Scholarly Citation of Gen AI-Supported Research Through Verification over Replicability: A Proposal for a Citation-Oriented Sub-Model with Automatically Generated DOI Identifiers
How come when we both go to chat.openai.com and ask ChatGPT the very same thing, we get different replies?
The answer is as complex as the fundamental differences between us as people. Yes, it’s true that the training data for each is the same. Yes, it’s also true that the algorithm is the same for both. The URL, the interface, the time and date accessed, the model type and the prompts… they can all be exactly the same, and yet the responses will differ. This is because AI outputs are non-deterministic: models sample each response probabilistically, so different users can receive different outputs for the same prompt. Responses are dynamic, context-sensitive and personalized. Many of the most popular AI tools on the market retain user-specific memory, tailoring their replies over time to help the user achieve their goals more efficiently. This non-determinism, built into the infrastructure of the AI programs we use, exposes the dichotomy between an omniscient, sterile provider of knowledge and the distilled confirmation bias of a personalized assistant. The gulf between the perception of what AI can do and what it actually does morphs its use from a beacon of unbiased truth into a conveyor belt of self-aggrandizing agreement, with all the dangerous trappings of the authority of its sci-fi-backed black box of mystery.
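For readers who would rather see the mechanism than take it on faith, below is a minimal, self-contained sketch of temperature-based sampling, the probabilistic step at the heart of this non-determinism. The vocabulary, scores and temperature are toy values invented for this sketch, not drawn from ChatGPT or any real model.

```python
import math
import random

def sample_next_token(logits, temperature=0.8):
    # Higher temperature flattens the distribution, making unlikely
    # tokens more probable; lower temperature sharpens it.
    scaled = [score / temperature for score in logits]
    # Softmax: turn the scaled scores into a probability distribution.
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Draw one token at random according to those probabilities.
    return random.choices(range(len(logits)), weights=probs, k=1)[0]

vocab = ["verification", "replicability", "citation"]
logits = [2.0, 1.5, 0.3]  # toy next-token scores, invented for this sketch

# Two "users" submitting the identical prompt can still diverge:
print(vocab[sample_next_token(logits)])
print(vocab[sample_next_token(logits)])
```

Run it twice and the two printed tokens will often differ, even though the "prompt" (the scores) never changed. Layer user-specific memory on top of this sampling step and divergence between users becomes the norm, not the exception.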
The melange of fundamental technical misunderstanding, late and ineffective regulatory guidelines, early adoption by opportunistic or exploitative actors, and doomsday fear-mongering has created the perfect storm to push Gen AI use into the shadows of our more polite company. For traditionalists in academic fields, for whom one’s reputation is as indispensable as it is fragile, the public knowledge that they may use Gen AI threatens their livelihood, their trust and their legacy. This has not, however, stopped everyone from university students to researchers to curators to librarians from utilizing it in private. The issue then gains further complexity, as a psychological and social veil of shame obscures AI’s ubiquitous implementation and further distances us from its ethical and transparent use.
With clear attribution, AI use would be transparent, distinguishable, credible and regulated. Without clear attribution conventions, it is up to the scholar to disclose its contributions, which allows rampant abuse of the tool to go unchecked and undiscovered. In a recent talk at an Austrian university, I discussed a CLARIAH-funded project I am co-leading to draft citation conventions for novel digital sources, including Gen AI. I asked a room full of scholars whether they had ever had any difficulty properly citing support in their research from novel digital sources, including AI. The room fell silent, and in a jam-packed seminar room, just one scholar was willing to raise her hand. The others dodged every attempt I made to uncover and bring to light the need for clarified citation conventions for the project participants to explore in our upcoming workshops. I would stake my supper on the assertion that more than one person in that room had used Gen AI at some point in their research and was unable to cite it properly, owing to a mixture of a lack of standards and a hesitancy to admit or formally acknowledge its role as a tool in their research.
The issues with citing Gen AI in scholarly works and academia hark back to our puzzle of two people entering the same prompt into the same model and getting different results. Two of the main pillars of a source’s academic integrity are verification and replicability. For static sources such as books or journal articles, verification is key, whereas for datasets, experiments and analyses, replicability is the accepted standard. Due to the non-deterministic nature of Gen AI, responses are unique to the user’s interaction history, meaning they cannot be replicated. So how, you may ask, have global citation conventions attempted to deal with this issue so far?
MLA: MLA does not recommend listing ChatGPT in the Works Cited unless it is directly quoted in the work. Instead, MLA suggests describing the use of AI in the text or in a footnote. An in-text citation would look like the following: (OpenAI, ChatGPT). The bibliographic citation would read: OpenAI. ChatGPT, version, OpenAI, Date of conversation, URL (if applicable).
APA: APA states that AI-generated text should not be credited as an author. Instead, it instructs authors to cite it as a non-retrievable source, like personal communication. APA suggests describing how AI was used in the methodology or footnotes rather than in formal citations. An in-text citation would look like the following: (OpenAI, 2025). The bibliographic citation would read: OpenAI. (Year). Title of AI model (Version) [Large language model]. Company. URL (if applicable).
Chicago: The Chicago Manual of Style states that AI is not considered a person in Chicago style, so it is cited like software. Chicago advises archiving AI responses in repositories such as Zenodo or OSF for future reference. An in-text citation would look like the following: (OpenAI 2025). The bibliographic citation would read: OpenAI, ChatGPT (GPT-4), response to “Insert prompt here,” OpenAI, Date, URL (if applicable).
Harvard: Harvard referencing treats AI as software rather than an author. AI responses should be saved and referenced in a permanent archive when possible. An in-text citation would look like the following: (OpenAI, 2025). The bibliographic citation would read: OpenAI (Year) Title of AI model (Version) [Large language model]. Publisher. URL.
The differences here lie not just in the formatting of the individual conventions but in the treatment of the tool itself. Is it considered an author, a non-retrievable source, or software? Based on that treatment and classification, the methodology for how to save and/or reference the output changes.
As a researcher currently working to propose standards for the implementation of citation conventions for novel digital sources in Austrian institutions, I am intimately aware of the problem of the irretrievability of digital sources for verification: not only in the manner of APA’s treatment of LLM output as personal communication, but also with regard to dead URLs, obsolescent software and corrupted files. The methodology developed to fight back against these issues in other formats is the persistent unique identifier, most notably the DOI.
DOI stands for digital object identifier. A DOI is assigned to a resource by a registration agency operating under the DOI Foundation, a not-for-profit organisation that governs the DOI system. The DOI system complies with ISO 26324, which defines the syntax for a DOI name. The standard has a reliable history: it was approved in November 2010, published in May 2012 and revised in 2022. Approximately 745 DOIs are resolved each second, and the total number of DOI resolutions to date is nearly 100,000,000,000. An example of a DOI is 10.1000/182; an example of a link to that DOI is https://doi.org/10.1000/182. The system was designed to be read by humans as well as machines, allowing objects to be uniquely identified and reliably accessed.
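That machine-readability is easy to demonstrate. The following minimal sketch resolves the example DOI above through the public doi.org resolver; the resolver answers with a redirect, so the printed URL is wherever the DOI currently points.

```python
import urllib.request

# Resolve the example DOI from the text through the public
# https://doi.org/ resolver. urllib follows the resolver's
# redirect automatically, so response.url is the final target.
doi = "10.1000/182"
request = urllib.request.Request(f"https://doi.org/{doi}", method="HEAD")
with urllib.request.urlopen(request) as response:
    print(response.url)
```

Whatever landing page the publisher moves to in the future, the DOI itself stays stable, which is precisely the property that dead URLs, obsolescent software and corrupted files lack.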
Given the incredible tool made available to the research community and the scholarly world by OpenAI in the form of ChatGPT and its many models, as well as the reliability of the DOI Foundation and its registration agencies in generating DOIs for referencing, I propose a partnership under the philanthropic arm of OpenAI for the sake of transparency for researchers and the betterment and efficiency of the academic community. With a sub-model of ChatGPT designed for scholars, entire prompting discussions and outputs could be stored and made easily referenceable through an automatically generated DOI once the conversation with the chatbot has reached its conclusion. The conclusion would be indicated by a “complete” button in the GPT Scholar sub-model, whereupon a DOI would be minted and the entire transcript of the discussion stored, accessible and searchable. Researchers and scholars could then treat the contribution of OpenAI as a static source rather than a dataset or experiment, because it could now be verified rather than replicated. This would allow scholars to confidently use all tools available to them in a transparent, traceable and ethical manner, propelling the sciences forward while keeping AI use ethical, credible, regulated and verifiable.
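To make the workflow concrete, here is a purely hypothetical sketch of what the “complete” step could look like behind the scenes. Nothing in it exists today: the GPT Scholar sub-model, the record structure and the DOI prefix are all assumptions for illustration, and a real system would mint DOIs through a registration agency such as DataCite rather than constructing them locally.

```python
import datetime
import hashlib
from dataclasses import dataclass

@dataclass
class ArchivedConversation:
    transcript: str    # the full prompt-and-output log, frozen on completion
    model: str         # e.g. "GPT-4"
    completed_at: str  # when the "complete" button was pressed
    doi: str           # identifier minted for the archived transcript

def complete_conversation(transcript: str, model: str) -> ArchivedConversation:
    """Hypothetical 'complete' step: freeze the transcript and mint a DOI."""
    completed_at = datetime.datetime.now(datetime.timezone.utc).isoformat()
    # Placeholder DOI: "10.99999" is an invented prefix, and the suffix
    # here is a local hash standing in for a registration agency's
    # minting service, purely for illustration.
    suffix = hashlib.sha256((transcript + completed_at).encode()).hexdigest()[:8]
    return ArchivedConversation(
        transcript=transcript,
        model=model,
        completed_at=completed_at,
        doi=f"10.99999/gpt-scholar.{suffix}",
    )

record = complete_conversation("Q: ... A: ...", "GPT-4")
print(f"In-text citation: (OpenAI, {record.doi})")
```

The design choice that matters is the freeze: once “complete” is pressed, the transcript becomes immutable and the DOI binds to that fixed object, which is what converts a non-replicable conversation into a verifiable static source.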
The proposed model would treat ChatGPT Scholar’s DOI-linked prompt and output log as a static source rather than as software or a dataset. An in-text citation would look like the following: (OpenAI, DOI#). The bibliographic citation would read: OpenAI (Year) Title of AI model (Version) [Large language model]. Project Name. DOI-hyperlinked URL.
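For illustration only, a completed entry under this model might read as follows, where every value, including the DOI, is hypothetical: OpenAI (2025) ChatGPT (GPT-4) [Large language model]. GPT Scholar. https://doi.org/10.99999/gpt-scholar.example.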