"Between pomise and caution: Generative AI and qualitative research for sensitive topics"
Joe Sankey offers careful thinking on the promise and limits of generative AI in qualitative social research for potentially sensitive topics
The emergence and rapid evolution of generative AI, in particular large language models (LLMs) such as OpenAI's ChatGPT and Anthropic's Claude, have begun to have a major impact on the field of social research. Much current social research scholarship on generative AI reckons with the implications of these tools both practically, in terms of the new opportunities they afford for the mass collection and analysis of data, and philosophically, in the sense of raising critical questions about the role of the researcher and the ethics of outsourcing meaning-making to machines.
Bail (2024) argues that generative AI can enhance the scale, speed, and scope of social science research, while cautioning against the risks of bias, opacity, and ethical ambiguity. Similarly, Mellon et al. (2024) demonstrate how LLMs can categorise open-text survey responses with near-human accuracy, even outperforming certain supervised machine learning models in some scenarios. Their findings suggest that AI could significantly reduce the cost and labour of coding large-scale qualitative data, potentially encouraging wider use of open-ended questions in survey research. Morgan (2023) explores the use of ChatGPT to analyse focus group data, finding that while the model performs well in identifying descriptive themes, it struggles with more interpretive or abstract insights.
My reflections on how generative AI may be utilised to the benefit of this research project, and where this may or may not be appropriate, are informed by the fact that the research project I am proposing is a qualitative one which explores a highly sensitive topic. These reflections fall into three main categories, which will be explored in turn: firstly, the potential for generative AI tools to conduct interviews, and the ethical and trust-related implications for participants; secondly, the prospect of using generative AI to thematically code qualitative data; and finally, utilising generative AI tools during the development of the research project, particularly to pilot questions while drafting the interview topic guide.
Using generative AI tools to conduct interviews
On the first point, I intend to conduct the interviews for this project myself, owing to the sensitivity of the topic that they will cover. It is nonetheless worth noting that Geiecke and Jaravel (2024) describe an open-source platform they have developed which enables researchers to run AI-led qualitative interviews using LLMs. This platform enables scalable, non-directive interviews across diverse topics, and evaluations of the tool demonstrated its utility in eliciting key factors in decision making, political views, views of the external world, and the subjective mental states of respondents.
Similar tools have been developed by other researchers and organisations, which enable qualitative data to be collected and coded at a pace and on a scale not possible via conventional research methods. While this is an interesting development, it is not one without limitations, some of which mean that this would not be an appropriate tool to use for this research project.
Firstly, on a practical level, many current AI interview tools are hosted online and are text-based, which can create inclusion barriers not typically present in face-to-face interviews. Participants would require access to a device capable of reaching these platforms, and would need to be sufficiently literate to articulate their responses in writing. These challenges could be especially pronounced for this project given the socioeconomic factors under consideration. In addition, these tools would not allow data on body language to be collected, and there is a risk of answers being provided by AI ‘personas’.
Secondly, there are complications regarding trust and ethics in AI which, as mentioned, are particularly acute given the sensitive nature of this research topic. Rindfleisch et al. detail how adoption of AI technology is largely constrained by concerns about privacy protection, and how the invocation of trust in these systems is a crucial means of overcoming this (2024, p.375). These concerns are heightened by the ability of these tools to record and transmit information that is private and personal in nature (ibid.).
Linked to this is the question of ethics: ethical approval must be obtained prior to all primary social research projects, and utilising LLMs can present complications to this end. As Morgan (2023, p.8) explains, most AI programmes have yet to make information regarding the inputs to their training public, so it may be the case that, once a dataset is uploaded to an LLM, it becomes open for use by that programme in further training. This could potentially put the privacy of participants at risk unless the data is sufficiently anonymised, something which might not be possible if it is collected by the LLM itself via an AI-led interview tool. Therefore, while these tools are sure to be of great benefit for some projects, they would not be appropriate for this particular project due to the sensitive nature of the topics covered.
Using generative AI to conduct thematic coding
On the second point, regarding the prospect of using generative AI to thematically code qualitative data – the rapid development of LLMs in recent years has raised questions about the possibility of automated qualitative data analysis, or even the prospect of a researcher ‘chatting’ with their data to understand its complexities, rather than going through the coding process (Morgan, 2023, p.1).
Naeem et al. (2025) have developed a process that enables researchers to use generative AI to conduct thematic analysis of data in line with Braun and Clarke’s six steps of systematic thematic analysis. This is an intriguing development, though perhaps one which requires some consideration, as some scholars, such as Bail, have highlighted evidence that different LLMs can produce dramatically varied results when attempting to code the same dataset (2024, p.7).
As with the prospect of using an AI tool to conduct interviews, I will refrain from using generative AI to support the coding of data in this project for three main reasons: firstly, as noted earlier, there are ethical considerations regarding participant anonymity and the recurring use of data once uploaded to an LLM; secondly, relying on AI for coding raises questions about my positionality and reflexivity as the researcher, particularly regarding how knowledge is co-constructed with a machine; finally, on a more practical level, I do not think it will be necessary, as I will be collecting data from a maximum of 20 interviews and will have more than enough time to conduct a thorough thematic analysis of them in a conventional manner. Moreover, given the difficulty of accessing this population at scale, automated qualitative analysis is unlikely to be required for this type of research.
Trialling of topic guide with generative AI
The final point is one where I believe that the incorporation of generative AI could be of great benefit to the delivery of this research project: utilising AI tools to refine the questions that will be covered during the semi-structured interviews and to ‘pre-test’ the interview protocol. Rindfleisch et al. (2024, p.381) note how doing so can enhance the validity of qualitative data collection by providing an opportunity for the researcher to revise their materials and ensure that appropriate questions are asked.
Utilising AI to trial a topic guide prior to commencing fieldwork is a novel and interesting possibility, but one that would need to be tested first to ascertain whether it would be appropriate for this research project. Existing instances of this exercise, in which LLMs are asked to respond as a ‘persona’, tend to utilise roles which are generic, for instance a ‘20-year-old female college student’ or a ‘50-year-old male labourer’, to meet broad market-research purposes (ibid., p.379). As the topics that I am interested in – particularly the nature of a participant’s CHD condition and the way in which they understand risk and behaviour in relation to this – are more specific, it may be that some LLMs do not presently have the required data to provide useful guidance as to how prospective respondents may react to certain questions.
In summary, while I believe that there are many interesting and exciting developments from generative AI in the field of social research presently, I do not feel that many of these tools can add much value to the data collection and analysis for this project, mostly owing to sensitivities and ethics regarding the subject matter.
I do, however, believe that there could be some merit in exploring the use of AI tools in the development of the topic guide for the depth interviews, and in assisting with the transcription of my data. Additionally, it is worth noting that while generative AI is not suitable for the core data collection and analysis in this project, its evolving capabilities may offer future opportunities for ethically sound, participant-centred research design.
*******************
Joe Sankey is an MSc Social Research student. As the Evaluation & Impact Lead (Health) at the British Heart Foundation, he leads research and evaluation on a range of cardiovascular health interventions and programmes. Joe’s research draws on qualitative and mixed-methods approaches to explore a range of topics, including the lived experiences of individuals affected by heart conditions, public attitudes to CPR and defibrillation, and the attitudes and behaviours of BHF volunteers.


