Friday, 17 February 2023

Do AIs Hallucinate Electric Sheep? Part 2: Creativity and the Right to Make Imperfect Things

I am a creative person. I write fiction and poetry for fun, I make (naïve) art and a lot of crafts, and it is a big part of how I view myself in the world. 

I am also a consumer of other people's creativity, whether that's books, graphic novels, TV, music, film, or even the random little things that people do in this world, like sticking googly eyes in unexpected places.

Googly eyes on the cover of the USB charging point in a bus

Being creative is not easy, but not in the way that you might expect. I am lucky in that I don't lack creative ideas at all, and I have enough experience both to implement them and not to worry too much when they don't work out as I expect.

The thing that blocks my creative practice the most is judgement. In a world where we can hear professional musicians at the touch of a button, or see media made with vast amounts of talent and resources streamed into our devices any time we want, it can be really hard to look at my own efforts. When I compare my amateur attempts at, for example, lino cutting with those of professional artists who have been doing it for many, many more hours than I have, it's hard not to become discouraged and lose the will to keep trying, keep creating.

Hustle culture adds to this negative pressure on human creativity. In a world where everything has to be monetised, any time spent doing something badly for fun is considered a waste, and therefore shameful. 

Let me be clear - this world, where we can see and hear amazing creations so quickly and easily, is a wonderful one, and I am very happy to be part of it. But I do feel that we need to acknowledge the creative things that are raw, that are rough around the edges, that aren't perfect - and that we as humans are allowed to create such things.

I believe that humans are fundamentally creative. We want to make stuff, and in times of enforced idleness, we will make stuff. During the Covid lockdowns, when people were furloughed, it would have been so easy to default to the oft-held belief that people would just sit on their sofas watching TV all day. This didn't happen. Whether it was musicians recording acoustic albums in their bedrooms, or the fad for sourdough bread, people were making things. Even with all this free time for "self-indulgence", people still found, and made, purpose in their lives.

So, what about AI? What about AI generated art and ChatGPT writing poetry? 

I don't believe that it is possible to be creative in a vacuum - though of course there's no way of testing this. All human creativity is inspired by and builds on what comes before, whether that's how we make our clothes, or how we put paint on things.

AI art generation does this on an industrial scale - it hoovers up vast amounts of training data, i.e. images of artworks, and then uses those artworks to generate images of its own. There are lots of issues with how the training data was collected - copying copyrighted artworks from the web is not ethically or legally sound, but I'm not going to get into that at this time. Some people claim that this is just how human inspiration works - we see things and then we spin ideas off those things to create new things. The difference is just the speed and volume that the AI can manage.

I am not a trained philosopher. I am not a professional creative. So I can only tell you my opinions and thoughts here, rather than being able to generalise more widely.

ChatGPT's ability to write a perfectly metred and rhymed sonnet on a topic you give it makes me feel like my efforts to write sonnets are somehow less. Yes, the AI outputs are trite and lacking in deeper meaning, but to be honest, so are some of my creations. I am not a visual artist as such, but if I were, the AI art bots would induce the same feelings in me. 

This makes me cross. AI and mechanisation were supposed to give us the tools to do boring stuff quickly and easily, so we could spend more time doing fun stuff like art. But it seems like it's the other way around - we're left with the boring drudgery while the AI is pushing out images and words at a rate that no human would ever be able to manage. What's worse - this AI art is quicker and cheaper than a human artist, so of course in a world where costs need to be cut to the bone to maximise profit, it's the human artist who's going to be ditched.

And then where will that leave us? Starved for non-AI generated media and new content? AI can't use art to explore what it means to be human, because it isn't human, and is only basing its output on statistical transformations of its training data, what has been done before. 

So, what am I arguing for? A world that allows and encourages us to be creative, and rewards us for our efforts, even if our painting is wonky, or our fiction is derivative. A world where we have the time to experience amazing art done by amazing people, and become inspired by it to make our own creations. A world where the effort needed to create good art and music and words is not invisible, so we can all see and know and appreciate just how much effort it takes to create something.

Where does AI fit in? Let it be a tool, a prompt generator to spark ideas. Let it be a starting point for inspiration, not the ending point of creative endeavour. Let it give us the inverse of ourselves, so that by seeing what it is, we can understand what we are. 

Most importantly, let us play. Let us be creative, and let us acknowledge our creativity, in whatever ways it manifests. Because that is a huge part of what makes us human.


Do AIs Hallucinate Electric Sheep? Part 1: Context

AI seems to be everywhere nowadays, whether it's in the "creation" of new artworks, the generation of deepfakes, or even ChatGPT's ability to produce confident text in a variety of formats on any subject you'd care to mention.

So where does that leave us, the humans whose job or inclination is to be creative, to bring together different pieces of information to create something new, or to provide a new insight into something that's already known?

Firstly, let's start with what ChatGPT (and other chatbot large language models) is not: it is not human. It is not an expert, and in many cases it can be absolutely wrong on fundamental bits of knowledge. It is a model that takes the proximity of words to each other in a given corpus (for example, a load of crawled webpages, or Wikipedia) and encodes those relationships as a set of numbers. When it's called on to answer a question, what it does is string words together in a way that is determined by those numbers. It's a statistical process that produces readable text in a user-friendly way.
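To make that concrete, here's a toy next-word model in a dozen lines of Python. It's nothing like the real architecture (which uses neural networks and vastly more data), but it gives the flavour of "relationships between words encoded as numbers" being used to string text together:

    import random
    from collections import Counter, defaultdict

    # A tiny corpus standing in for "a load of crawled webpages".
    corpus = "the cat sat on the mat . the dog sat on the rug .".split()

    # Count which word follows which - a crude version of encoding
    # word-proximity relationships as a set of numbers.
    follows = defaultdict(Counter)
    for current, nxt in zip(corpus, corpus[1:]):
        follows[current][nxt] += 1

    # "Answer" by repeatedly sampling a statistically plausible next word.
    word, output = "the", ["the"]
    for _ in range(6):
        counts = follows[word]
        word = random.choices(list(counts), weights=counts.values())[0]
        output.append(word)

    print(" ".join(output))  # e.g. "the dog sat on the mat ."

There's no understanding anywhere in that, just counting and sampling - which is rather the point.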

Alright, it's an interesting computer science problem to work on, with some cool applications. But why are people collectively freaking out about it, now that it's freely open and available for anyone to use?

My answer to this is culture. We, as humans, are so used to accepting that "computer says x" is the right answer, because it's instilled in us from an early age in schools. Computers use maths, and maths always has a right answer and a wrong answer. Therefore, if computers do arithmetic perfectly (which they don't, but that's a digression), then the answers they give must always be correct.
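(A quick demonstration of that digression, in Python:)

    # Computers store most decimal fractions in binary floating point,
    # so even very simple arithmetic picks up tiny errors.
    print(0.1 + 0.2)         # 0.30000000000000004
    print(0.1 + 0.2 == 0.3)  # False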

Combine this with a deterministic view of the world from school-taught science, and we can easily wind up thinking that computers can model the world around us to a level of precision that we don't need to question. "Computer says X" is always the correct answer.

Even computer scientists buy into this mode of thinking sometimes - as the rapidly growing field of AI and data science ethics can show you. Computers may not be biased in themselves, but they are very, very good at replicating and amplifying any biases in their datasets. And history is full of bias; there's no denying that.
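Here's a deliberately crude sketch of how that amplification happens (an invented example of my own, not any real system) - a tendency in the data becomes a rule in the output:

    from collections import Counter

    # Invented, skewed "historical" data: 70% of past outcomes favour group A.
    history = ["A"] * 70 + ["B"] * 30

    # The crudest possible model: always predict the majority outcome.
    majority = Counter(history).most_common(1)[0][0]
    predictions = [majority for _ in range(100)]

    print(Counter(history))      # Counter({'A': 70, 'B': 30}) - biased
    print(Counter(predictions))  # Counter({'A': 100})         - more biased

Real models are far more sophisticated, but the dynamic is similar: optimising for accuracy against biased data rewards leaning into the bias.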

For some AI models, there's also the well-known issue of hallucination - OpenAI acknowledges in its list of ChatGPT's limitations that "ChatGPT sometimes writes plausible-sounding but incorrect or nonsensical answers." These answers, or hallucinations, have no basis in the data the AI was trained on, but the chatbot can deliver them with the same certainty as all its other answers, even going so far as to argue for the validity of the hallucination when challenged on it. Determining which answers are accurate and which are hallucinations can be very difficult, especially for non-experts in the field of the question being asked. Which, to be fair, is likely to be the vast majority of users.

So we have computers that are not always right, combined with a strong tendency to be convincing, and that means people are worried about floods of misinformation and about misuse in a wide range of contexts: getting a chatbot to write school essays for you, making excuses about why you're filing your taxes late, or explaining how to break into a house in rap form.

From a research integrity point of view, there have been documented examples of ChatGPT including references in academic answers where the cited works simply do not exist.
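One small mechanical defence: real academic references usually carry a DOI, and DOIs can be checked against a registry. A minimal sketch using the public Crossref API (the first DOI below is the Schwab et al. paper cited later in this blog; the second is one I made up for illustration):

    import requests

    def doi_exists(doi: str) -> bool:
        """Return True if the DOI is registered with Crossref."""
        response = requests.get(f"https://api.crossref.org/works/{doi}")
        return response.status_code == 200

    print(doi_exists("10.1371/journal.pcbi.1010139"))  # True - a real paper
    print(doi_exists("10.1234/entirely.made.up.doi"))  # False - no such record

Of course, this only catches references that don't exist at all - it says nothing about real references attached to claims they don't support.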

All this is enough to have universities, academic publishers and knowledge repositories coming out with restrictions on the use of ChatGPT, and in some cases outright bans.

Where do we go from here? The chatbot is very firmly out of the bag now, and I am sure that the problems it has surfaced are already being worked on, one way or another. But what does that mean for the future of research and, more fundamentally, for the future of human creativity?

I don't know, but in my next post I'm going to explore human creativity, and what it means for us when an AI can easily do things that we find difficult but that are ultimately fundamental to our sense of self as human beings.


Thursday, 16 February 2023

User stories for Research Practice Training - where to draw the boundaries?

In an attempt to figure out what the core aspects of research practice training are, versus the domain-specific ones, I had a think and wrote a series of user stories to try to tease out commonalities. These are not meant to be accurate or complete, but I’m hoping they’ll be a good start for conversation. The key question for each user story is:

What does this researcher need to know to do their research effectively, ethically, transparently and verifiably?

The assumption is that everyone already has their funding sorted.

Researcher in particle physics needs to know:

  • How to access the data they need to use
  • How to manage the data
  • How to visualise and analyse the data (Scientific computing, high performance computing)
  • The background and metadata of the data collection process
  • The current state of the art in their field
  • How to communicate their research results (conferences/publications)
  • Health and safety for working with experimental machinery
  •  …?

Researcher in social sciences working with asylum seekers

  • How to gain ethical approval for their work
  • How to formulate their work so that it causes no harm
  • How to manage and safely store their data, including dealing with the privacy and dignity of their contacts
  • How to keep themselves and their contacts safe (physically and psychologically)
  • How to communicate their research results (conferences/publications)
  • How to influence policy and engage with non-academics as stakeholders
  • How to deal with conscious and unconscious bias
  •  …?

Researcher in AI-driven drug design

  • How to access and understand the databases that feed into the system
  • How to troubleshoot and understand the system outputs
  • Health and safety in the lab
  • Data and code management
  • How to communicate their research results (conferences/publications)
  •  …?

Researcher in ancient history

  • How to access and cite primary sources
  • Archive access and handling of fragile artefacts
  • How to store and manage their data
  • How to communicate their research results (conferences/publications)
  • How to respect the artefact’s cultural background, bearing in mind it might have been taken from another culture during a period of colonialism
  • The context around the artefact, and its past interpretations, bearing in mind historical biases
  •  …?

Researcher in clinical trials

  • Effective clinical good practice
  • How to deal with conscious and unconscious bias
  • Ethical approvals
  • Double blind experimental design
  • Human/animal experiment good practice
  • How to communicate with trial participants/other non-academic stakeholders
  • How to communicate their research results (conferences/publications)
  • …?

Researcher in modern arts

  • How to access and use their resources
  • How to manage and keep records of their observations/practices
  • How to communicate their research results (conferences/publications/exhibitions)
  • Stakeholder engagement
  • Research ethics and integrity
  • …?

Common topics, by stage of research

Beginning
  • Ethical approvals
  • Research integrity
  • Current state of the art in the field, including community standards
  • Safe working practices (physical and psychological health)

Middle
  • Accessing, managing, analysing and using data/artefacts/physical resources
  • Documenting their workflows/processes/practices – research records
  • Stakeholder management
  • Peer review and how to do it

End
  • Communicating research results (stakeholders, policy makers, general public)

Unpicking those topics a bit:

General
  • Research integrity and ethics
  • Data management
  • Stakeholder management
  • Research misuse
  • Peer review
  • Results communication (journals, presentations)

Domain specific
  • Current state of the art
  • Safe working practices
  • Workflow/practice recording

Note that safe working practices are mostly covered under the general health and safety training that everyone should go through when working for an employer, so even though there are some aspects that are very domain specific (radiation training, safeguarding, psychological safety), I haven't really included them in my further thinking about research practice.

When drawing the boundaries around what is research practice (i.e. what we want to train people in to help them do better research) versus what are techniques/tools/practices commonly used by researchers, I tend to think in terms of "is it something that only a researcher would do as part of their work?" It's always going to be a fuzzy boundary, and somewhat artificial, but we need to draw the line of scope somewhere. So that's why I'm not really thinking about health and safety, or project management, as core research practice topics, at this point in time anyway.


Thursday, 2 February 2023

"Good Research Practice" – what does that mean?

I’ve been thinking a lot about Research Practice in the past months, from a variety of perspectives. Of course, a lot has been said and written about it, and I’ve been doing a lot of listening and reading too. But trying to synthesise all I’ve learned over my career with all the new things I’ve learned recently, I find myself in a bit of a muddle.

In these sorts of cases, I’ve found that going back to first principles can be really useful to ground my thinking. So, on that basis, what do we mean when we say something is “good research practice”? “Good” I think we can (hopefully) all agree on (or at least the definition of “good” can be considered out of scope for the moment) – so that leaves the question of what research practice is, because we need to know what it is before we can do it (or is that a bit chicken and egg?).

To Google! A search for “research practice” (in my geographical area at least) returns as its first result the UKRI policy on the governance of good research practice (GRP). This is an interesting policy document in that it clearly lays out the responsibilities of the various parties (funders, institutions, researchers) when it comes to research integrity and research misconduct, but it doesn’t say much about what good research practice actually is.

Now, to be fair, this may very well be because different research domains do research in radically different ways, so it’s easier to list what counts as research misconduct than it is to say what good research practice actually looks like. (The UKRI GRP policy has Appendix 2 devoted to defining what research misconduct is.)

The UKRI Good Research Resource Hub is another interesting site which gives guidance on important research topics including open research, research integrity, equality, diversity and inclusion, human research participants and many, many more. But it still doesn’t give you a recipe or definition of what research practice is.

A bit further down the search results we find Schwab et al. (2022), which does what it says in the title and provides “Ten simple rules for good research practice”. These are then broken up into three sections according to the stage of the research, whether that’s planning, execution, or reporting.

These rules are (Fig 1 of the paper):

Planning     

  • Specify your research question
  • Write and register a study protocol
  • Justify your sample size
  • Write a data management plan
  • Reduce bias

Execution

  • Avoid questionable research practices
  • Be cautious with interpretations of statistical significance
  • Make your research open

Reporting

  • Report all findings
  • Follow reporting guidelines

The authors then discuss these headings in more detail in the text.

So, that’s all sorted then, right?

Well, remember how I mentioned different domains earlier? Yes, it’s not quite as simple as that. The terms used in the Ten Rules above aren’t universal across all scientific research, let alone across all possible research (which includes the humanities and arts as well).

Let’s take a stab at generalising them, shall we?

Planning

  • Decide what hypothesis you want to test/decide what information you want to find
  • Decide your methodology, taking into account domain and community standards
  • Will the data you’re collecting be enough for you to confirm your results with an appropriate degree of certainty? (How about error margins and statistical significance? See the sketch after this list.)
  •  Write your data management plan
  • Are there any sources of bias in your data or your methods that need to be compensated for?
    • Does your research have the potential to cause harm? If so, is it worth it? Can you mitigate the risks of that harm occurring?
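On the “enough data” question above: for quantitative studies, a back-of-envelope power calculation makes it concrete. A minimal sketch using the standard normal-approximation formula for comparing two group means (the effect size, significance level and power below are conventional placeholder values, not recommendations):

    import math
    from scipy.stats import norm

    def sample_size_per_group(effect_size, alpha=0.05, power=0.80):
        """n per group = 2 * (z_(1-alpha/2) + z_power)^2 / effect_size^2"""
        z_alpha = norm.ppf(1 - alpha / 2)  # two-sided significance threshold
        z_power = norm.ppf(power)          # desired statistical power
        return math.ceil(2 * (z_alpha + z_power) ** 2 / effect_size ** 2)

    # A "medium" standardised effect (Cohen's d = 0.5), 5% significance, 80% power:
    print(sample_size_per_group(0.5))  # 63 participants per group

The exact formula varies by study design, which is precisely why this belongs in the planning stage.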

Execution

  • Do your research according to high standards of research integrity
    • Be the best researcher with the highest integrity you can be
  •  Be cautious with your interpretations
    • Are there any other reasons why you might have got the results you did?
  • Make your research as open as possible

Reporting

  • Report all findings, and all the details of the research, the good, the bad and the ugly
    • Try to make your research and component parts (data, code, workflows, etc.) FAIR
  • Follow community standards and practices for reporting
    • If possible, try to make those standards more open

Obviously, these are just preliminary thoughts on a subject with a lot of complexity, but hopefully they’re enough to get the brain cells working on this topic. And as always, figuring out where something isn’t quite right can be really helpful for determining what really works the best, given the circumstances.

~~~

Wellcome Sanger Institute (2021) Good Research Practice Guidelines v4. https://www.sanger.ac.uk/wp-content/uploads/Good-Research-Practice-Guidelines-v4-Feb-2021.pdf

Schwab S, Janiaud P, Dayan M, Amrhein V, Panczak R, Palagi PM, et al. (2022) Ten simple rules for good research practice. PLoS Comput Biol 18(6): e1010139. https://doi.org/10.1371/journal.pcbi.1010139