My data checklist: or, questions I ask every time before a new project

“To consult the statistician after an experiment is finished is often merely to ask him to conduct a post mortem examination. He can perhaps say what the experiment died of.” - Ronald Fisher (the guy from the p-values!)

Time is limited. If you’re a data or a bioinformatics person then someone right now is thinking about wasting yours. Most project ideas are not ready for the light of day - people get excited about something trendy because they read an article somewhere, then they come to you and need you to make it happen. A good half of my professional career was wasted on pointless projects before I learned to say ‘no’ using the questions below. Hopefully they’ll help you.


Question 1: What is the story?

Somebody got excited, perhaps read an article about how technology X - AI, or pangenomics, or single cell RNA, or satellite imaging - will save us all, and then they come to you: can you run technology X on this dataset? You might say yes because yeah, technology X is great fun! So you do the work over three months, you have the results, and you send them back.
Chances are your collaborators don’t know what to do with these results! They thought there would be a story or a product falling out of your results automatically. But the results are just pieces of a story; it’s up to them to arrange the pieces. With data and bioinformatics projects, they won’t have the skills or the background to do that. That’s why they came to you!
So you as the data person suddenly have two jobs: first you generate the results and then you interpret the results. However, you usually don’t have the background for that either. You’re not trained in, say, the peculiarities of migrating humpback whales, you only wrote the code to track them. Time is wasted as you flounder about trying to find a story. Eventually, your collaborators move on as they expected magic but got only tricks. I could tell you how many times I’ve run into this problem on the fingers of my hand but I’d first have to dip that hand into radioactive goop and grow like 40 fingers.

Question 2: Who is the customer?

Well you have your story now, but who will care? Or you have your product, but who will buy it? In academia there’s your audience and the target journals to consider. Is the proposed project about a niche phenomenon in an orphan crop? Or are you going to cure all the cancers? You only get a certain number of shots in your career so you have to make them count.
In the business world I’ve now seen countless AI-based startups, usually wrappers around ChatGPT, that have a very loosely defined customer base, or none at all. Many startups focus on summarising scientific research, for example. I can’t think of any academic who will pay for that. Those guys don’t even pay for AWS! Or think of all these companies building flavors of vector databases. Who will pay for that? Or did they just get excited about a technical aspect and then build the thing, hoping that ‘if you build it, they will come’? Who are ‘they’? I’ve yet to see those mythical beings; I never received Roddy Piper’s cool sunnies.

Question 3: What do I get from doing this work for you?

Once this is all done and we’re ready to tell the world, what will my or my team’s role be? Will we be relegated to the shadows? Will we have a chance to lead this or a sub-project? People are evaluated on the leadership they show, so will I have a chance to show that leadership? Or will I be pushed to the sidelines, only one assistant amongst many, author number 24 in a list of 89?
Why would I do this work if it does not benefit my or my colleagues’ careers? We could be working on more beneficial projects, so why this project? In academic work that means defining authorship order early on; in the normal world that can mean defining company-wide visibility early on.

Question 4: What does the whispernet say about you?

If I don’t know the person approaching me, I will go and ask colleagues who might know the person. Horrible people are rarely horrible to your face; they only become horrible later (or worse, sabotage your career long after you’ve stopped working together, without you even knowing - I’ve seen some horrible letters of recommendation clearly written just for secret sabotage). The whispernet, the informal knowledge about people, is huge. A bad reputation travels fast and wide. Use that whispernet to your advantage. For your health and your team’s health, the upsides of working with a horrible person never outweigh the drawbacks.

EDIT Question 5: When are we done?

(Thanks to Thomas Sandmann for this one!) I’ve been involved in countless projects that had no real ‘ending’. I delivered the results, a few weeks later an email arrives saying, ‘hey could you quickly please run this small analysis? it would help greatly!’ And then you do that, and a few weeks later you get another similar email, and you deliver, and another request, and this repeats forever. Nothing actually gets out, you’re just fiddling with results. But it feels nice to be needed!
I now ask what the measurable end goals are: which results exactly do you need? Anything after that is a second project/collaboration that needs to be re-negotiated. As Paul Valéry once wrote, a work is never truly completed, only abandoned. I wish people would take that advice.


Having no answer to any of these questions is a huge red flag; missing one or two answers might not be too bad. Sometimes you run an analysis and a great story just falls out automatically (many such cases!). Sometimes a customer appears, or your company is acquired before you need to find customers. Perhaps the bad reputation comes not from bad behaviour, but from someone else inventing rumors. Sometimes you have some downtime and would just like to learn the technology involved. Sometimes you just want to get your foot in the door to start a collaboration, hoping that there will be another project after this one that is more impactful. Sometimes you just believe in the idea.

Working with people is not easy. We pretend to be logical in our decisions but it’s usually our gut feeling making the final call, and that feeling learns from experience. But hopefully you can skip the worst parts of the experience if you use the above questions!