Another Boundary Breached – Deepfaking the Human DNA

Updated: Apr 22

A new scientific discovery could be the solution to the ethical and privacy issues surrounding DNA research, but a host of new concerns may be just around the corner.

An illustration of DNA strands
Credit: madartzgraphics on Pixabay

Because our genes define us so powerfully, working with genetic, or DNA, data is a privacy and ethical minefield for those who handle it. Researchers often have limited access to genetic databases, which restricts both the sample sizes they can study and the ways they can use the data.


A team of researchers from Estonia, led by Burak Yelmen, a geneticist at the University of Tartu, may have found a way to work around this constraint.


Using generative adversarial networks (GANs), the team was able to generate artificial DNA that can stand in for actual human DNA. GANs are the same technology used to create deepfake imagery. This deep learning approach pits two neural networks against each other: one, the generator, creates candidate data (in this case, lines of genetic code), while the other, the discriminator, tries to tell the generated data apart from real samples. This cycle is repeated, with each network learning from the other, until the generated data becomes hard to distinguish from the real thing. According to Luca Pagani, one of the geneticists in Yelmen’s team, the resulting genomes “are not distinguishable from other genomes from the biobank we used to train our algorithm, except for one detail: they do not belong to any gene donor.” Because synthetic genomes do not belong to any real person, the privacy concerns are largely sidestepped, and access to this data can be expanded with fewer ethical constraints, providing researchers with a wealth of samples to work with.
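To make the adversarial cycle concrete, here is a toy sketch in plain Python. It is not the researchers' model, whose architecture this article does not describe. It assumes, purely for illustration, that a "genome" is a short list of 0/1 alleles, that the generator and discriminator are tiny single-layer models, and that all function names (`sample_real`, `generate`, `discriminate`, `train`) and the allele frequencies are hypothetical.

```python
import math
import random


def sigmoid(t):
    t = max(-30.0, min(30.0, t))  # clamp so exp() cannot overflow
    return 1.0 / (1.0 + math.exp(-t))


def sample_real(freqs, rng):
    """Draw one 'real' sample: a 0/1 allele at each site (toy assumption)."""
    return [1.0 if rng.random() < p else 0.0 for p in freqs]


def generate(b, v, z):
    """Generator: map scalar noise z to one value in (0, 1) per site."""
    return [sigmoid(bi + vi * z) for bi, vi in zip(b, v)]


def discriminate(w, c, x):
    """Discriminator: estimated probability that x is a real sample."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + c)


def train(freqs, steps=2000, lr=0.05, seed=0):
    """Alternate one discriminator update and one generator update per step."""
    rng = random.Random(seed)
    n = len(freqs)
    b = [0.0] * n                               # generator biases
    v = [rng.gauss(0, 0.1) for _ in range(n)]   # generator noise weights
    w = [0.0] * n                               # discriminator weights
    c = 0.0                                     # discriminator bias
    for _ in range(steps):
        # Discriminator step: raise log D(real) + log(1 - D(fake)).
        real = sample_real(freqs, rng)
        fake = generate(b, v, rng.gauss(0, 1))
        d_real = discriminate(w, c, real)
        d_fake = discriminate(w, c, fake)
        for i in range(n):
            w[i] += lr * ((1 - d_real) * real[i] - d_fake * fake[i])
        c += lr * ((1 - d_real) - d_fake)

        # Generator step: raise log D(fake), i.e. try to fool the discriminator.
        z = rng.gauss(0, 1)
        fake = generate(b, v, z)
        d_fake = discriminate(w, c, fake)
        for i in range(n):
            grad = (1 - d_fake) * w[i] * fake[i] * (1 - fake[i])
            b[i] += lr * grad
            v[i] += lr * grad * z
    return b, v, w, c


if __name__ == "__main__":
    freqs = [0.9, 0.1, 0.5, 0.7]  # hypothetical allele frequencies
    b, v, w, c = train(freqs)
    noise = random.Random(1).gauss(0, 1)
    synthetic = [round(x) for x in generate(b, v, noise)]
    print(synthetic)  # one synthetic 0/1 sequence that belongs to no donor
```

A model this small can at best learn rough per-site frequencies, and GAN training in general is not guaranteed to converge. The point is only the structure of the loop: the discriminator is rewarded for telling real from generated samples, and the generator is rewarded for fooling it.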


Some of the possible applications of synthetic genomes include the creation of stem cells that are resistant to viruses and cancer, and the modification of animal genomes to make them more suitable for human transplant.


However, because so much about DNA is still unknown, there may be some shortcomings in fully recreating the genome. Artificially generated genomes might not preserve some “functional motifs and domains” in human genomes. Fake DNA also differed from real DNA in the way it was assembled, appearing in short chunks more often than actual human samples do.


Aside from these limitations, there are also worries that the ability to fake human genes could be misused. As with the concerns already raised about other types of deepfakes, scammers could use fake DNA to deceive and defraud.


"In the near term, it's going to get easier for bad actors to create fake personas that can stand up to even the most rigorous inspection. Not that we envision a scenario where a scam artist needs to provide a fake transcript of their genome, but the unknown unknowns are where security holes tend to grow the fastest," writes Tristan Greene in The Next Web.


As exciting and promising as this development is, some degree of wariness and care needs to be exercised. When dealing with something as fundamental to our physicality as DNA, ethical questions are bound to be raised.


Main Source:

Carey, Teresa. “This DNA Is Not Real: Why Scientists Are Deepfaking the Human Genome”, Freethink, https://www.freethink.com/articles/artificial-genomes.


Other Related Sources:

Greene, Tristan. “This Human Genome Does Not Exist: Researchers Taught an AI to Generate Fake DNA”, Neural | The Next Web, https://thenextweb.com/neural/2021/02/08/this-human-genome-does-not-exist-researchers-taught-an-ai-to-generate-fake-dna/.

Schultz, Isaac. “Artificial Human Genomes Could Help Overcome Research Privacy Concerns”, Gizmodo, https://gizmodo.com/artificial-human-genomes-could-help-overcome-research-p-1846225366.