DNAddy

fossilesque@mander.xyz · 3 days ago


ptu@sopuli.xyz · 3 days ago

Interesting, could you enlighten me on what types of data are in those 100 columns? I’m aware of ATGC and thought it would be just one column, but maybe the rest indicate intensity or activity, or which sequence they are part of.

rockSlayer@lemmy.blahaj.zone · 3 days ago

Well, it varies depending on what the file is meant for. Usually there are columns like chromosome, variant position, reference nucleotide, observed nucleotide, type of variation, codon sequence, gene name, etc.

There are also columns that result from various analyses. In the file I’ve been working on lately, there are columns such as variant impact, level of confidence, pathogenicity, clinical significance, etc.
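To make that layout a bit more concrete, here is a rough Python sketch of reading such a table. The column names and the two rows are made up purely for illustration and don't come from any real file or tool; an actual annotation file would have far more columns and its own naming.

```python
import csv
import io

# Hypothetical tab-separated layout: column names and values are invented
# for illustration only and do not match any specific tool's output.
SAMPLE = (
    "chrom\tpos\tref\talt\tvariant_type\tgene\timpact\tclinical_significance\n"
    "1\t12345\tA\tG\tmissense\tGENE1\tmoderate\tuncertain\n"
    "2\t67890\tC\tT\tstop_gained\tGENE2\thigh\tpathogenic\n"
)

def read_variants(text):
    """Parse tab-separated variant rows into a list of dicts keyed by column name."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    rows = []
    for row in reader:
        row["pos"] = int(row["pos"])  # position is numeric; other fields stay strings
        rows.append(row)
    return rows

variants = read_variants(SAMPLE)

# Example query: genes that carry a variant flagged as high impact.
high_impact = [v["gene"] for v in variants if v["impact"] == "high"]
print(high_impact)
```

Each row is one variant, and the analysis columns (impact, significance and so on) just sit alongside the basic position and nucleotide columns, which is how the column count climbs so quickly.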

The_v@lemmy.world · 3 days ago

That sounds like a marker file. It’s a bit different than a sequence file.

Molecular markers are linked to specific sequences in the DNA. These markers are generally close to or inside the gene of interest. All the extra columns describe each marker’s characteristics and results. Any place in the entire genome where there is a one-nucleotide difference (a polymorphism) can be another marker. There are millions of these, and they add up to massive files.
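As a toy illustration of the "one nucleotide difference" idea, this Python sketch compares two sequences of equal length and lists the positions where they differ. The function name and the sequences are made up, and it assumes the sequences are already aligned; real variant calling is far more involved than this.

```python
def snp_positions(reference, sample):
    """Return (1-based position, reference base, observed base) for each mismatch.

    Assumes both sequences are already aligned and of equal length.
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must be the same length (already aligned)")
    return [
        (i, r, s)
        for i, (r, s) in enumerate(zip(reference, sample), start=1)
        if r != s
    ]

# Made-up sequences that differ at positions 4 and 8.
print(snp_positions("ATGCATGC", "ATGAATGT"))
# prints [(4, 'C', 'A'), (8, 'C', 'T')]
```

Each tuple it returns maps onto the position, reference nucleotide and observed nucleotide columns mentioned earlier in the thread.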

A sequence file is basically just a long, boring sequence of nucleotides and is not that large. The files you use to generate the sequence are another story. Let’s just say they had to wait almost 20 years for computers to get fast enough to process those files in a reasonable time. Those make the marker files look like child’s play.

rockSlayer@lemmy.blahaj.zone · 3 days ago

I’m not familiar with the name of the file I’m currently working with tbh. It’s used to create the annotation files for regenie analyses. It has every variant for every gene within the biobank. There’s far more than just missense; there are stop/start gain/loss, splice donor/acceptor, frameshifts, and PTVs. It contains PrimateAI scores, SpliceAI scores, CAVA data, ClinVar data, and more.

ptu@sopuli.xyz · 3 days ago

Sweet, thanks for the reply. I didn’t expect to fully understand what they would contain but I got the idea.

There’s a Japanese artist, Ryoji Ikeda, who you might like; he has visualised DNA and all sorts of data. I find the style of his data.gram exhibition the most aesthetically amusing, and he has published some albums too.

https://www.taronasugallery.com/en/exhibitions/ryoji-ikeda「data-gram」/