Accelerating life sciences research

46 points · 8 comments · 13 hours ago
icameron

So it's the same concept as an LLM training on and inferring tokenized language, except it's doing tokenized amino acids. Instead of artificial intelligence/language it's doing artificial evolution/life, I guess?
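The analogy can be made concrete with a toy sketch: treat each amino acid's one-letter code as a token, just as an LLM maps text to token IDs. This is purely illustrative; the vocabulary and IDs below are invented and are not OpenAI's actual tokenizer.

```python
# Illustrative sketch only: a character-level tokenizer over the 20
# standard amino acids, analogous to how an LLM tokenizes text.
# Vocabulary ordering and IDs are made up for this example.

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # standard one-letter codes
VOCAB = {aa: i for i, aa in enumerate(AMINO_ACIDS)}

def tokenize(seq: str) -> list[int]:
    """Map a protein sequence to a list of token IDs."""
    return [VOCAB[aa] for aa in seq.upper()]

def detokenize(ids: list[int]) -> str:
    """Map token IDs back to the one-letter sequence."""
    return "".join(AMINO_ACIDS[i] for i in ids)

ids = tokenize("MKTAY")
print(ids)              # one integer ID per residue
print(detokenize(ids))  # round-trips back to "MKTAY"
```

A real protein language model would predict the next residue ID from the preceding ones, exactly as a text LLM predicts the next word-piece.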

Legend2440

If I’m understanding this right:

1. They have a protein model similar to AlphaFold

2. A biotech startup used this model to engineer a protein that converts adult cells into stem cells, at a higher efficiency than existing techniques. (But still only a tiny fraction of cells convert)

Application to life extension seems speculative.

biophysboy

> We initialized it from a scaled-down version of GPT‑4o to take advantage of GPT models’ existing knowledge, then further trained it on a dataset composed mostly of protein sequences, along with biological text and tokenized 3D structure data, elements most protein language models omit.

> A large portion of the data was enriched to contain additional contextual information about the proteins in the form of textual descriptions, co-evolutionary homologous sequences, and groups of proteins that are known to interact.

These bits made me wonder what would have happened if they had used only the supplementary biological data with an untrained LLM.
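For intuition about what "enriched" training data might look like, here is a hypothetical sketch of one example interleaving the modalities the quotes describe (textual description, homologous sequences, structure tokens, and the sequence itself). The tag names, layout, and structure-token format are invented for illustration; the actual data format is not public.

```python
# Hypothetical sketch: composing one multimodal training example from
# the data types described in the post. All tags and formats here are
# assumptions, not OpenAI's real training format.

def build_example(seq: str, description: str,
                  homologs: list[str], structure_tokens: list[str]) -> str:
    """Interleave contextual annotations with the target sequence."""
    parts = [
        f"<desc>{description}</desc>",
        *(f"<homolog>{h}</homolog>" for h in homologs),
        f"<struct>{' '.join(structure_tokens)}</struct>",
        f"<seq>{seq}</seq>",
    ]
    return "\n".join(parts)

example = build_example(
    seq="MKTAYIAK",
    description="putative DNA-binding protein",      # invented annotation
    homologs=["MKSAYIAK", "MKTAFIAK"],               # invented homologs
    structure_tokens=["s12", "s7", "s40"],           # invented structure codes
)
print(example)
```

The commenter's question amounts to asking how much of the model's performance comes from the annotation channels versus the GPT initialization, which an ablation with a randomly initialized model could test.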
