We're hiring! Come join the team and scale new heights with us!
arcinstitute.org/jobs
scBaseCamp is released as part of the Arc Virtual Cell Atlas!
Great work by Nick Youngblut, Chris Carpenter, Alex Dobin, Dave Burke, @genophoria.bsky.social and team
Announcement: arcinstitute.org/news/news/ar...
Data access: github.com/ArcInstitute...
Report: arcinstitute.org/manuscripts/...
Uniform processing lowers technical variation between scBaseCamp datasets.
Technical factors such as library chemistry and suspension type (single-cell vs. single-nucleus) exhibited comparable or lower silhouette scores than biologically meaningful categories like tissue type.
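For readers unfamiliar with the metric, here is a minimal sketch of how a silhouette score can compare a technical grouping against a biological one. It is not the scBaseCamp analysis code: the 2-D points, the `tissue` and `chemistry` labels, and the helper name `silhouette_score` are all hypothetical, chosen only to illustrate the idea that a lower score means the grouping explains less of the structure in the data.

```python
from math import dist
from statistics import mean


def silhouette_score(points, labels):
    """Mean silhouette coefficient over all points (Euclidean distance)."""
    # Group point indices by label.
    clusters = {}
    for i, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(i)
    if len(clusters) < 2:
        raise ValueError("need at least two distinct labels")

    scores = []
    for i, lab in enumerate(labels):
        own = [j for j in clusters[lab] if j != i]
        if not own:
            scores.append(0.0)  # singleton cluster: s = 0 by convention
            continue
        # a: mean distance to the other members of the point's own group.
        a = mean(dist(points[i], points[j]) for j in own)
        # b: mean distance to the nearest other group.
        b = min(
            mean(dist(points[i], points[j]) for j in members)
            for other, members in clusters.items()
            if other != lab
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return mean(scores)


# Hypothetical toy embedding: four cells that separate by tissue, not chemistry.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
tissue = ["liver", "liver", "brain", "brain"]   # assumed biological label
chemistry = ["v2", "v3", "v2", "v3"]            # assumed technical label

tissue_score = silhouette_score(points, tissue)
chemistry_score = silhouette_score(points, chemistry)
print(f"tissue: {tissue_score:.3f}, chemistry: {chemistry_score:.3f}")
```

In this toy case the tissue labels score close to 1 and the chemistry labels score below 0, which is the pattern one would hope to see if uniform processing keeps technical factors from driving the clustering.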
scBaseCamp is the first large biological data repository curated by an AI agent
We built a hierarchical agentic workflow (SRAgent) to automate discovery, metadata extraction & data processing
It is consistent, easily scalable and automatically updates when new data is available
scBaseCamp was built by directly mining all publicly accessible 10x Genomics scRNA-seq data from the Sequence Read Archive (SRA).
With over 230M cells drawn from 21 species and 72 tissues, scBaseCamp is significantly larger and more diverse than existing single-cell data repositories.
At the @arcinstitute.org we are building AI models of cell state from the ground up, rethinking every step, from data generation to biologically relevant evaluation
Today we launch scBaseCamp, the largest public repository of single-cell RNA-seq data, uniformly processed from raw sequencing reads.
Why do you want to switch?
Very easy to do this in PyCharm.