Summer 2024 YSP Projects and Application Information

BMSIS offers college students and recent graduates the opportunity to join our institute as Research Associates, where they participate in basic research and learn about science communication, ethics, policy, and more. Our program is conducted entirely online, so there is no need for travel, and interns can take part from any nation on the globe.

YSP Research Associates (RAs) conduct research under the direct supervision of one or more BMSIS scientists and colleagues. The RA may work on-site or remotely, depending on the needs of the project, mentor, and RA. Funding is available for some projects but not all (see the list below). Research Associate positions nominally last three months, though some may last longer, especially those that are funded.

BMSIS Research Associates will write a report of their work for the project. This report may be used in a variety of applications, including (but not limited to): undergraduate project/thesis, conference proceedings, peer-reviewed journals, magazine/newspaper articles, and writing samples for job applications. RAs will be expected to present the results of their work either internally (to an audience of BMSIS scientists and affiliates using virtual communication tools) or externally (to an audience at an academic conference, convention, or other meeting venue).

The Young Scientist Program includes required modules in science communication as well as ethics and society with guidance from project mentors and other research scientists at BMSIS. RAs also will attend monthly BMSIS seminars and will have opportunities to participate in a variety of seminars and meetings held by professional researchers, science communicators, and more.

Upon successful completion of the Young Scientist Program and required modules, Research Associates shall receive a certificate of completion. Alumni from the Young Scientist Program may also receive requests for follow-up program evaluation.

Applications for the Young Scientist Program will be accepted from 1 March through 15 April with limited available positions, so interested applicants are encouraged to apply or contact us for more information.

Eligibility Requirements

Currently seeking a degree at a 2-year, 4-year, or 5-year university or a community college (or the equivalent), or have recently completed an undergraduate degree and are currently considering graduate school.

Please Note: we do not accept graduate students. Those who have completed credits towards Master's or Doctoral degrees are not eligible for the program. Graduate students are encouraged to apply instead to our Visiting Scholars Program.

For further questions on eligibility, please see the Frequently Asked Questions (FAQ) document.

Able to dedicate at least 5 hours per week for the duration of the program (time requirements may depend on the project)

Provide proof of eligibility to work in the country of the Young Scientist Program (note: this only applies to projects where the RA is working on-site. Applicants for the online program need only be capable of working within their country of residence)

May not be a current U.S. government employee or a civil servant

Also note: BMSIS cannot sponsor travel or work visas to the United States

For further inquiries, please see our FAQ document. The FAQ document will be updated as needed during the application window.

Important Dates for the 2024 YSP

• 1 March 2024 – Applications will be open by 08:00 Pacific Time on this date

• 15 April 2024 – Applications close (applications will be accepted until 20:00 Pacific Daylight Time on the 15th)

• 6 May 2024 – Decisions communicated to applicants beginning on this date (due to the large number of applications we receive, some notifications may take longer)

• 3 June 2024 – YSP Begins

• 30 August 2024 – YSP Ends

Application Requirements

Contact one or more BMSIS scientists to express your specific interest in the listed projects (see list below) by sending inquiries to the scientists at the email addresses listed in the table below. Please include a thoughtful message of introduction, but also be respectful of their time. The FAQ document offers guidelines on how to write professional and polite messages.

Satisfy any eligibility requirements specified by the BMSIS YSP and the “Required Skills” section of the project to be considered.

Complete the online application form for the project(s). If you have questions about the application form, please read the FAQ document. (Note: The application form is no longer available. Please check back in 2025)

Have two letters of recommendation sent to For more information about the letters of recommendation, please read the FAQ document.

There is a $20 USD fee for applying to the program.

Projects Available for the 2024 BMSIS YSP:

Project Mentor(s) | Project Title | Description | Required Skills | Skills the Interns will Acquire
Project Mentor(s): Afshin Khan, Steve Thorne (Copernican Bioscience Inc.), & Afshin Beheshti (BMSIS)
Project Title: Search for Circalunar Periodicity in Living Organisms
Description: It is well known that all living organisms express circadian rhythms, but less well known is that some species exhibit circalunar (~28 day) periodicity. For example, biologists have observed 28-day periodicity in gene expression within brain cells of an insect called the midge, and a tendency for metabolic processes in some species of coral and algae to cycle with lunar periodicity. While existing models of metabolic periodicity are built around light cycles, our group is searching for an additional dynamic coupling to metabolic processes driven by Earth’s spin and orbital motion, which includes a 28-day wobble around the Earth-Moon barycenter. This project includes research tasks on two fronts. A portion of the students’ time will be directed towards compiling a more extensive list of organisms which exhibit circalunar periodicity, and a portion will involve the statistical analysis of metabolic data we have recently obtained containing two decades of study on fish fertility in search of periodicity.
Required Skills: The ability to read and understand scientific peer-reviewed research is required for the first task, together with good organizational skills. Computer graphic skills are a plus. The second task requires strong analytical skills and excellence in statistical analysis. Applicants do not need to have both skills to assist in this project.
Skills the Interns will Acquire: Students will acquire skills in building and searching databases and in converting data into the graphical and technical formats expected when publishing scientific papers. At the conclusion of this project, if data is found worthy of publication, the student will be part of the publication.

Project Mentor(s): Afshin Khan
Project Title: Exploring dysregulation in immunometabolic pathways during spaceflight
Description: The relationship between the functional decline of specific organ systems and the integrative physiology during spaceflight is an area of current investigation. This project aims to explore publicly available astronaut health data, with the help of computational tools, to analyze epigenetic and immunometabolic reprogramming that could regulate cellular responses linked to mitochondrial dysregulation.
Required Skills: Comprehensive understanding of molecular biology and biochemistry. Some experience with computational biology tools, bioinformatics, or online gaming. Prior knowledge of R would be a plus.
Skills the Interns will Acquire: Scientific literature review and communication. Computational biology analyses.

Project Mentor(s): Aubrey Zerkle
Project Title: Communicating Science via TikTok
Description: This YSP 2024 project seeks a scicomm-focused candidate interested in building a TikTok account for the BMS initiative Sciworthy. The candidate will receive training in science writing following the Sciworthy house style and write at least one article on their topic of choice for publication on the website. Building from this knowledge base, the candidate will produce a series of TikTok videos for their article and other Sciworthy articles, experimenting with different styles and formats. Their ultimate goal will be to work with the Editor-in-Chief to determine how best to translate the initiative’s focus on objectivity and methodology into an engaging TikTok brand. Let’s be creative!
Required Skills: Strong motivation to communicate science, willingness to apply skills across scientific disciplines, experience with the TikTok platform, creativity
Skills the Interns will Acquire: Science writing and communication, video production, social media marketing

Project Mentor(s): Cassandra M. Juran
Project Title: Design of three-dimensional microphysiologic systems for organoid culture
Description: I am looking to design complex cell and organoid culture environments for automated culture on Gateway or ISS. This project will require CAD knowledge (we will use open source FreeCAD) and some fluidics modeling experience.
Required Skills: CAD design, FEM modeling, and fluidic modeling
Skills the Interns will Acquire: Tissue chip and microfluidic organoid culture environment design and an understanding of mechanically active, physiologically representative cell culture.

Project Mentor(s): Ekaterina Khramtsova & Alexander Kaurov
Description: Artificial Intelligence for Science, administered by Blue Marble Space, is a research-focused nonprofit project dedicated to understanding how young minds engage with the contemporary digital information landscape. By harnessing the power of AI (computer vision and large language models), we conduct empirical studies on a range of children's content, including STEM educational materials, with the objective of deducing what constitutes effective science communication. As part of the YSP, the project will focus on analyzing how a specific subject (e.g., astrobiology) is presented in a certain language or community that we can choose to be interesting for both the student and our project.
Required Skills: Have an interest in science communication and education; introductory statistics, strong analytical skills, familiarity with Python or high motivation to learn programming and data wrangling. Familiarity with some aspects of applied AI is a big plus.
Skills the Interns will Acquire: Read, understand, and evaluate scientific literature describing media studies; develop a hypothesis, study design, and perform data analysis.

Project Mentor(s): Hari Mogosanu, Sam Leske, Milky-Way.Kiwi, & NZAN
Project Title: Cosmic Communication Intern
Description: The Cosmic Communication Intern will contribute to sharing astrobiology in New Zealand. They will support website updates and research to ensure content accuracy and relevance, aiming to inspire and educate a diverse audience about the relevance of astrobiology to New Zealand. They will assist in organising and promoting space-related educational events and workshops, enhancing public understanding and appreciation of astronomy. They will support compiling the latest news and events in astrobiology in New Zealand and promote them to our audience. They would also help upload the relevant astrobiology places in New Zealand on a map we wish to create.
Required Skills: Ideally a student with a science background who would also like to learn science communication on the job. Ideal qualifications would be any of the sciences underpinning astrobiology.
Strong Writing and Communication: Ability to produce clear, engaging content on complex topics.
Research Skills: Competence in conducting thorough research to ensure accuracy and depth in content.
Social Media Savvy: Knowledge of how to effectively use social media platforms for promotion and engagement.
Organizational Abilities: Skills in planning and coordinating events, managing schedules, and meeting deadlines.
Technical Proficiency: Basic understanding of website management and multimedia production tools.
Interest in Astrobiology and Astronomy: Passion and curiosity about space science and its relevance to New Zealand.
Skills the Interns will Acquire: Interns will acquire valuable skills and experiences, such as effective science communication techniques, in-depth knowledge of astrobiology and its significance in New Zealand, event planning and management skills, and proficiency in digital content creation and promotion. They'll also gain experience in research methodologies and the application of social media strategies for educational outreach. This internship offers a unique opportunity to contribute to a growing field, enhancing both personal and professional development in science communication.

Project Mentor(s): Jacob Haqq-Misra
Project Title: Terrestrial Analogs for Sovereignty on Mars
Description: Several government and commercial space agencies are already starting to develop human missions to Mars, including plans for long-term settlement. This project will draw upon historical and contemporary examples of shared sovereignty or sovereignty conflicts on Earth as a basis for exploring possible models or limitations to future governance on Mars. This work will include individual and group-based components, with each group member focusing on a different case study.
Required Skills: Good writing skills, strong reading comprehension, online and library research abilities, self-directed learner.
Skills the Interns will Acquire: Candidates will learn to synthesize scholarly sources and archival data for addressing transdisciplinary problems.

Project Mentor(s): Jim Cleaves, Anirudh Prabhu (Carnegie Institution of Washington), & Huan Chen (National High Magnetic Field Laboratory)
Project Title: Classification of Carbonaceous Meteorites Using Machine Learning
Description: Carbonaceous chondrites (CCs) are organic-rich meteorites which preserve evidence of early Solar System abiotic organic chemistry that might have helped kick-start life. There are thousands of collected exemplars of these meteorites, and they are usually classified according to their mineralogical properties, which attempt to estimate the degree to which they experienced aqueous alteration and high temperatures. CC organics can likely also serve as indicators of these processes. We will attempt to correlate Fourier Transform Ion Cyclotron Resonance Mass Spectrometry data of bulk extracted CC organics with the mineralogical classification system, and attempt to describe the organic contents of novel CCs.
Required Skills: Python and C++ programming skills
Skills the Interns will Acquire: Knowledge of early Solar System organic chemistry, familiarity with high resolution mass spectrometry data manipulation and analysis, learning how to use machine learning and other informatic classification tools to make sense of complex data

Project Mentor(s): Kirt Robinson
Project Title: Building a comprehensive kinetic model for amino acid reaction networks across geochemical parameter space
Description: Amino acids are high priority target analytes in astrobiology for multiple reasons. Their presence and relative abundances could be interpreted as biosignatures. They can be used as sources of energy and nutrients to sustain microorganisms. And they have been theorized as important ingredients for prebiotic chemistry leading to the emergence of life.

However, despite hundreds of experimental studies investigating the abiotic reactions of amino acids, we are still poorly equipped to make quantitative predictions of rates for amino acid decomposition versus amino acid polymerization to more complex peptides across extraterrestrial environments. This lack of predictive capability is in part due to the isolated nature of such studies, which are generally aimed at simulating a particular environment rather than building kinetic models that can be extrapolated across geochemical parameter space (and therefore be applied to a variety of extraterrestrial environments).

In this project, we will compile rate data from the most rigorous amino acid experiments and develop geochemical parameter-dependent kinetics that can be extrapolated to different environments by comparing experimental conditions systematically among otherwise isolated studies. We will apply these newly derived kinetics to potentially habitable extraterrestrial environments to the extent possible to predict whether amino acids persist as food sources, assess background levels of abiotic amino acids within the context of interpreting potential biosignatures, and characterize which environments most favor the formation of peptides from amino acid polymerization. This analysis will also help us to identify whether crucial gaps exist in the experimental data, and if so, we will prescribe the conditions at which experiments should be conducted so that robust and environmentally transferable kinetic models can be produced. These prescriptions will feed directly into the planning of experimentalists using the high-throughput, automated chemical reactor at Arizona State University known as the GeoChemPuter, who seek to explore the generation of organic complexity across geochemical parameter space.
Required Skills: Some familiarity with chemical kinetics, some familiarity with spreadsheet software (e.g., Excel), some experience coding (or strong desire to learn coding), ability to read scientific literature (or strong desire to learn how to read scientific literature)
Skills the Interns will Acquire: Understanding how to apply chemical kinetics to natural systems and the implications for biosignature detection and perhaps even prebiotic chemistry. Learning kinetic modeling software and building your own from the bottom up!

Project Mentor(s): Lauren Seyler
Project Title: Characterizing microbial communities in a serpentinite-hosted aquifer in the Coast Range Ophiolite
Description: The Coast Range Ophiolite is a 155 million- to 170 million-year-old section of ocean crust that was tectonically uplifted and emplaced in Northern California. Trapped Cretaceous seawater has reacted with the ultramafic rock, transforming it into serpentinite and producing hydrogen gas, small organic compounds including methane and formate, and a highly alkaline environment. Similar reactions are likely occurring or have occurred in the subsurface of Mars and beneath the oceans of the icy moons of Jupiter and Saturn. Our lab is seeking a student to assist in analyzing metagenomic and metabolomic data acquired from microbial communities living in groundwater deep in the ophiolite, to characterize microbial metabolic pathways in this challenging environment.
Required Skills: The candidate should have a background in biology, including an understanding of genetics. Coursework or experience in microbiology is helpful but not essential. The candidate should be comfortable using Microsoft Excel. Familiarity with R is also helpful but not required.
Skills the Interns will Acquire: The successful candidate will cultivate skills in microbial ecology, biogeochemistry, data analysis, bioinformatics, scientific communication, and programming in R.

Project Mentor(s): Lev Horodyskyj & Tara Lennon (Arizona State University)
Project Title: Greenworks: Global Tropical Stingless Bee Network
Description: In this project, the student will be helping develop the Greenworks global environmental stewardship network, focusing specifically on the Beeworks project being developed in Brazil. Work may include networking with community groups working on native stingless bee education and research throughout South America and development of online exchange platforms.
Required Skills: React (for platform development), Portuguese and/or Spanish (for networking)
Skills the Interns will Acquire: Teamwork, developing community relationships

Project Mentor(s): Lev Horodyskyj, Jonathan Oribello (Science Voices), Alex Gazdac (Science Voices)
Project Title: Agavi: Off-Grid Digital Science Education
Description: In this project, the student will be working with the Agavi platform we are developing for experiential smartphone science learning in off-grid regions. Work may include helping develop new features for the platform, analytical techniques for parsing and gaining insights from platform data about student learning, or helping with off-grid deployments in Brazil and other locations.
Required Skills: React (for development work), fluency in local language (for deployment work)
Skills the Interns will Acquire: Teamwork, big data analysis, developing community relationships

Project Mentor(s): Lev Horodyskyj
Project Title: Sustainable States: Role-Playing Games for Environmental Education
Description: In this project, the student will be working with the Sustainable States role-playing game that we use for environmental game-based learning in political science and Earth science classes in the US and Brazil. Work may include assistance with development and play-testing and/or helping develop research surveys for and analyzing play session data from a July deployment.
Required Skills: Unity (for development), spreadsheets and Google suite (for research data)
Skills the Interns will Acquire: Teamwork, survey research methodology, data analysis techniques, scientific writing

Project Mentor(s): Mark Claire & Aubrey Zerkle
Project Title: Global Multiple Sulfur Isotope Database
Description: Multiple sulfur isotope (MSI) measurements are the best proxy for Earth's evolving atmospheric chemistry, and record particularly fascinating information about the first half of Earth's history, before microbes oxygenated the atmosphere and oceans.

I've made a database of every MSI measurement from the sedimentary rock record. The project involves reading the geological literature to identify the sedimentary environment the samples were deposited in. My hypothesis is that MSI signals are different in deep marine, shallow marine, and lagoonal/sabkha environments, but to test this we need the information!

I already have sedimentary information for about 60% of the samples, but would love your help in finishing this up.
Required Skills: This project is best suited for a geology major who has had a course in sedimentology. Being at a university that has decent access to scientific journals would also help.
Skills the Interns will Acquire: You will become familiar with how to quickly scan scientific papers for specific information and become adept at chasing down references! I will teach you whatever you want to know about mass-independent sulfur isotope fractionation, as well as working with Python and GitHub.

Project Mentor(s): Rafael Loureiro, Luke Concollato, Sam Humprey, & Chad Vanden Bosch
Project Title: The Space Agriculture Laboratory Analysis Database (SALAD)
Description: The Space Agriculture Laboratory Analysis Database (SALAD) Project is looking for research assistants to help search the scientific literature for all published and unpublished work related to plant research for space applications. Assistants who join this project will have an opportunity to choose a certain subset of “plants in space” research to specialize in, and contribute summaries of these papers to the database we are building. SALAD will be a free, searchable online database that researchers and space entrepreneurs can use to learn the state of knowledge on space agriculture and to inform experiments and technology development.
Required Skills: Fluent in the English language; strong reading comprehension for technical papers on plant biology; coding experience (Python). *A pre-acceptance assessment will be conducted with each finalist on their skill levels in each one of the categories listed above. The SALAD team reserves the right to dismiss any candidate based on their assessment scores. Being pre-accepted is not a guarantee for any candidate to participate in the project.
Skills the Interns will Acquire: Utilize appropriate research methods and techniques to analyze and summarize research papers; Understand the interactions between plant omics and plant phenotype data; Compare and contrast different approaches and methodologies used in space agriculture research; Contribute to the development of a valuable resource for researchers and space entrepreneurs in the field of space agriculture.

Project Mentor(s): Sanjoy Som, Serhat Sevgen, & Dana Jaimes
Project Title: Magnesium sulfate brines and their applicability to Mars
Description: This YSP 2024 project seeks a candidate interested in water chemistry as applicable to Mars. In detail, the candidate will learn about brines, especially magnesium sulfate brines, an area of expanded interest for the Som research group (composed of Dr. Som and former YSPs, now Visiting Scholars with BMSIS). The candidate will perform a literature search and read (and read) about how these brines are formed, in what environments they can be found, and how they apply to Mars. Following this new knowledge, the candidate will learn how to apply the EQ3/6 geochemical code to those brines and simulate their evaporation on their computer to predict the sequence of minerals formed.
Required Skills: Familiarity with Earth Sciences or intense curiosity to learn, introductory chemistry at the university level, familiarity with the Linux terminal. A desire for a career in the Earth and Planetary sciences is a plus.
Skills the Interns will Acquire: Brine and evaporation geochemistry, familiarity with the Linux terminal, the EQ3/6 geochemical code, Mars surface properties, oral presentation of scientific material.

Project Mentor(s): Graham Lau
Project Title: Communicating Topics in Earth and Space Science
Description: Science communicators stand on the front-line of community engagement and the public understanding of science. Making science accessible for everyone requires developed skills in communication as well as an understanding of human nature. The Research Associates who work on this project will develop these skills while developing materials for sharing science. Accepted individuals will receive training in the use of social media for science communication and will have the opportunity to explore their own science communication skills by developing a project to communicate a topic in Earth and space science through writing, artistic media, music, video, social media campaigning, or another outlet that is most fitting.
Required Skills: Good writing skills are necessary but will also be developed during the project. The ability to read and understand scientific peer-reviewed research is required. Applicants do not need to have their own social media accounts.
Skills the Interns will Acquire: The successful participants in this project will learn how to communicate more effectively and will gain skills in writing, speaking, and sharing science.

Project Mentor(s): Celia Blanco
Project Title: Computational investigation of navigability in fitness landscapes
Description: Fitness landscapes, a fundamental concept in evolutionary biology, depict the relationship between genetic variations and their fitness in a given environment. Exploring fitness landscapes provides crucial insights into the origin of life and the emergence of molecular innovations, highlighting how the earliest genetic sequences may have overcome evolutionary hurdles to drive the complexity and diversity seen in life today. This YSP 2024 project aims to assess the navigability of computationally generated fitness landscapes. The candidate will construct and simulate landscapes of varying ruggedness to determine whether it's possible to move between different points without crossing valleys of low fitness. The work will involve programming and path-finding algorithms.
Required Skills: Proficiency in Python is required and familiarity with pathfinding algorithms is recommended. Candidates must have a strong interest in molecular evolution and statistical modeling.
Skills the Interns will Acquire: Programming, molecular evolution, science communication.

Project Mentor(s): Celia Blanco & Ricardo Cabrera
Project Title: Revisiting the Chronological Consensus of Amino Acids in Protein Evolution
Description: The chronological order in which amino acids were incorporated into the genetic code provides crucial insights into the origins of life and the evolution of proteins. This YSP 2024 project aims to re-evaluate the current consensus on the appearance of amino acids, considering the most recent advancements in molecular biology and biosynthetic pathways, prebiotic chemistry and the organic composition of planetary bodies. The candidate will conduct a comprehensive literature review to identify new findings and methodologies that have emerged, potentially modifying the original consensus. The project also involves the use of basic statistical methods to update the consensus based on these findings.
Required Skills: Candidates must have the ability to read scientific literature and a strong interest in evolutionary and molecular biology.
Skills the Interns will Acquire: Scientific literature review. Data analysis. Science communication.

Project Mentor(s): Liam M. Longo
Project Title: Studying Proteins at the Mineral Surface
Description: Mineral surfaces served as primitive catalysts that may have jump-started metabolism. Modern metabolism, however, is executed (with few exceptions) by enzymes. In this project, we will review the literature on protein-mineral catalysis as we develop the idea of a "geozyme": a peptide associated at the mineral surface that augmented primitive catalysis. Special attention will be paid to instances of potential evolutionary continuity between mineral-adsorbed and independently folding evolutionary states.
Required Skills: Basic biochemistry, basic programming proficiency
Skills the Interns will Acquire: Critical reading of the literature, basic mineralogy, basic structural biology

Project Mentor(s): Haritina Mogosanu
Project Title: Analog places for astrobiology in Romania
Description: Astrobiology is not well known in Romania, and it will remain elitist knowledge unless it is promoted as an educational activity and brought within reach of students there. Our project aims to create a map of astrobiologically significant places in Romania. Romania's diverse landscapes offer unique insights for astrobiology, particularly in understanding life in extreme conditions. The Movile Cave, isolated for millions of years, hosts a chemosynthesis-based ecosystem, providing a glimpse into subsurface life on other planets. The Berca Mud Volcanoes and Scărișoara Ice Cave reveal how organisms adapt to harsh environments, mirroring conditions on Mars or icy moons. Additionally, Romania's salt mines and thermal water caves, like Râșnov, showcase life thriving in high-salinity and thermal niches. These sites collectively underscore Romania's significance in studying life's resilience and potential beyond Earth.
Required Skills:
Scientific Knowledge: A foundation in astrobiology, geology, biology, or environmental science to understand the processes and significance of the sites.
Fieldwork Skills: Experience or willingness to learn how to conduct fieldwork in potentially challenging environments, including cave systems and remote locations.
Critical Thinking and Problem-Solving: Ability to think critically and solve problems that may arise during research or mapping projects.
Communication Skills: Written and verbal communication skills for documenting research findings, creating reports, and communicating with a diverse team.
Collaboration and Teamwork: Ability to work effectively in interdisciplinary teams, often with experts from different fields.
Adaptability and Resilience: Flexibility to adapt to changing conditions and persistence in overcoming research challenges.
Attention to Detail: Precision in recording observations, data collection, and mapping.
Project Management: Skills in managing projects, including planning, executing, and meeting deadlines.
Skills the Interns will Acquire:
Enhanced Technical Proficiency: Advanced GIS skills, including creating detailed maps and performing spatial analyses. Improved data analysis capabilities, using statistical and computational tools for interpreting complex datasets.

Fieldwork Expertise: Practical experience in conducting field research, including data collection methods specific to extreme environments. Enhanced understanding of how to apply remote sensing techniques in real-world scenarios.

Research and Scientific Insight: Deepened knowledge of astrobiology, geology, and environmental science, particularly in the context of extreme environments. Experience in formulating research questions, designing experiments, and hypothesis testing.

Critical Thinking and Problem-Solving: Development of problem-solving skills through overcoming fieldwork challenges and data interpretation issues. Enhanced ability to critically evaluate scientific data and research findings.

Project Management and Organization: Experience in planning, executing, and finalizing projects within a given timeframe. Skills in managing resources, including time and equipment, efficiently.

Communication and Collaboration: Improved ability to communicate scientific findings both verbally and in writing, tailored to varied audiences. Experience working in interdisciplinary teams, enhancing collaboration and teamwork skills.

Professional Development: Insight into the workflow of scientific research projects and the roles within a research team. Networking opportunities with professionals in the field, potentially leading to future research collaborations or employment.

Ethical and Environmental Awareness: Understanding of the ethical considerations in conducting environmental and biological research. Awareness of conservation issues related to sensitive and unique ecosystems. Remote sensing technologies and their application in geological and biological studies.

Mark Neyrinck
Do intergalactic filaments help a galaxy grow?
A lot of galaxy-formation modeling is based on an analytic "spherical-collapse model," which assumes that a galaxy forms from a spherically symmetric blob of matter. It is known that a blob with some ellipticity collapses to form a "galaxy" faster than a spherical one, but the optimal shape from which a galaxy forms is likely more complex than that; galaxies in the universe are often seen to have about three filaments sticking out of them. This project entails searching the set of possible 2D shapes for the fastest collapse, and simulating these collapses with a (computationally fast) 2D gravitational computer code. Does the universe tend to organize its matter to assemble galaxies (and then stars, and planets ...) as fast as possible?
Some basic data analysis and programming skills (ideally in Python) are essential. Communication (visualization and writing) skills are important too, since the ultimate concrete goal of the project is to publish an interesting paper about it.
The required skills will certainly be further developed, as well as rigorous scientific reasoning skills.
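
As a rough illustration of the kind of 2D collapse experiment this project describes, the sketch below drops a cold (initially at rest) blob of particles and times how long it takes to shrink. Everything specific here is an assumption made for the example and is not taken from the project itself: the 1/r force law for 2D gravity, the softening length `eps`, the leapfrog integrator, the units (G = 1, total mass = 1), and the use of "half the initial RMS radius" as a stand-in for collapse. The function names are hypothetical.

```python
import numpy as np

def sample_ellipse(n, a, b, rng):
    """Draw n points uniformly inside an ellipse with semi-axes a and b."""
    r = np.sqrt(rng.uniform(0.0, 1.0, n))  # sqrt makes the area density uniform
    theta = rng.uniform(0.0, 2.0 * np.pi, n)
    pos = np.column_stack((a * r * np.cos(theta), b * r * np.sin(theta)))
    return pos - pos.mean(axis=0)  # put the centre of mass at the origin

def accelerations(pos, eps):
    """Softened pairwise attraction with a 1/r force law (assumed 2D gravity).

    Units are chosen so that G = 1 and the total mass is 1.
    """
    diff = pos[None, :, :] - pos[:, None, :]  # diff[i, j] = pos[j] - pos[i]
    dist2 = (diff ** 2).sum(axis=-1) + eps ** 2
    return (diff / dist2[..., None]).sum(axis=1) / len(pos)

def rms_radius(pos):
    """Root-mean-square distance of the particles from the origin."""
    return np.sqrt((pos ** 2).sum(axis=1).mean())

def collapse_time(pos, dt=0.005, eps=0.05, max_steps=2000):
    """Time for a cold blob to reach half its initial RMS radius, or None."""
    pos = pos.copy()
    vel = np.zeros_like(pos)
    target = 0.5 * rms_radius(pos)
    acc = accelerations(pos, eps)
    for step in range(1, max_steps + 1):
        vel += 0.5 * dt * acc  # half kick
        pos += dt * vel        # drift
        acc = accelerations(pos, eps)
        vel += 0.5 * dt * acc  # half kick
        if rms_radius(pos) <= target:
            return step * dt
    return None

rng = np.random.default_rng(0)
disc = sample_ellipse(300, 1.0, 1.0, rng)
ellipse = sample_ellipse(300, 1.5, 1.0 / 1.5, rng)  # same area as the disc
print("disc:", collapse_time(disc))
print("ellipse:", collapse_time(ellipse))
```

Under these assumptions a uniform disc of unit radius collapses like a harmonic oscillator, so the disc should reach half its radius near t = π/3. The half-RMS-radius criterion is only a placeholder; a search over shapes would need a more careful definition of when a "galaxy" has formed.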
Jim Cleaves, Celia Blanco, McCullen Sandora
Decoding systems complexity through natural tokenization
A token is a semantic representation or avatar of a lower level of system organization. For example, a dollar represents a way to exchange goods and services based on their difficulty of acquisition. Tokens serve as simplified representations of complex systems, allowing for their hierarchical organization into components that can evolve more freely at higher levels of selection. As systems evolve, they may naturally “chunk” or “tokenize” themselves, facilitating more efficient evolution within the constraints of their environments. Examples of this process include the codon mapping of the genetic code, the selection of a canonical set of coded amino acids, the selection of a written alphabet’s characters, the grouping of syllables into words, and runs of amino acids that represent protein domains, among others. Though much is known about the domain level (100 amino acids) and larger, the structure of proteins/genomes below this level has remained relatively unexplored, in part due to the ill-posed nature of the problem. Here, we leverage recent advances in natural language processing to find an optimum tokenization procedure that results in maximum information compressibility.
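
To make the idea of compression-driven tokenization concrete, here is a minimal sketch of byte-pair encoding (BPE), one widely used tokenization scheme from natural language processing. The project description does not say which algorithm will be used, so BPE is an assumption made for illustration, and the peptide-like string below is made up rather than a real protein sequence.

```python
from collections import Counter

def most_frequent_pair(tokens):
    """Return the most common adjacent token pair, or None if no pair repeats."""
    pairs = Counter(zip(tokens, tokens[1:]))
    if not pairs:
        return None
    pair, count = pairs.most_common(1)[0]
    return pair if count > 1 else None

def merge_pair(tokens, pair):
    """Replace each non-overlapping occurrence of pair with one merged token."""
    merged, i = [], 0
    while i < len(tokens):
        if i + 1 < len(tokens) and (tokens[i], tokens[i + 1]) == pair:
            merged.append(tokens[i] + tokens[i + 1])
            i += 2
        else:
            merged.append(tokens[i])
            i += 1
    return merged

def bpe_tokenize(sequence, num_merges):
    """Greedily merge the most frequent adjacent pair, up to num_merges times."""
    tokens = list(sequence)
    merges = []
    for _ in range(num_merges):
        pair = most_frequent_pair(tokens)
        if pair is None:
            break
        tokens = merge_pair(tokens, pair)
        merges.append(pair)
    return tokens, merges

# Hypothetical toy sequence with a repeated motif (not real biological data).
toy_sequence = "MKVLAAGIVLAAGKVLAAG"
tokens, merges = bpe_tokenize(toy_sequence, num_merges=5)
print(tokens)
print(merges)
```

Each merge shortens the token list at the cost of a larger vocabulary, which is the trade-off that any search for an "optimum" tokenization has to balance.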

Fitness landscapes are conceptualizations of how evolvable systems can explore potential future states they can mutate into, and the relative “fitness” of those states. These landscapes help visualize the relationship between possible “genotypes” (such as nucleic acid or protein sequences, or higher-order variables) and their “evolutionary favorability” or “fitness” (the likelihood of one genotype to survive compared to others). The fitness level is metaphorically described as the "height" of the landscape, where higher points indicate greater fitness. Genotypes that are similar to one another are said to be "close" on the landscape, while those significantly different are "far" from each other. The sequence space of possible genotypes, their degree of similarity, and their fitness values define the fitness landscape. This framework helps explain the existence of suboptimal forms in evolution and how evolutionary tinkering can lead to more optimal forms.
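
The landscape picture above can be sketched on a toy example. The code below assumes 3-bit binary genotypes, Hamming distance as the measure of "closeness", and a hand-picked fitness table; all of these are hypothetical choices made for illustration. A greedy uphill walk shows how a population can get stuck on a suboptimal peak.

```python
def hamming(a, b):
    """Number of positions at which two equal-length genotypes differ."""
    return sum(x != y for x, y in zip(a, b))

def neighbors(genotype):
    """All genotypes one point mutation (one bit flip) away."""
    flip = {"0": "1", "1": "0"}
    return [genotype[:i] + flip[genotype[i]] + genotype[i + 1:]
            for i in range(len(genotype))]

def adaptive_walk(start, fitness):
    """Move to the fittest neighbor until no neighbor is fitter."""
    path = [start]
    while True:
        current = path[-1]
        best = max(neighbors(current), key=fitness.get)
        if fitness[best] <= fitness[current]:
            return path
        path.append(best)

# Hypothetical fitness values: "110" is the global peak, "101" a local one.
FITNESS = {
    "000": 0.2, "001": 0.5, "010": 0.4, "100": 0.3,
    "011": 0.1, "101": 0.6, "110": 0.9, "111": 0.55,
}

print(adaptive_walk("000", FITNESS))  # stops at the local peak "101"
print(adaptive_walk("010", FITNESS))  # reaches the global peak "110"
```

The walk from "000" ends two mutations away from the global peak because every single mutation from "101" lowers fitness, which is one way a suboptimal form can persist.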

In this project, we aim to use tokenization compression software to explore whether natural tokenization can be detected in biological systems and language, and to understand how such tokenization arises in biological systems. We also address such basic questions as whether the tokens we find and their prevalences are conserved across domains of life, and across coding versus noncoding parts of the genome.
Experience programming in Python
Familiarity with tokenization algorithms and mathematical concepts in evolutionary biology
Jim Cleaves, Stuart Bartlett, Anirudh Prabhu
Uncovering universal signatures of life with complexity theory: distinguishing physical, biological and technological processes
This project will answer a fundamental question: How can we objectively distinguish signs of life from non-life? Most scientists believe that life exists beyond Earth, yet we haven’t been able to prove it. This is because we don’t yet know how to recognize alien life. Our work will advance the search beyond life as we know it, beyond the familiar, and beyond our imagination, to find elementary features in biological signals that are absent in abiotic signals. Rather than focus on one metric or concept, we will apply state-of-the-art metrics (Shannon entropy, differential entropy, Lempel-Ziv compressibility, Huffman encoding, algorithmic complexity, statistical complexity, and neural network complexity) from several relevant disciplines (complexity science, artificial intelligence, information and network theory). We hypothesize that plotting values of the measured complexity metrics for all datasets may reveal clustering across the class divisions of non-living, living and technological data. We will apply minimally biased methods to a broad range of datasets: time series of physical signals such as gravitational waves, biological signals such as animal communication (e.g. whale and bird songs, human speech and music), and technological data such as Wi-Fi signals. While the above data take the form of time series, we will also consider additional types, such as mass spectrometry data and transmission spectra from planetary atmospheres.
Experience programming in Python and/or C++
Familiarity with complexity metrics and information entropy measurement

You can submit your application by using the application form (note: the application is no longer available for 2024). Please review all of our advice on how to successfully apply in the FAQ document that we’ve linked in several places on this webpage.