Frequently Asked Questions
How do I get data into EDD?
The Experiment Data Depot (EDD) imports data in two steps (Fig. 1):
Experiment Description input: this file describes your experiment design so EDD knows how to store all your data, and how it is related to your strains and samples (see below for more information).
Data input: different types of data can be added in several successive steps. These data input steps are independent of each other, facilitating the combination of different types of data (e.g. multiomics data sets).
What is an experiment description?
An experiment description file is an Excel file that describes your experiment (Fig. 2): which strains you are using (part ID from ICE), how they are being cultured (lines and metadata), which samples are being taken (assays) and how they are processed (protocol). See Fig. 3 for how EDD organizes your experimental data (i.e. the ontology).
The experiment description provides a single-file, standardized description of your experiment. It is useful, for example, when you design your experiment, or when the proteomics or metabolomics team needs to understand your experiment to plan how they will process your samples.
What is a line?
A “Line” in EDD is a distinct set of experimental conditions (e.g. a single culture). A Line generally corresponds to the contents of a shake flask or well plate, though it could also be, e.g., a tube containing an Arabidopsis seed or an ionic liquid for a given pretreatment. A line is not a sample: several samples can be obtained from a single line at different times (see Fig. 3).
A typical experiment (Fig. 3) would take strains from a repository, culture them in different flasks (lines), apply a protocol at a given time (an assay), and obtain different measurement data. Protocols are kept on protocols.io to enable reproducibility and better communication. You can find the LBNL repository here.
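The relationship between lines, assays and measurements described above can be pictured with a small data model. This is only an illustrative sketch: the class names, fields, units and placeholder values are assumptions made for this example, not EDD's actual schema or API.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Assay:
    """One application of a protocol to a line at a given time (hypothetical model)."""
    protocol: str          # name of the protocol applied to the sample
    time_h: float          # sampling time in hours (unit is an assumption)
    measurements: Dict[str, float] = field(default_factory=dict)


@dataclass
class Line:
    """A distinct set of experimental conditions, e.g. one shake flask."""
    name: str
    part_id: str           # ICE part ID of the strain (placeholder value below)
    assays: List[Assay] = field(default_factory=list)


# One line (a single culture) sampled at two times: a line is not a sample.
line = Line(name="WT-LB-70C", part_id="PART-ID-PLACEHOLDER")
line.assays.append(Assay("proteomics", time_h=4.0, measurements={"protein_A": 1.2}))
line.assays.append(Assay("proteomics", time_h=8.0, measurements={"protein_A": 2.5}))

sample_times = [assay.time_h for assay in line.assays]
print(sample_times)
```

Keeping assays as a list on the line mirrors the point above that several samples can come from the same line at different times.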
How do I choose good line names?
A good line name combines the strain name, the culture conditions and whichever other condition is being varied in the experiment. For example, WT-LB-70C would indicate a wild type strain grown on LB at 70 °C (imagine you are trying different growth temperatures), and Cineole-EZ-50C would indicate a cineole-producing strain grown on EZ at 50 °C.
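If you generate line names programmatically, the naming pattern above can be sketched as follows. The helper `make_line_name` is hypothetical and only reproduces the strain-medium-temperature pattern from the examples; adapt the fields to whatever varies in your own experiment.

```python
def make_line_name(strain: str, medium: str, temperature_c: int) -> str:
    """Join strain, medium and temperature into a name like 'WT-LB-70C'."""
    return f"{strain}-{medium}-{temperature_c}C"


# Names for a hypothetical temperature screen of two strains.
names = [
    make_line_name(strain, medium, temp)
    for strain, medium in [("WT", "LB"), ("Cineole", "EZ")]
    for temp in (50, 70)
]
print(names)
```

Building names from the experimental factors keeps them consistent across lines and makes it easy to see at a glance which condition each line represents.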
What are the column options for experiment description?
The primary line characteristics, which you should include in every experiment description and which every EDD service (instance) supports, are:
- Line Name: a short name that uniquely identifies the line (REQUIRED).
- Line Description: a short human-readable description for the line (encouraged).
- Part ID: the unique ICE part number identifiers for the strains involved (encouraged).
- Replicate Count: the number of experimental replicates for this set of experimental conditions (encouraged).
Other metadata types (e.g. media, temperature, culture volume, flask volume, shaking speed) are also available, but depend on which EDD site you are using. Ask your EDD administrator for more information. Columns can be in any order.
TBD: include link to full metadata listing in any EDD.
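Before uploading, it can help to check the two rules stated above for the Line Name column: it is required, and it must uniquely identify each line. The following is a minimal sketch, assuming the spreadsheet rows have already been read into a list of dictionaries keyed by column header. The function `check_description` is hypothetical and is not part of EDD.

```python
from collections import Counter

REQUIRED_COLUMNS = {"Line Name"}


def check_description(rows):
    """Return a list of problems found in experiment description rows.

    `rows` is a list of dicts keyed by column header (an assumed
    in-memory form of the spreadsheet, not an EDD data structure).
    """
    problems = []
    # Every row must have a non-empty value in each required column.
    for i, row in enumerate(rows, start=1):
        for column in sorted(REQUIRED_COLUMNS):
            if not str(row.get(column, "")).strip():
                problems.append(f"row {i}: missing required column '{column}'")
    # Line names must be unique across the file.
    counts = Counter(str(row.get("Line Name", "")).strip() for row in rows)
    for name, count in sorted(counts.items()):
        if name and count > 1:
            problems.append(f"line name '{name}' is used {count} times")
    return problems


rows = [
    {"Line Name": "WT-LB-70C", "Line Description": "wild type on LB", "Replicate Count": 3},
    {"Line Name": "WT-LB-70C", "Replicate Count": 3},
    {"Line Description": "row with no name"},
]
problems = check_description(rows)
for problem in problems:
    print(problem)
```

Because rows are looked up by column header, the check does not depend on column order.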
Why should I use the Experiment Data Depot?
The Experiment Data Depot (EDD) is a standardized repository of experimental data. This is useful for the following reasons:
EDD provides a single point of storage for your experimental data that is easy to reference. Instead of providing a collection of spreadsheets organized in an ad hoc manner in the supplementary material of your paper, you can give a single URL where your readers can find all the data in a format that is always the same. This will make your papers more likely to be cited, in the same way that storing your strain information in the Inventory of Composable Elements (ICE) makes it easier to access and more likely to be cited.
Easily collate different types of multiomics data. Comparing the results of phenotyping a cell using transcriptomics, proteomics and metabolomics can be complicated. EDD facilitates this task by using a standard vocabulary for genes, proteins and metabolites, which makes multiomics data easier to combine.
EDD facilitates data analysis. By using a standard data format through EDD, you can leverage previously created Jupyter notebooks to easily do your calibrations and statistics (e.g. calculate error bars).
Enable advanced learning techniques. EDD helps you interact with data scientists to use Machine Learning and Artificial Intelligence techniques to effectively guide metabolic engineering. Just give them the link to your study and you will save them the wrangling of spreadsheets that consumes 50-80% of their time.
Why can’t I see the data in the link?
You may not have the correct permissions to view the Study. Ask the person who sent you the link to give you read permissions.
What is a slug?
A slug is a way to identify a Study in links in a more easily readable form.
Using a slug allows links to look like the one below, with the slug
instead of a link to the same study that looks like this: