Search::CropPAL

Subcellular Location of Plant-Derived Proteins

Global agriculture faces increasing demand for higher crop yields and for larger amounts of more diverse plant-derived protein. Protein subcellular location is a key element in determining protein function and accumulation patterns in plants, and is critical for better harnessing plant energy for yield and plant defence for sustainability. The cropPAL2020 dataset provides a comprehensive subcellular proteomics resource and user interface for exploring global protein distributions within crop cells. It identifies species-specific protein subcellular location divergence and defines the best species for comparisons to drive compartmentation-based approaches to improve yield, protein composition and resilience in future crop varieties.

Subcellular location can be determined by fluorescent protein tagging or by mass spectrometry detection in subcellular purifications, as well as by prediction using protein sequence features. The compendium of crop Proteins with Annotated Locations (cropPAL) collates >800 studies performed by >700 scientists in 45 countries around the world, together with computational data from 12 prediction algorithms. Crops included are banana (Musa acuminata), barley (Hordeum vulgare), canola (Brassica napus), field mustard (Brassica rapa), maize (Zea mays), potato (Solanum tuberosum), rice (Oryza sativa), sorghum (Sorghum bicolor), soybean (Glycine max), tomato (Solanum lycopersicum), wheat (Triticum aestivum) and wine grape (Vitis vinifera). The data collection, including metadata for proteins and studies, can be searched using the query builder below. The Homology tab allows you to search for location data across all crop species and to compare it with Arabidopsis data from SUBA4.

Find this resource useful? Please cite cropPAL (PubMed, Plant Cell Physiol).
Bulk downloads available at RDA
Previous versions of cropPAL: cropPAL1 (2015), cropPAL2 (2017)

Choose crops below, then build a query from the questions below by pressing the → buttons.

Queries appear here....

Subcellular Location Protein Properties Homology Publications Blast

Find proteins where the... (To start a query or add another filter to your query select a filter below and press the → button.)

→ Search for or exclude proteins by keywords. The search is conducted against the descriptions of proteins in the CropPAL database. The syntax of this search supports extended regular expressions (see this site for more information). Choosing "matches" gives you access to the match syntax of MySQL, e.g. entering +leaf -seed* in the keyword(s) box matches a description that contains leaf but does not contain seed, seeds, seedling, etc.
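To make the +leaf -seed* example concrete, here is a minimal Python sketch that approximates this kind of boolean keyword match on a single description. It is not the database's own code: the function name boolean_match is hypothetical, and the handling of unsigned (optional) terms is a simplifying assumption about how boolean-mode matching behaves.

```python
import re


def boolean_match(description, query):
    """Approximate a boolean keyword match such as '+leaf -seed*'.

    Illustrative simplification only, not the CropPAL implementation:
      +term   the word must be present
      -term   the word must be absent
      term*   matches any word starting with 'term'
      term    optional; only matters when no +term is given
    """
    words = set(re.findall(r"\w+", description.lower()))
    has_required = False
    has_optional = False
    optional_hit = False

    for term in query.lower().split():
        sign = term[0] if term[0] in "+-" else ""
        token = term[len(sign):]
        if token.endswith("*"):
            prefix = token[:-1]
            hit = any(w.startswith(prefix) for w in words)
        else:
            hit = token in words

        if sign == "+":
            has_required = True
            if not hit:
                return False
        elif sign == "-":
            if hit:
                return False
        else:
            has_optional = True
            optional_hit = optional_hit or hit

    # With only optional terms, at least one of them has to be present.
    if has_required or not has_optional:
        return True
    return optional_hit
```

With this sketch, boolean_match("Leaf senescence protein", "+leaf -seed*") is accepted, while a description mentioning both leaf and seedling is rejected because seed* matches seedling.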

... protein description keyword(s)

→ Filter for proteins based on various numeric data derived from sequence data. GRAVY is defined in Kyte J., Doolittle R.F.: J. Mol. Biol. 157:105-132 (1982) "A simple method for displaying the hydropathic character of a protein", doi:10.1016/0022-2836(82)90515-0, PMID: 7108955.
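As an illustration of the GRAVY property, the sketch below computes the grand average of hydropathy as the mean Kyte-Doolittle value over the residues of a sequence. The hydropathy values are those of the Kyte and Doolittle scale. The function name gravy and the choice to skip non-standard residue codes are assumptions made for this example, and how CropPAL itself treats such residues is not stated here.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(sequence):
    """Return the mean Kyte-Doolittle hydropathy of a protein sequence.

    Assumption for this sketch: characters that are not one of the 20
    standard residues (e.g. X or *) are ignored rather than scored.
    """
    values = [KYTE_DOOLITTLE[aa] for aa in sequence.upper() if aa in KYTE_DOOLITTLE]
    if not values:
        raise ValueError("sequence contains no standard amino acids")
    return sum(values) / len(values)
```

Positive values indicate a more hydrophobic protein overall and negative values a more hydrophilic one, so a numeric filter on GRAVY can, for example, separate candidate membrane proteins from soluble ones.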

... physical property of is ← Should be a number

→ Filter for proteins that are translated from genes on a specific chromosome or assembly (or set of scaffolds!).

... gene model is on

→

Search for proteins that are (or are not) in a list of Identifiers. Enter this list of Identifiers into the box below. See here for a summary of known cross references.

You can use "wildcards" with "like" and "not like" e.g. GO:%.

... EnsemblPlants identifier(s), alias or protein sequence feature is the list of: (e.g. GO:0008270, IPR017986 etc.)
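The "like" and "not like" wildcard matching can be sketched as follows. This assumes the filter follows standard SQL LIKE semantics, where % stands for any run of characters and _ for exactly one character; only the % form appears in the example above, and the case-insensitive comparison is also an assumption. The helper names like_to_regex and like are hypothetical.

```python
import re


def like_to_regex(pattern):
    """Translate an SQL LIKE pattern into a compiled regular expression.

    Assumes standard SQL semantics: % = any run of characters, _ = one character.
    """
    parts = []
    for ch in pattern:
        if ch == "%":
            parts.append(".*")
        elif ch == "_":
            parts.append(".")
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts), re.IGNORECASE | re.DOTALL)


def like(value, pattern):
    """Return True when the whole value matches the LIKE pattern."""
    return like_to_regex(pattern).fullmatch(value) is not None


identifiers = ["GO:0008270", "IPR017986", "GO:0005739"]
go_terms = [i for i in identifiers if like(i, "GO:%")]
```

Here go_terms keeps only the two identifiers that start with GO:, which mirrors entering GO:% with "like" in the identifier box.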

Find proteins where the... (To start a query or add another filter to your query select a filter below and press the → button.)

→

... author's name like

→

Select a paper by pubmed

→

Select a paper by author name

→

... year of publication of localisation studies is between and ← Should be a year

→ Search for or exclude proteins by keywords. The search is conducted against the literature titles and abstracts in the SUBA database. The syntax of this search supports extended regular expressions (see this site for more information). Choosing "matches" gives you access to the match syntax of MySQL, e.g. entering +leaf -seed* in the keyword(s) box matches a title/abstract that contains leaf but does not contain seed, seeds, seedling, etc.

... publication title or abstract of the localisation study the keyword(s)

→

... author's affiliation

→

... author's affiliation in

→ Search for proteins by author affiliation

... author's affiliation in

Find proteins where the... (To start a query or add another filter to your query select a filter below and press the → button.)

→ Search for proteins that match this blast. Press the ‘Clear’ button to delete any content from the box.

... Protein contains fragments in list with bit-score ...

The bit score is log₂(N_eff) - log₂(E-value), where E-value = p_val × N_eff is the p-value times the effective search space size. The larger the bit score the better, since p_val = P(random sequence having a better score) = 2^-(bit-score). The p-value measures the statistical significance of the match, but since we tried N_eff times to find a match we need to make a correction. Multiplying by the number of possible matches gives the E-value, the expected number of hits with a better match arising just by random chance. (See here and here [PDF]).
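The relationship between bit score, E-value and effective search space can be checked numerically with a short Python sketch. It only restates the two formulas above. The function names and the example search space size are illustrative assumptions, not values taken from the CropPAL BLAST service.

```python
import math


def e_value_from_bit_score(bit_score, n_eff):
    """E-value = N_eff * 2^-(bit score), i.e. the p-value times the search space."""
    return n_eff * 2.0 ** (-bit_score)


def bit_score_from_e_value(evalue, n_eff):
    """Bit score = log2(N_eff) - log2(E-value)."""
    return math.log2(n_eff) - math.log2(evalue)


# Hypothetical effective search space size, chosen only for illustration.
n_eff = 1e9
evalue = e_value_from_bit_score(50, n_eff)
recovered = bit_score_from_e_value(evalue, n_eff)
```

Converting a bit score of 50 to an E-value and back recovers the original score, and raising the bit score by one halves the E-value for a fixed search space.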


Reciprocal Blast

EnsemblPlants Homology Tree