Working with annotations

Important

This tutorial is written for fafbseg version >=3.0.0, which made substantial changes to how annotations are handled. Please make sure your fafbseg version is up to date.

There are currently three large sources of annotations/metadata for FlyWire neurons:

  1. Free-form community annotations every user can place through the neuroglancer UI

  2. Systematic annotations (classes, sides, types, etc.) for the entire brain from Schlegel et al.

  3. Systematic cell types for the right optic lobe from Matsliah et al.

In Codex, (1) is what you see in the “Community labels” column while (2) and (3) are used to fill the “Classification” and “Type” columns.

Currently, fafbseg lets you query (1) and (2) programmatically; (3) is not yet supported.

>>> from fafbseg import flywire

>>> # We'll be demonstrating this using the public release dataset
>>> flywire.set_default_dataset("public")
Default dataset set to "public".

Query community annotations (1) for a single neuron:

>>> ann = flywire.search_community_annotations(720575940625431866)
>>> ann.head()
Using materialization version 783.
Caching community annotations for materialization version "783"... Done.
id created superceded_id pt_position_x pt_position_y pt_position_z tag user user_id pt_supervoxel_id pt_root_id user_name
0 22248 2022-02-07 05:30:08.054976+00:00 NaN 485772 238836 49920 ALad1; right; acetylcholine Alexander Bates 355 78957304514692633 720575940625431866 Alexander Bates
1 22249 2022-02-07 05:30:08.056123+00:00 NaN 485772 238836 49920 ALad1; right; acetylcholine Lab Members 1063 78957304514692633 720575940625431866 Lab Members
2 49507 2022-04-27 14:36:19.985327+00:00 NaN 485776 238836 49920 ALPN,VM5v_adPN Philipp Schlegel 62 78957304514685298 720575940625431866 Philipp Schlegel
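
Community tags are free-form, so a single `tag` value often bundles several labels with mixed separators (compare `ALad1; right; acetylcholine` with `ALPN,VM5v_adPN` above). Below is a minimal sketch for splitting them into individual labels, assuming that semicolons and commas are the only separators in use. `split_tags` is a hypothetical helper and not part of fafbseg:

```python
import re


def split_tags(tag):
    """Split a free-form community tag into individual labels.

    Assumption: labels are separated by ';' or ',' (as in the rows above).
    """
    # Split on either separator, then drop surrounding whitespace and empties
    return [part.strip() for part in re.split(r"[;,]", tag) if part.strip()]


# Tag strings copied from the output above
tags = ["ALad1; right; acetylcholine", "ALPN,VM5v_adPN"]
labels = [split_tags(t) for t in tags]
print(labels)
```

On the real table, the same helper could be applied to the whole column with `ann["tag"].map(split_tags)`.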

Query the hierarchical annotations (2) for the same neuron:

>>> ann = flywire.search_annotations(720575940625431866)
>>> ann.head()
Using annotation version "latest commit" (b2bceba) from https://github.com/flyconnectome/flywire_annotations.
Using materialization version 783.
supervoxel_id root_id pos_x pos_y pos_z soma_x soma_y soma_z nucleus_id flow ... morphology_group top_nt top_nt_conf known_nt known_nt_source side nerve vfb_id fbbt_id status
0 78957304514685298 720575940625431866 121444 59709 1248 125088.0 52048.0 782.0 4518783.0 intrinsic ... ALad1__1 acetylcholine 0.957189 acetylcholine Tanaka et al., 2012 left NaN fw036274 FBbt_00100386 NaN

1 rows × 27 columns

For the hierarchical annotations we expect to get only one row per neuron. Let’s inspect that first row:

>>> ann.iloc[0]
supervoxel_id                78957304514685298
root_id                     720575940625431866
pos_x                                   121444
pos_y                                    59709
pos_z                                     1248
soma_x                                125088.0
soma_y                                 52048.0
soma_z                                   782.0
nucleus_id                           4518783.0
flow                                 intrinsic
super_class                            central
cell_class                                ALPN
cell_sub_class                   uniglomerular
cell_type                                  NaN
hemibrain_type                       VM5v_adPN
ito_lee_hemilineage                      ALad1
hartenstein_hemilineage                  BAmv3
morphology_group                      ALad1__1
top_nt                           acetylcholine
top_nt_conf                           0.957189
known_nt                         acetylcholine
known_nt_source            Tanaka et al., 2012
side                                      left
nerve                                      NaN
vfb_id                                fw036274
fbbt_id                          FBbt_00100386
status                                     NaN
Name: 0, dtype: object

OK, we get a bunch of useful metadata for our neuron of interest.

What if we’re looking for e.g. a specific cell type instead? Easy:

>>> # Search community annotations for a given tag
>>> ann = flywire.search_community_annotations("VM5v_adPN")
>>> ann.head()
Caching community annotations for materialization version "783"... Done.
id created superceded_id pos_x pos_y pos_z tag user user_id supervoxel_id root_id user_name
0 48935 2022-04-27 14:21:54.766389+00:00 NaN 142332 61350 928 ALPN,VM5v_adPN Philipp Schlegel 62 80435185514499960 720575940622287142 Philipp Schlegel
1 49092 2022-04-27 14:26:00.639112+00:00 NaN 120504 60065 1194 ALPN,VM5v_adPN Philipp Schlegel 62 78887004489925320 720575940620189790 Philipp Schlegel
2 49429 2022-04-27 14:34:21.735996+00:00 NaN 142744 62078 944 ALPN,VM5v_adPN Philipp Schlegel 62 80435185514524409 720575940619637780 Philipp Schlegel
3 49507 2022-04-27 14:36:19.985327+00:00 NaN 121444 59709 1248 ALPN,VM5v_adPN Philipp Schlegel 62 78957304514685298 720575940625431866 Philipp Schlegel
4 49495 2022-04-27 14:36:02.233129+00:00 NaN 121284 59660 1036 ALPN,VM5v_adPN Philipp Schlegel 62 78957304447921539 720575940610505170 Philipp Schlegel
>>> # Look for a specific string among the hierarchical annotations
>>> ann = flywire.search_annotations("VM5v_adPN")
>>> ann.head()
Using materialization version 783.
supervoxel_id root_id pos_x pos_y pos_z soma_x soma_y soma_z nucleus_id flow ... morphology_group top_nt top_nt_conf known_nt known_nt_source side nerve vfb_id fbbt_id status
0 80435185514499960 720575940622287142 142332 61350 928 139912.0 51768.0 698.0 5057638.0 intrinsic ... ALad1__1 acetylcholine 0.943605 acetylcholine Tanaka et al., 2012 right NaN fw000568 FBbt_00100386 NaN
1 78887004489925320 720575940620189790 120504 60065 1194 124824.0 49920.0 998.0 4518437.0 intrinsic ... ALad1__1 acetylcholine 0.950719 acetylcholine Tanaka et al., 2012 left NaN fw012119 FBbt_00100386 NaN
2 80435185514524409 720575940619637780 142744 62078 944 140752.0 52368.0 374.0 5739285.0 intrinsic ... ALad1__1 acetylcholine 0.920976 acetylcholine Tanaka et al., 2012 right NaN fw034817 FBbt_00100386 NaN
3 78957304447921539 720575940610505170 121284 59660 1036 124040.0 49616.0 906.0 4515954.0 intrinsic ... ALad1__1 acetylcholine 0.951816 acetylcholine Tanaka et al., 2012 left NaN fw036072 FBbt_00100386 NaN
4 78957304514685298 720575940625431866 121444 59709 1248 125088.0 52048.0 782.0 4518783.0 intrinsic ... ALad1__1 acetylcholine 0.957189 acetylcholine Tanaka et al., 2012 left NaN fw036274 FBbt_00100386 NaN

5 rows × 27 columns

For the hierarchical annotations we can also run queries against specific fields:

>>> # Fetch all gustatory sensory neurons
>>> gust = flywire.search_annotations("cell_class:gustatory")
>>> gust.head()
Using materialization version 783.
supervoxel_id root_id pos_x pos_y pos_z soma_x soma_y soma_z nucleus_id flow ... morphology_group top_nt top_nt_conf known_nt known_nt_source side nerve vfb_id fbbt_id status
0 79098797984212565 720575940628673474 123538 70948 1630 NaN NaN NaN NaN afferent ... NaN acetylcholine 0.701929 NaN NaN left PhN fw039687 NaN NaN
1 79240016576488341 720575940621375231 125710 77813 2496 NaN NaN NaN NaN afferent ... NaN acetylcholine 0.623311 NaN NaN left MxLbN fw043484 NaN NaN
2 78888585171909593 720575940625507870 120188 83973 2069 NaN NaN NaN NaN afferent ... NaN acetylcholine 0.716926 NaN NaN left MxLbN fw043485 NaN NaN
3 78888516385974174 720575940635791167 120512 82676 2063 NaN NaN NaN NaN afferent ... NaN acetylcholine 0.747361 NaN NaN left MxLbN fw043486 NaN NaN
4 78747710244640262 720575940629270339 118458 82316 2109 NaN NaN NaN NaN afferent ... NaN acetylcholine 0.727190 NaN NaN left MxLbN fw043487 NaN NaN

5 rows × 27 columns
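
Once you have such a table, a common next step is to tally neurons by one of the annotation fields. The sketch below does this for the `nerve` column using only the five rows shown above, typed in by hand so that it runs without access to FlyWire. On the full DataFrame, `gust["nerve"].value_counts()` should give the equivalent per-nerve counts.

```python
from collections import Counter

# (root_id, nerve) pairs copied from the first five rows of the output above
rows = [
    (720575940628673474, "PhN"),
    (720575940621375231, "MxLbN"),
    (720575940625507870, "MxLbN"),
    (720575940635791167, "MxLbN"),
    (720575940629270339, "MxLbN"),
]

# Count how many of the listed neurons enter through each nerve
nerve_counts = Counter(nerve for _, nerve in rows)
for nerve, n in nerve_counts.most_common():
    print(f"{nerve}: {n}")
```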

What if you have some more complicated query in mind? No problem, you can use NeuronCriteria to build up your query:

>>> # Fetch all DA1 projection neurons on the left side
>>> NC = flywire.NeuronCriteria
>>> da1 = flywire.search_annotations(NC(type='DA1_lPN', side='left'))
>>> da1
Found 8 neurons matching the given criteria.
Using materialization version 783.
supervoxel_id root_id pos_x pos_y pos_z soma_x soma_y soma_z nucleus_id flow ... morphology_group top_nt top_nt_conf known_nt known_nt_source side nerve vfb_id fbbt_id status
0 78253479633951114 720575940614309535 111226 57542 1253 101328.0 57432.0 1311.0 2469243.0 intrinsic ... ALl1_ventral__7 acetylcholine 0.929829 acetylcholine Tanaka et al., 2012 left NaN fw012440 FBbt_00067363 NaN
1 78323710939184200 720575940619385765 111620 55558 1239 101816.0 55568.0 1655.0 2469812.0 intrinsic ... ALl1_ventral__7 acetylcholine 0.909374 acetylcholine Tanaka et al., 2012 left NaN fw032945 FBbt_00067363 NaN
2 78323848378217543 720575940626034819 111774 57252 1326 100824.0 56608.0 1386.0 2468760.0 intrinsic ... ALl1_ventral__7 acetylcholine 0.894545 acetylcholine Tanaka et al., 2012 left NaN fw033427 FBbt_00067363 NaN
3 78464860744572652 720575940621185050 113815 61631 1390 102688.0 55800.0 1508.0 2471407.0 intrinsic ... ALl1_ventral__7 acetylcholine 0.930650 acetylcholine Tanaka et al., 2012 left NaN fw033526 FBbt_00067363 NaN
4 78323779658725881 720575940613345442 111894 56379 1297 101168.0 54536.0 1584.0 2468972.0 intrinsic ... ALl1_ventral__7 acetylcholine 0.916676 acetylcholine Tanaka et al., 2012 left NaN fw034548 FBbt_00067363 NaN
5 78605529513349676 720575940637208718 115991 60386 1314 105608.0 55088.0 1687.0 2476421.0 intrinsic ... ALl1_ventral__7 acetylcholine 0.928285 acetylcholine Tanaka et al., 2012 left NaN fw035057 FBbt_00067363 NaN
6 78323779658618911 720575940603231916 112038 56471 1216 100672.0 54976.0 1388.0 2468593.0 intrinsic ... ALl1_ventral__7 acetylcholine 0.938961 acetylcholine Tanaka et al., 2012 left NaN fw035224 FBbt_00067363 NaN
7 78464654585926444 720575940605102694 113795 58276 1223 103744.0 60904.0 1295.0 2532594.0 intrinsic ... ALl1_ventral__7 acetylcholine 0.899551 acetylcholine Tanaka et al., 2012 left NaN fw036329 FBbt_00067363 NaN

8 rows × 27 columns

Note

In the example above, we constructed a query against type. This is a special argument that searches all *_type fields in the annotations (i.e. currently both cell_type and hemibrain_type).
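
To make the note concrete, here is a plain-Python illustration of what matching against all `*_type` fields means. It is not fafbseg's implementation: `matches_type` is a hypothetical helper, exact string matching is an assumption, and the record is a trimmed copy of the row inspected earlier.

```python
import math

# Trimmed copy of the annotation row for 720575940625431866 shown earlier
record = {
    "cell_class": "ALPN",
    "cell_type": math.nan,  # missing in the annotations
    "hemibrain_type": "VM5v_adPN",
    "side": "left",
}


def matches_type(rec, value):
    """Return True if any field whose name ends in '_type' equals `value`."""
    return any(
        rec[field] == value
        for field in rec
        if field.endswith("_type")
    )


print(matches_type(record, "VM5v_adPN"))  # matched via hemibrain_type
print(matches_type(record, "ALPN"))       # cell_class is not a *_type field
```

This is why the DA1 query above does not need to say which of the two type columns holds the label.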

NeuronCriteria can also be passed to many other functions that previously accepted only plain root IDs. For example, we can use it to fetch all downstream partners of the left DA1 projection neurons:

>>> ds = flywire.get_connectivity(NC(type='DA1_lPN', side='left'),
...                               upstream=False,
...                               downstream=True)
>>> ds.head()
Found 8 neurons matching the given criteria.
Using materialization version 783.
                  pre                post  weight
0  720575940605102694  720575940646122804      64
1  720575940603231916  720575940629163931      50
2  720575940603231916  720575940635945919      46
3  720575940603231916  720575940646122804      42
4  720575940614309535  720575940635945919      39

Or we can use it to fetch skeletons for all neurons of this type:

>>> sk = flywire.get_skeletons(NC(type='DA1_lPN',
...                               side='left',
...                               materialization=783))
>>> sk
Found 8 neurons matching the given criteria.
<class 'navis.core.neuronlist.NeuronList'> containing 8 neurons (827.5KiB)
type name id n_nodes n_connectors n_branches n_leafs cable_length soma units
0 navis.TreeNeuron skeleton 720575940603231916 3588 None 586 645 2050971.75 [141, 458, 460, 462, 464, 466, 467, 469, 470, ... 1 nanometer
1 navis.TreeNeuron skeleton 720575940605102694 4596 None 856 971 2598267.50 [121, 125, 128, 130, 131, 149, 161, 173, 183, ... 1 nanometer
... ... ... ... ... ... ... ... ... ... ...
6 navis.TreeNeuron skeleton 720575940626034819 3519 None 522 570 1986607.75 [15, 19, 21, 22, 239, 242, 384, 385, 387, 389,... 1 nanometer
7 navis.TreeNeuron skeleton 720575940637208718 4143 None 688 750 2318905.00 [280, 535, 656, 692, 729, 751, 829, 1355, 1396... 1 nanometer
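
Returning to the connectivity table from `get_connectivity` above: it is an edge list with one row per pre/post pair, so the same downstream partner can appear several times. Below is a short sketch of summing the weights per partner, using the five rows shown above typed in by hand. The totals therefore cover only those rows, not the full result.

```python
from collections import defaultdict

# (pre, post, weight) rows copied from the connectivity output above
edges = [
    (720575940605102694, 720575940646122804, 64),
    (720575940603231916, 720575940629163931, 50),
    (720575940603231916, 720575940635945919, 46),
    (720575940603231916, 720575940646122804, 42),
    (720575940614309535, 720575940635945919, 39),
]

# Sum the weights from all listed DA1 neurons onto each downstream partner
total_by_post = defaultdict(int)
for pre, post, weight in edges:
    total_by_post[post] += weight

# Strongest partners first
ranked = sorted(total_by_post.items(), key=lambda kv: kv[1], reverse=True)
for post, weight in ranked:
    print(post, weight)
```

With the actual DataFrame, `ds.groupby("post")["weight"].sum().sort_values(ascending=False)` would be the pandas equivalent.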

Please see the API reference for a full list of annotation-related functions.

A couple of final notes

Community annotations are pulled straight from CAVE and then cached for the duration of your Python session.

The systematic annotations are downloaded from the GitHub repository for the Schlegel et al. annotation paper at flyconnectome/flywire_annotations (see Supplemental_file1_neuron_annotations.tsv) and subsequently cached on disk.

At the time of writing, the repository already contains 3 versions of the annotations (v1.0.0, v1.1.0 and v2.0.0; see “Releases” on the repository) and we’re planning to release new versions with updated labels in the future. By default, fafbseg will use the latest available data but you can also specify which version you want to use (see set_default_annotation_version()).

These annotation versions are technically independent of the segmentation, since the root IDs can be updated to match any materialization version; fafbseg does that for you under the hood. In practice, however, the labels were generated using the segmentation at a specific point in time. For example, v2.0.0 of the annotations is based on materialization 783. If you use them in conjunction with a different materialization, you might find incorrect labels, particularly if you use an earlier, less proofread materialization!
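
One way to guard against such mismatches is to record which materialization a given annotation release was built on and to check the pairing before an analysis. The sketch below is purely illustrative: `ANNOTATION_BASE` and `is_matched` are hypothetical names that do not exist in fafbseg, and the table contains only the v2.0.0/783 pairing stated above.

```python
# Materialization each annotation release was generated on.
# Only the pairing stated in the text is filled in; others are unknown here.
ANNOTATION_BASE = {"v2.0.0": 783}


def is_matched(annotation_version, materialization):
    """Return True/False for listed releases, None if the release is not listed."""
    base = ANNOTATION_BASE.get(annotation_version)
    if base is None:
        return None
    return base == materialization


print(is_matched("v2.0.0", 783))  # the pairing described in the text
print(is_matched("v2.0.0", 630))  # a different materialization number, for illustration
print(is_matched("v1.1.0", 783))  # base materialization not recorded in this sketch
```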

We would love to incorporate contributions from the community! If you find anything wrong or missing, or you would like to add e.g. your own cell types, please get in touch either via email or via an issue in the GitHub repository.