Working with annotations

Important

This tutorial is written for fafbseg version >=3.0.0, which made substantial changes to how annotations are handled. Please make sure your fafbseg version is up to date.

There are currently three large sources of annotations/metadata for FlyWire neurons:

  1. Free-form community annotations that any user can place through the neuroglancer UI

  2. Systematic annotations (classes, sides, types, etc.) for the entire brain from Schlegel et al.

  3. Systematic cell types for the right optic lobe from Matsliah et al.

In Codex, (1) is what you see in the “Community labels” column while (2) and (3) are used to fill the “Classification” and “Type” columns.

Currently, fafbseg lets you query (1) and (2) programmatically; (3) is not yet supported.

>>> from fafbseg import flywire

>>> # We'll be demonstrating this using the public release dataset
>>> flywire.set_default_dataset("public")
Default dataset set to "public"

Query community annotations (1) for a single neuron:

>>> ann = flywire.search_community_annotations(720575940625431866)
>>> ann.head()
Using materialization version 630
Caching community annotations for materialization version "630"... Done.
          id  pt_position_x  pt_position_y  pt_position_z   pt_supervoxel_id          pt_root_id                          tag              user  user_id
21364  22248         485772         238836          49920  78957304514692633  720575940625431866  ALad1; right; acetylcholine   Alexander Bates      355
21365  22249         485772         238836          49920  78957304514692633  720575940625431866  ALad1; right; acetylcholine       Lab Members     1063
51385  49507         485776         238836          49920  78957304514685298  720575940625431866               ALPN,VM5v_adPN  Philipp Schlegel       62

Query the hierarchical annotations (2) for the same neuron:

>>> ann = flywire.search_annotations(720575940625431866)
>>> ann.head()
Using cached materialization version 630
supervoxel_id root_id pos_x pos_y pos_z soma_x soma_y soma_z nucleus_id flow ... ito_lee_hemilineage hartenstein_hemilineage morphology_group top_nt top_nt_conf side nerve vfb_id fbbt_id status
0 78957304514685298 720575940625431866 121444 59709 1248 125088.0 52048.0 782.0 4518783.0 intrinsic ... ALad1 BAmv3 ALad1__1 acetylcholine 0.957189 left NaN fw036274 FBbt_00100386 NaN

1 rows × 25 columns

For the hierarchical annotations we expect to get only one row per neuron. Let’s inspect that first row:

>>> ann.iloc[0]
supervoxel_id               78957304514685298
root_id                    720575940625431866
pos_x                                  121444
pos_y                                   59709
pos_z                                    1248
soma_x                               125088.0
soma_y                                52048.0
soma_z                                  782.0
nucleus_id                          4518783.0
flow                                intrinsic
super_class                           central
cell_class                               ALPN
cell_sub_class                  uniglomerular
cell_type                                 NaN
hemibrain_type                      VM5v_adPN
ito_lee_hemilineage                     ALad1
hartenstein_hemilineage                 BAmv3
morphology_group                     ALad1__1
top_nt                          acetylcholine
top_nt_conf                          0.957189
side                                     left
nerve                                     NaN
vfb_id                               fw036274
fbbt_id                         FBbt_00100386
status                                    NaN
Name: 0, dtype: object

OK, we get a bunch of useful metadata for our neuron of interest.
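Unlike the hierarchical annotations, the community annotations can contain several rows per neuron (three for the neuron queried above). The following is a minimal sketch of how such rows could be collapsed into one list of distinct tags per root ID. It uses plain Python tuples copied from the output above as stand-in data rather than the actual DataFrame, and `tags_per_neuron` is a hypothetical helper, not part of fafbseg.

```python
from collections import defaultdict

# Stand-in rows (root ID, tag, user) copied from the community
# annotation output above; not fetched from FlyWire.
rows = [
    (720575940625431866, "ALad1; right; acetylcholine", "Alexander Bates"),
    (720575940625431866, "ALad1; right; acetylcholine", "Lab Members"),
    (720575940625431866, "ALPN,VM5v_adPN", "Philipp Schlegel"),
]


def tags_per_neuron(rows):
    """Collect the distinct tags for each root ID in first-seen order."""
    collected = defaultdict(list)
    for root_id, tag, _user in rows:
        if tag not in collected[root_id]:
            collected[root_id].append(tag)
    return dict(collected)


tags = tags_per_neuron(rows)
print(tags)
```

Identical tags placed by different users are merged here; whether that is desirable depends on whether you care who placed a label.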

What if we’re looking for e.g. a specific cell type instead? Easy:

>>> # Search community annotations for a given tag
>>> ann = flywire.search_community_annotations("VM5v_adPN")
>>> ann.head()
Caching community annotations for materialization version "630"... Done.
          id   pos_x  pos_y  pos_z      supervoxel_id             root_id             tag              user  user_id
51343  48935  142332  61350    928  80435185514499960  720575940622287142  ALPN,VM5v_adPN  Philipp Schlegel       62
51385  49507  121444  59709   1248  78957304514685298  720575940625431866  ALPN,VM5v_adPN  Philipp Schlegel       62
51389  49514  142668  61600    891  80435185514491668  720575940624661552  ALPN,VM5v_adPN  Philipp Schlegel       62
51620  49092  120504  60065   1194  78887004489925320  720575940620189790  ALPN,VM5v_adPN  Philipp Schlegel       62
51998  49429  142744  62078    944  80435185514524409  720575940619637780  ALPN,VM5v_adPN  Philipp Schlegel       62
>>> # Look for a specific string among the hierarchical annotations
>>> ann = flywire.search_annotations("VM5v_adPN")
>>> ann.head()
Using materialization version 630
supervoxel_id root_id pos_x pos_y pos_z soma_x soma_y soma_z nucleus_id flow ... hemibrain_type ito_lee_hemilineage hartenstein_hemilineage morphology_group top_nt top_nt_conf side nerve fbbt_id status
561 80435185514499960 720575940622287142 142332 61350 928 139912.0 51768.0 698.0 5057638.0 intrinsic ... VM5v_adPN ALad1 BAmv3 ALad1_1 acetylcholine 0.938421 right NaN FBbt_00100386 NaN
11874 78887004489925320 720575940620189790 120504 60065 1194 124824.0 49920.0 998.0 4518437.0 intrinsic ... VM5v_adPN ALad1 BAmv3 ALad1_1 acetylcholine 0.948491 left NaN FBbt_00100386 NaN
34685 80435185514524409 720575940619637780 142744 62078 944 140752.0 52368.0 374.0 5739285.0 intrinsic ... VM5v_adPN ALad1 BAmv3 ALad1_1 acetylcholine 0.918128 right NaN FBbt_00100386 NaN
35972 78957304447921539 720575940610505170 121284 59660 1036 124040.0 49616.0 906.0 4515954.0 intrinsic ... VM5v_adPN ALad1 BAmv3 ALad1_1 acetylcholine 0.949083 left NaN FBbt_00100386 NaN
36182 78957304514685298 720575940625431866 121444 59709 1248 125088.0 52048.0 782.0 4518783.0 intrinsic ... VM5v_adPN ALad1 BAmv3 ALad1_1 acetylcholine 0.954871 left NaN FBbt_00100386 NaN

5 rows × 24 columns

For the hierarchical annotations we can also run queries against specific fields:

>>> # Fetch all gustatory sensory neurons
>>> gust = flywire.search_annotations("cell_class:gustatory")
>>> gust.head()
Using materialization version 630
supervoxel_id root_id pos_x pos_y pos_z soma_x soma_y soma_z nucleus_id flow ... ito_lee_hemilineage hartenstein_hemilineage morphology_group top_nt top_nt_conf side nerve vfb_id fbbt_id status
0 79098797984212565 720575940628673474 123538 70948 1630 NaN NaN NaN NaN afferent ... NaN NaN NaN acetylcholine 0.701929 left PhN fw039687 NaN NaN
1 79240016576488341 720575940621375231 125710 77813 2496 NaN NaN NaN NaN afferent ... NaN NaN NaN acetylcholine 0.623311 left MxLbN fw043484 NaN NaN
2 78888585171909593 720575940625507870 120188 83973 2069 NaN NaN NaN NaN afferent ... NaN NaN NaN acetylcholine 0.716926 left MxLbN fw043485 NaN NaN
3 78888516385974174 720575940635791167 120512 82676 2063 NaN NaN NaN NaN afferent ... NaN NaN NaN acetylcholine 0.747361 left MxLbN fw043486 NaN NaN
4 78747710244640262 720575940629270339 118458 82316 2109 NaN NaN NaN NaN afferent ... NaN NaN NaN acetylcholine 0.727190 left MxLbN fw043487 NaN NaN

5 rows × 25 columns
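To make the difference between a free-text query and a `field:value` query concrete, here is a small illustrative sketch in plain Python. It is not how fafbseg implements `search_annotations`: the helpers `parse_query` and `search` are hypothetical, the records are stand-ins built from values shown above, and exact, case-sensitive matching is an assumption made for simplicity.

```python
def parse_query(query):
    """Split 'field:value' into (field, value); field is None for free text."""
    field, sep, value = query.partition(":")
    return (field, value) if sep else (None, query)


def search(records, query):
    """Return the records matching `query` (assumed exact, case-sensitive)."""
    field, value = parse_query(query)
    if field is not None:
        # Field query: only look at the named field
        return [r for r in records if r.get(field) == value]
    # Free-text query: accept a hit in any field
    return [r for r in records if value in r.values()]


# Stand-in records using root IDs and labels that appear in the outputs above
records = [
    {"root_id": 720575940625431866, "cell_class": "ALPN", "hemibrain_type": "VM5v_adPN"},
    {"root_id": 720575940628673474, "cell_class": "gustatory"},
]

hits = search(records, "cell_class:gustatory")
print([r["root_id"] for r in hits])
```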

What if you have some more complicated query in mind? No problem, you can use NeuronCriteria to build up your query:

>>> # Fetch all DA1 projection neurons on the left side
>>> NC = flywire.NeuronCriteria
>>> da1 = flywire.search_annotations(NC(type='DA1_lPN', side='left'))
>>> da1
Found 8 neurons matching the given criteria.
Using cached materialization version 630
supervoxel_id root_id pos_x pos_y pos_z soma_x soma_y soma_z nucleus_id flow ... ito_lee_hemilineage hartenstein_hemilineage morphology_group top_nt top_nt_conf side nerve vfb_id fbbt_id status
0 78253479633951114 720575940614309535 111226 57542 1253 101328.0 57432.0 1311.0 2469243.0 intrinsic ... ALl1_ventral BAlc_ventral ALl1_ventral__7 acetylcholine 0.929829 left NaN fw012440 FBbt_00067363 NaN
1 78323710939184200 720575940619385765 111620 55558 1239 101816.0 55568.0 1655.0 2469812.0 intrinsic ... ALl1_ventral BAlc_ventral ALl1_ventral__7 acetylcholine 0.909374 left NaN fw032945 FBbt_00067363 NaN
2 78323848378217543 720575940626034819 111774 57252 1326 100824.0 56608.0 1386.0 2468760.0 intrinsic ... ALl1_ventral BAlc_ventral ALl1_ventral__7 acetylcholine 0.894545 left NaN fw033427 FBbt_00067363 NaN
3 78464860744572652 720575940621185050 113815 61631 1390 102688.0 55800.0 1508.0 2471407.0 intrinsic ... ALl1_ventral BAlc_ventral ALl1_ventral__7 acetylcholine 0.930650 left NaN fw033526 FBbt_00067363 NaN
4 78323779658725881 720575940613345442 111894 56379 1297 101168.0 54536.0 1584.0 2468972.0 intrinsic ... ALl1_ventral BAlc_ventral ALl1_ventral__7 acetylcholine 0.916676 left NaN fw034548 FBbt_00067363 NaN
5 78605529513349676 720575940637208718 115991 60386 1314 105608.0 55088.0 1687.0 2476421.0 intrinsic ... ALl1_ventral BAlc_ventral ALl1_ventral__7 acetylcholine 0.928285 left NaN fw035057 FBbt_00067363 NaN
6 78323779658618911 720575940603231916 112038 56471 1216 100672.0 54976.0 1388.0 2468593.0 intrinsic ... ALl1_ventral BAlc_ventral ALl1_ventral__7 acetylcholine 0.938961 left NaN fw035224 FBbt_00067363 NaN
7 78464654585926444 720575940605102694 113795 58276 1223 103744.0 60904.0 1295.0 2532594.0 intrinsic ... ALl1_ventral BAlc_ventral ALl1_ventral__7 acetylcholine 0.899551 left NaN fw036329 FBbt_00067363 NaN

8 rows × 25 columns

Note

In the example above, we constructed a query against type. This is a special argument that searches against all *_type fields in the annotations (i.e. currently against both cell_type and hemibrain_type).
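The behaviour of the type argument can be pictured with a short sketch: a record matches if any field whose name ends in `_type` carries the requested value. This is only an illustration of the idea under that assumption, not fafbseg's actual code; `matches_type` is a hypothetical helper and the record reuses values from the single-neuron output shown earlier.

```python
def matches_type(record, wanted):
    """True if any field whose name ends in '_type' equals `wanted`."""
    return any(
        value == wanted
        for key, value in record.items()
        if key.endswith("_type")
    )


# Stand-in record: cell_type is empty, hemibrain_type carries the label
record = {
    "cell_class": "ALPN",
    "cell_type": None,
    "hemibrain_type": "VM5v_adPN",
    "side": "left",
}

print(matches_type(record, "VM5v_adPN"))  # matched via hemibrain_type
print(matches_type(record, "ALPN"))       # cell_class is not a *_type field
```

This is why a neuron with an empty cell_type can still be found by type as long as its hemibrain_type is set.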

NeuronCriteria can also be passed to many other functions that previously accepted only straight root IDs. For example, we can use it to fetch all downstream partners of our neuron of interest:

>>> ds = flywire.get_connectivity(NC(type='DA1_lPN', side='left'),
...                               upstream=False,
...                               downstream=True)
>>> ds.head()
Found 8 neurons matching the given criteria.
Using materialization version 630
                  pre                post  weight
0  720575940605102694  720575940646122804      64
1  720575940603231916  720575940629163931      50
2  720575940603231916  720575940630610425      46
3  720575940603231916  720575940646122804      42
4  720575940614309535  720575940630610425      39

Or we can use it to fetch skeletons for all neurons of this type:

>>> sk = flywire.get_skeletons(NC(type='DA1_lPN',
...                               side='left',
...                               materialization=630),
...                            dataset=630)
>>> sk
Found 8 neurons matching the given criteria.
<class 'navis.core.neuronlist.NeuronList'> containing 8 neurons (663.5KiB)
type name id n_nodes n_connectors n_branches n_leafs cable_length soma units
0 navis.TreeNeuron skeleton 720575940603231916 2499 None 594 652 1995934.875 [104, 311, 313, 315, 317, 318, 320, 323, 894, ... 1 nanometer
1 navis.TreeNeuron skeleton 720575940605102694 3291 None 914 1026 2545239.250 [9, 71, 75, 83, 86, 95, 111, 112, 115, 128, 15... 1 nanometer
... ... ... ... ... ... ... ... ... ... ...
6 navis.TreeNeuron skeleton 720575940626034819 2387 None 531 578 1940427.250 [21, 153, 158, 252, 253, 254, 256, 912, 958, 9... 1 nanometer
7 navis.TreeNeuron skeleton 720575940637208718 2857 None 694 758 2260301.500 [187, 565, 572, 582, 981, 1017, 1042, 1045, 10... 1 nanometer
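The connectivity table returned by get_connectivity above has one row per pre/post pair, so the same downstream partner can appear several times. A minimal sketch of summing the weights per downstream partner follows. It uses the five rows printed above as stand-in data in plain Python; `total_input_per_target` is a hypothetical helper, and on the real DataFrame a pandas group-by on the post column would be the more natural route.

```python
from collections import Counter

# Stand-in edges (pre, post, weight) copied from the first five rows
# of the connectivity output above.
edges = [
    (720575940605102694, 720575940646122804, 64),
    (720575940603231916, 720575940629163931, 50),
    (720575940603231916, 720575940630610425, 46),
    (720575940603231916, 720575940646122804, 42),
    (720575940614309535, 720575940630610425, 39),
]


def total_input_per_target(edges):
    """Sum synapse weights over all presynaptic neurons for each target."""
    totals = Counter()
    for _pre, post, weight in edges:
        totals[post] += weight
    return totals


totals = total_input_per_target(edges)
ranked = totals.most_common()  # strongest downstream partner first
print(ranked)
```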

Please see the API reference for a full list of annotation-related functions.

A couple of final notes

Community annotations are pulled straight from CAVE and then cached for the duration of your Python session.

The systematic annotations are downloaded from the GitHub repository for the Schlegel et al. annotation paper at flyconnectome/flywire_annotations (see Supplemental_file1_neuron_annotations.tsv) and subsequently cached on disk.

At the time of writing, the repository already contains 3 versions of the annotations (v1.0.0, v1.1.0 and v2.0.0; see “Releases” on the repository) and we’re planning to release new versions with updated labels in the future. By default, fafbseg will use the latest available version but you can also specify which version you want to use (see set_default_annotation_version()).

These annotation versions are technically independent of the segmentation since we can update the root IDs to match any materialization version; fafbseg does that for you under the hood. In practice, however, these labels were generated using the segmentation at a specific point in time. For example, v2.0.0 of the annotations is based on materialization 783. If you use them in conjunction with a different materialization, you might find incorrect labels, in particular if you use an earlier, less proofread materialization!
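One way to guard against such mismatches in your own scripts is a small sanity check like the sketch below. It is an illustration only: `ANNOTATION_BASE_MATERIALIZATION` and `check_materialization` are hypothetical names, not fafbseg features, the table contains just the single pairing stated above (v2.0.0 and 783), and the check assumes that a lower materialization number means an earlier state of the segmentation.

```python
# Only the v2.0.0 -> 783 pairing is given in the text; the base
# materializations of other annotation versions are not filled in here.
ANNOTATION_BASE_MATERIALIZATION = {"v2.0.0": 783}


def check_materialization(annotation_version, materialization):
    """Classify a materialization relative to the one the labels were built on.

    Returns "match", "earlier", "later", or "unknown" if the base
    materialization of the annotation version is not in the table.
    """
    base = ANNOTATION_BASE_MATERIALIZATION.get(annotation_version)
    if base is None:
        return "unknown"
    if materialization == base:
        return "match"
    # Assumption: smaller numbers are earlier, less proofread states
    return "earlier" if materialization < base else "later"


status = check_materialization("v2.0.0", 630)
if status == "earlier":
    print("Labels may be unreliable: materialization predates the annotations.")
```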

We would love to incorporate contributions from the community! If you find anything wrong or missing, or you would like to add e.g. your own cell types, please get in touch either via email or via an issue in the GitHub repository.