Fetching connectivity

Some background

Thanks to Buhmann et al. (2019), we have synapse predictions for the FAFB dataset.

To facilitate querying these data for FlyWire, Sven Dorkenwald, Forrest Collman et al. have loaded them into their CAVE annotation backend. fafbseg.flywire provides a couple of convenient wrappers to fetch synapses and connectivity.

Here, we will focus on querying connectivity for the FlyWire project. For that, a bit of background information is in order:

The synapse predictions are actually ~130M connections (220M before filtering), each of which consists of a presynaptic and a postsynaptic x/y/z coordinate plus some metadata. These coordinates have been mapped to FlyWire supervoxel IDs. The tricky (read: expensive/slow) part is mapping these static supervoxels to the ever-changing root IDs. This step is done by CAVE’s “materialization” engine.

These materializations are run once every night and kept for a couple of days before being discarded. Some versions, like the FlyWire public release (a.k.a. materialization version 630), are immortal.

The important thing to keep in mind is: each materialization represents a snapshot of the root IDs that existed at that time.
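To make the snapshot idea concrete, here is a toy sketch in plain Python. It does not use the CAVE or fafbseg APIs: the supervoxel IDs, root IDs and version numbers are made up, and `roots_at` is a hypothetical helper. Synapses are stored against static supervoxel IDs, and each materialization version carries its own supervoxel-to-root lookup:

```python
# Toy model of materialization snapshots (all IDs and names are made up).
# A synapse is stored as a (presynaptic, postsynaptic) pair of static supervoxel IDs.
synapses = [(101, 201), (102, 201), (103, 202)]

# Each snapshot maps supervoxel ID -> root ID as of that version.
# Between versions 1 and 2, an edit merged roots 10 and 11 into the new root 12.
snapshots = {
    1: {101: 10, 102: 11, 103: 11, 201: 20, 202: 21},
    2: {101: 12, 102: 12, 103: 12, 201: 20, 202: 21},
}


def roots_at(version):
    """Translate every synapse to (pre_root, post_root) for one snapshot."""
    lookup = snapshots[version]
    return [(lookup[pre], lookup[post]) for pre, post in synapses]


print(roots_at(1))  # root IDs as they existed at version 1
print(roots_at(2))  # same synapses, but root 12 has replaced roots 10 and 11
```

In this toy model, asking for root 10 in version 2 finds nothing because that root no longer exists there. This is the mismatch between root IDs and materialization versions that the rest of this tutorial is about.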

Let’s start with an example of how you can check the dates:

>>> from fafbseg import flywire

To illustrate, let’s fetch the currently available materializations. Note that this will only work if you have production data access!

>>> flywire.get_materialization_versions(dataset="production")
   is_merged                datastack   id     status  version           time_stamp           expires_on  valid
0      False  flywire_fafb_production  823  AVAILABLE      833  2024-01-09 05:10:00  2024-01-11 04:10:00   True
1      False  flywire_fafb_production  822  AVAILABLE      832  2024-01-06 05:10:00  2024-01-13 04:10:00   True
2      False  flywire_fafb_production  820  AVAILABLE      830  2024-01-03 05:10:00  2024-02-07 04:10:00   True
3      False  flywire_fafb_production  813  AVAILABLE      823  2023-12-20 05:10:00  2024-01-17 04:10:00   True
4      False  flywire_fafb_production  773  AVAILABLE      783  2023-09-30 05:10:00  2121-11-10 07:10:00   True
5       True  flywire_fafb_production  619  AVAILABLE      630  2023-03-21 08:10:00  2121-11-10 07:10:00   True
6       True  flywire_fafb_production  560  AVAILABLE      571  2023-01-10 08:11:00  2121-11-10 07:10:00   True
7       True  flywire_fafb_production  515  AVAILABLE      526  2022-11-17 08:10:00  2121-11-10 07:10:00   True
8       True  flywire_fafb_production  247  AVAILABLE      258  2022-01-17 08:10:00  2121-02-18 08:10:00   True
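The expiry dates show which versions are short-lived and which are effectively permanent. Here is a minimal sketch of how you might pick out the long-lived ones, using only the standard library and a few rows copied from the output above. The `long_lived` helper and the one-year cutoff are assumptions made for illustration and are not part of fafbseg:

```python
from datetime import datetime, timedelta

# A few (version, time_stamp, expires_on) rows copied from the output above.
rows = [
    (833, "2024-01-09 05:10:00", "2024-01-11 04:10:00"),
    (830, "2024-01-03 05:10:00", "2024-02-07 04:10:00"),
    (783, "2023-09-30 05:10:00", "2121-11-10 07:10:00"),
    (630, "2023-03-21 08:10:00", "2121-11-10 07:10:00"),
]

FMT = "%Y-%m-%d %H:%M:%S"


def long_lived(rows, min_lifetime=timedelta(days=365)):
    """Return versions whose expiry is at least `min_lifetime` after creation."""
    keep = []
    for version, created, expires in rows:
        lifetime = datetime.strptime(expires, FMT) - datetime.strptime(created, FMT)
        if lifetime >= min_lifetime:
            keep.append(version)
    return keep


print(long_lived(rows))  # [783, 630]
```

If you plan to share root IDs with others or come back to an analysis later, pinning it to one of these long-lived versions is likely the safer choice.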

Do the same for the public release - note that at the moment it contains only version 630.

>>> flywire.get_materialization_versions(dataset="public")
   is_merged            datastack   id     status  version           time_stamp           expires_on  valid
0       True  flywire_fafb_public  718  AVAILABLE      630  2023-03-21 08:10:00  2121-11-10 07:10:00   True

For the production dataset only, we can also project the latest materialization to the present and get a “live” version for the current root IDs.

As of version 1.14.0, fafbseg will try to automatically find a materialization that contains your query root IDs. In general, however, it’s good practice to make sure that your query root IDs match one of the existing materialization versions.

Let’s use the public release data to explore this:

>>> flywire.set_default_dataset("public")
Default dataset set to "public"

Synapses

OK, let’s start by fetching synapses for a given root ID:

>>> syn = flywire.get_synapses(
...     720575940625431866,
...     materialization="auto"  # "auto" is the default value
... )
>>> syn.head()
Using materialization version 630
                  pre                post  cleft_score   pre_x   pre_y   pre_z  post_x  post_y  post_z         id
0  720575940618561403  720575940625431866          144  479728  242036   23760  479856  242028   23760   56078503
1  720575940631121739  720575940625431866           63  475736  237908   21800  475716  237808   21760  111010940
2  720575940612003441  720575940625431866          142  437296  131392  185360  437408  131420  185400  183931595
3  720575940618654016  720575940625431866          144  348664  147136  162720  348688  147224  162720   24048637
4  720575940626518302  720575940625431866          129  485796  248296   28200  485840  248120   28280    3443470

Note how it says “Using materialization version 630” - that’s because this is the only available materialization for the public release. If you try to query root IDs that did not exist at this materialization it will complain!
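The `pre` and `post` columns hold the root IDs on either side of each synapse, so you can tell inputs from outputs by checking which column contains your query ID. Here is a minimal sketch in plain Python. The first two pairs are copied from the output above, while the third pair and its partner ID are made up for illustration:

```python
query = 720575940625431866

# (pre, post) root ID pairs, one per synapse.
pairs = [
    (720575940618561403, 720575940625431866),
    (720575940631121739, 720575940625431866),
    (720575940625431866, 111),  # hypothetical outgoing synapse
]

inputs = [p for p in pairs if p[1] == query]   # query neuron is postsynaptic
outputs = [p for p in pairs if p[0] == query]  # query neuron is presynaptic

print(len(inputs), "incoming,", len(outputs), "outgoing")
```

On the actual DataFrame, the same split would be a boolean mask on the `pre` or `post` column.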

We can also force a query against the live data. Note that this will be considerably slower than queries against a materialized table! Importantly, this only works if you have access to the production dataset.

>>> syn_live = flywire.get_synapses(
...     720575940625431866, materialization="live", filtered=False, dataset="production"
... )
>>> syn_live.head()
                  pre                post  cleft_score   pre_x   pre_y  pre_z  post_x  post_y  post_z         id
0  720575940614727903  720575940625431866            0  503800  205396  32840  503672  205416   32800   71828773
1  720575940633938349  720575940625431866           41  486888  246616  33240  486868  246484   33240   71914527
2  720575940631975884  720575940625431866           83  480208  244212  30080  480144  244384   30040  191440776
3  720575940617537570  720575940625431866          166  480960  242752  23840  481092  242724   23800   56078511
4  720575940613489617  720575940625431866          142  475384  235384  25960  475496  235376   25960   52231029

Using a root ID that has no materialization, or multiple root IDs that don’t have any materialization in common, results in an error:

>>> # These are two root IDs for the same neuron (one old, one current)
>>> # THIS IS EXPECTED TO FAIL!
>>> syn = flywire.get_synapses([720575940618984129, 720575940620240833], materialization="auto")
---------------------------------------------------------------------------

ValueError                                Traceback (most recent call last)

Cell In[10], line 3
      1 # These are two root IDs for the same neuron (one old, one current)
      2 # THIS IS EXPECTED TO FAIL!
----> 3 syn = flywire.get_synapses([720575940618984129, 720575940620240833], materialization="auto")


File ~/Google Drive/Cloudbox/Github/fafbseg-py/fafbseg/flywire/annotations.py:122, in parse_neuroncriteria.<locals>.outer.<locals>.inner(*args, **kwargs)
    120         kwargs[key] = nc.get_roots()
    121 try:
--> 122     return func(*args, **kwargs)
    123 except NoMatchesError as e:
    124     if allow_empty:


File ~/Google Drive/Cloudbox/Github/fafbseg-py/fafbseg/flywire/utils.py:166, in inject_dataset.<locals>.outer.<locals>.inner(*args, **kwargs)
    164 if disallowed and ds in disallowed:
    165     raise ValueError(f'Dataset "{ds}" not allowed for function {func}.')
--> 166 return func(*args, **kwargs)


File ~/Google Drive/Cloudbox/Github/fafbseg-py/fafbseg/flywire/synapses.py:466, in get_synapses(x, pre, post, attach, filtered, min_score, transmitters, neuropils, clean, materialization, batch_size, dataset, progress)
    463     materialization = client.materialize.most_recent_version()
    465 if materialization == "auto":
--> 466     materialization = find_mat_version(ids, dataset=dataset, verbose=progress)
    467 else:
    468     _check_ids(ids, materialization=materialization, dataset=dataset)


File ~/Google Drive/Cloudbox/Github/fafbseg-py/fafbseg/flywire/utils.py:707, in find_mat_version(ids, verbose, allow_multiple, raise_missing, dataset)
    703     raise ValueError('Given root IDs do not co-exist in any of the available '
    704                     'materialization versions (including live). Try updating '
    705                     'root IDs and rerun your query.')
    706 else:
--> 707     raise ValueError('Given root IDs do not co-exist in any of the available '
    708                     'public materialization versions. Please make sure that '
    709                     'the root IDs do exist and rerun your query.')


ValueError: Given root IDs do not co-exist in any of the available public materialization versions. Please make sure that the root IDs do exist and rerun your query.

See the last line in the error message? It tells you that these IDs do not co-exist in any of the available materialization versions and hence cannot be queried together.
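If “co-exist” sounds abstract, the following conceptual sketch may help. It is not how fafbseg implements `find_mat_version`. The lookup table, the placeholder IDs and the `common_versions` helper are all hypothetical. The idea is simply that a query is only answerable in versions where every requested ID is valid:

```python
# Conceptual sketch only: NOT the implementation used by fafbseg.
# Hypothetical lookup: materialization version -> set of root IDs valid in it.
roots_by_version = {
    630: {"A_old", "B"},
    783: {"A_new", "B"},
}


def common_versions(ids, roots_by_version):
    """Return the sorted versions in which every queried ID exists."""
    ids = set(ids)
    return sorted(v for v, roots in roots_by_version.items() if ids <= roots)


print(common_versions(["A_old", "B"], roots_by_version))      # [630]
print(common_versions(["A_old", "A_new"], roots_by_version))  # []
```

The second call comes back empty because the old and the new ID of the same neuron never appear in the same snapshot. That is the situation the error above reports.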

Instead of using “auto” you can also provide a specific materialization version. In that case you will get a warning if your queries don’t match the materialization but it won’t throw an exception:

>>> syn = flywire.get_synapses(720575940620240833, materialization=630)
>>> syn.head()
Some root IDs were already outdated at materialization 630 and synapse/connectivity data will be inaccurrate:

 720575940620240833

Try updating the root IDs using flywire.update_ids or flywire.supervoxels_to_roots if you have supervoxel IDs, or pick a different materialization version.
pre post cleft_score pre_x pre_y pre_z post_x post_y post_z id

Note that in this example we got an empty dataframe because the root ID didn’t exist at that specific materialization!

See fafbseg.flywire.is_latest_root() and fafbseg.flywire.update_ids() for mapping root IDs to materializations:

>>> # Search for this neuron's ID at materialization 630 (public release version)
>>> new_id = flywire.update_ids(720575940620240833, timestamp="mat_630")
>>> new_id
               old_id              new_id  confidence  changed
0  720575940620240833  720575940622894616    0.991647     True
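When updating many IDs at once, you may want to turn this table into an old-to-new lookup and set aside uncertain matches for a manual check. Here is a minimal sketch in plain Python, using the row above plus one made-up row. The 0.9 cutoff is arbitrary, and treating `confidence` as a measure of how well the new ID matches the old one is an assumption:

```python
# One row copied from the update_ids output above, plus one made-up row.
updates = [
    {"old_id": 720575940620240833, "new_id": 720575940622894616,
     "confidence": 0.991647, "changed": True},
    {"old_id": 1, "new_id": 2, "confidence": 0.40, "changed": True},  # hypothetical
]

MIN_CONFIDENCE = 0.9  # arbitrary cutoff chosen for this sketch

# Accept confident mappings; set the rest aside for manual inspection.
id_map = {r["old_id"]: r["new_id"] for r in updates
          if r["confidence"] >= MIN_CONFIDENCE}
to_review = [r["old_id"] for r in updates if r["confidence"] < MIN_CONFIDENCE]

print(id_map)
print(to_review)
```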

With the new up-to-date root ID, the query will work as expected:

>>> syn = flywire.get_synapses(720575940622894616)
>>> syn.head()
Using materialization version 630
                  pre                post  cleft_score   pre_x   pre_y   pre_z  post_x  post_y  post_z         id
0  720575940627172100  720575940622894616          142  384712  159552  184160  384840  159424  184200  166252986
1  720575940634147936  720575940622894616          135  386928  156476  181800  386904  156556  181840  155850631
2  720575940632826925  720575940622894616          144  341452  147360  173040  341424  147504  173040  228310877
3  720575940608970197  720575940622894616          116  351400  163320  187320  351492  163280  187360   66153949
4  720575940624632247  720575940622894616          145  358000  145404  163080  357864  145396  163080   87457018

Alternatively, if you keep track of your neurons via x/y/z coordinates or supervoxel IDs, you can also use the timestamp parameter in e.g. fafbseg.flywire.supervoxels_to_roots() to get the root IDs at the same time as one of the materializations. This can be slightly faster than live queries.

Neurons & synapses

You may find yourself wanting to associate synapses with a neuron’s skeleton or mesh, e.g. for plotting or analyses. Here’s how you would do that:

>>> # Fetch a skeleton
>>> sk = flywire.get_skeletons(720575940625431866)
>>> # The skeleton does not yet have any synapse data associated with it
>>> sk.connectors

To fetch the neuron’s synapses and attach them to the skeleton you can do this:

>>> flywire.get_synapses(sk, attach=True)
>>> # Now the neuron has a connector table
>>> sk.connectors.head()
Using materialization version 630
   connector_id       x       y       z  cleft_score          partner_id  type  node_id
0             0  378580  141040  176720          151  720575940486790143   pre      324
1             1  448544  121840  192120          161  720575940617293396   pre      706
2             2  449832  122628  195640          157  720575940659323009   pre      704
3             3  446608  124216  195760          151  720575940659323009   pre      652
4             4  361652  129832  159520           61  720575940532780653   pre      265

Let’s illustrate by plotting the skeleton plus its synapses:

>>> import navis
>>> # Presynapses are red; postsynapses are light blue
>>> fig, ax = navis.plot2d(sk, connectors=True, color='lightgrey')
[Figure: 2D plot of the skeleton (light grey) with its presynapses (red) and postsynapses (light blue)]

Connections

Above you learned how to fetch individual synapses. You could distil those into an edge list by aggregating over the pre- and postsynaptic root IDs but that’s rather tedious. Fortunately, there is an easier (and faster) way of doing it:

>>> # These are current root IDs of RHS DA1 uPNs
>>> da1_roots = [
...     720575940604407468,
...     720575940623543881,
...     720575940637469254,
...     720575940617229632,
...     720575940621239679,
...     720575940623303108,
...     720575940630066007,
... ]

>>> # Get the partners for these neurons
>>> edge_list = flywire.get_connectivity(da1_roots)
>>> edge_list.head()
Using materialization version 630
                  pre                post  weight
0  720575940611174702  720575940630066007     104
1  720575940611174702  720575940621239679      89
2  720575940619061662  720575940630066007      81
3  720575940611174702  720575940637469254      78
4  720575940611174702  720575940604407468      75
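For comparison, the manual aggregation mentioned above boils down to counting synapses per (pre, post) pair. Here is a minimal sketch using only the standard library, with made-up IDs standing in for the `pre` and `post` columns of a synapse table:

```python
from collections import Counter

# Hypothetical (pre, post) root ID pairs, one per synapse; real pairs would
# come from the `pre` and `post` columns of the synapse table.
synapse_pairs = [(1, 2), (1, 2), (3, 2), (1, 4), (1, 2)]

# Count synapses per (pre, post) pair, then list edges by descending weight.
weights = Counter(synapse_pairs)
edge_list = [(pre, post, w) for (pre, post), w in weights.most_common()]

print(edge_list)
```

With a pandas DataFrame, the equivalent would be a group-by over both columns, e.g. `syn.groupby(["pre", "post"]).size()`. Either way you have to download every individual synapse first, which is why the dedicated connectivity query is the more convenient route.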

Note

You can also use NeuronCriteria to work with annotations instead of passing root IDs directly. See the annotation tutorial for details.

Generate a neuroglancer link with the top 30 downstream partners:

>>> # Grab the top 30 downstream partners
>>> top30ds = edge_list[edge_list.pre.isin(da1_roots)].iloc[:30]
>>> # Generate a URL with these partners
>>> flywire.encode_url(segments=top30ds.post.values, open=False)
'https://ngl.flywire.ai/?json_url=https://globalv1.flywire-daf.com/nglstate/5912498324635648'

We can also generate an adjacency matrix:

>>> adj = flywire.get_adjacency(sources=da1_roots, targets=da1_roots)
>>> adj
Using materialization version 630
target              720575940604407468  720575940623543881  720575940637469254  720575940617229632  720575940621239679  720575940623303108  720575940630066007
source
720575940604407468                 0.0                 2.0                 1.0                 2.0                 1.0                 1.0                 1.0
720575940623543881                 0.0                 0.0                 2.0                 0.0                 0.0                 2.0                 0.0
720575940637469254                 0.0                 2.0                 0.0                 3.0                 0.0                 1.0                 1.0
720575940617229632                 1.0                 2.0                 1.0                 0.0                 2.0                 1.0                 1.0
720575940621239679                 6.0                 1.0                 1.0                 5.0                 0.0                 0.0                 2.0
720575940623303108                 0.0                 1.0                 1.0                 0.0                 1.0                 0.0                 0.0
720575940630066007                 0.0                 4.0                 0.0                 1.0                 3.0                 0.0                 0.0
>>> import seaborn as sns
>>> ax = sns.heatmap(adj, cmap="Blues", square=True)
[Figure: heatmap of the adjacency matrix between the DA1 uPNs]
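An adjacency matrix holds the same information as an edge list, arranged with sources as rows and targets as columns. The toy sketch below shows that relationship in plain Python. The neuron names and weights are made up, and this is not how `get_adjacency` works internally:

```python
# Hypothetical edge list of (source, target, weight) tuples.
edges = [("a", "b", 2), ("b", "c", 1), ("c", "a", 4)]
neurons = ["a", "b", "c"]

# Rows are sources, columns are targets; pairs without an edge stay at 0.
index = {name: i for i, name in enumerate(neurons)}
adj = [[0] * len(neurons) for _ in neurons]
for source, target, weight in edges:
    adj[index[source]][index[target]] += weight

for name, row in zip(neurons, adj):
    print(name, row)
```

Note that the matrix is generally not symmetric: the weight from a to b says nothing about the weight from b to a. You can see the same in the DA1 example above.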

I hope this tutorial gave you a flavor and some entry points to start using these data. Please see the API reference for a full list of connectivity-related functions.