Version: 3.4

Publishing Data to GBIF

info

This page describes how to publish the data from your collection to the global aggregator GBIF.

Collections managed as "live datasets" within a Symbiota portal can immediately publish to GBIF without issues. Collections that use an in-house management system (e.g. Specify, Axiell EMu, FileMaker Pro) and only publish a snapshot of their data to a Symbiota instance can also use the portal to publish their data to GBIF, but only if: 1) they are not publishing their data through another means (e.g. an IPT installation or VertNet), and 2) an occurrenceID (typically a "GUID") is included in the data being pushed from their in-house database to the Symbiota dataset (see additional instructions below). If the collection is using the Symbiota publishing tool built into Specify, occurrenceID will be automatically included in the data upload from Specify.

note

Your portal must be set up as a GBIF Publishing Installation before you can publish your data to GBIF. This can be done by your portal manager.

Workflow for new data publishers

Use these instructions to set up an institutional account with GBIF so that there is a direct publishing agreement established between GBIF and the institution. Since the institutional account may be used to list multiple collection datasets associated with that institution (e.g. https://www.gbif.org/publisher/4c0e9f60-c489-11d8-bf60-b8a03c50a862 ), you should coordinate with other collections within your institution, if applicable. Note that the institutional datasets can be published to GBIF using different publishing resources. For instance, the zoological collections could import their data from VertNet IPT (http://ipt.vertnet.org) or their institutional IPT, vascular plant data from SEINet, and lichens from the Lichen Portal.
- If you are sure your institution is not yet registered, complete the registration form linked above and follow the instructions provided by GBIF.
- If your institution is already registered, review the GBIF metadata for your organization and existing datasets and contact GBIF to make any necessary changes. Be sure that none of the existing datasets contain the same data you are trying to publish. If they do, make the appropriate arrangements with GBIF so that the old dataset can be archived BEFORE re-publishing the new dataset.
Log in to your Symbiota portal, go to your Administration Control Panel (click My Profile, then the name of the collection in the Collection Management box) and select Edit Metadata.
- Verify that your collection's name and description are accurate (both will be visible on the corresponding GBIF dataset page).
- Select (check) the "Publish to Aggregators" box (for GBIF). If you do not see a GBIF publishing checkbox, contact your Portal Manager and ask them to configure the portal for GBIF publishing.
- Select the "Save Edits" button.
Return to the Administration Control Panel and navigate to the Darwin Core Archive Publishing link. Click the "Create/Refresh Darwin Core Archive" button.
Enter your institution's GBIF publisher key and select the "Validate Key" button. (The GBIF key should have the following format: xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx, e.g. "4c0e9f60-c489-11d8-bf60-b8a03c50a872"). If your key validates, more instructions will be displayed along with a "Submit Data" button. Alternatively, you can enter the full URL to your GBIF Publisher Page (example), and the key will be automatically extracted.
Before you submit data to GBIF for the first time, you will need to contact GBIF's Help Desk (helpdesk@gbif.org). Look for the recommended draft email message to send to GBIF by following the instructions under the "Validate Key" button.
Once GBIF affirms that the portal has permission to submit data to your GBIF Publisher Page, click the "Submit Data" button. A link to your GBIF dataset will be immediately displayed, though it may take an hour or so for your data to be loaded, indexed, and available.
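
The key format described in the workflow above (xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx) can be checked locally before you paste it into the portal. The following is a minimal sketch, not the portal's own validation: the `extract_publisher_key` helper is hypothetical, it only checks the 8-4-4-4-12 hexadecimal layout, and it never contacts GBIF, so a well-formed key may still fail the portal's "Validate Key" step.

```python
import re

# 8-4-4-4-12 hexadecimal layout, not embedded in a longer run of hex digits.
KEY_PATTERN = re.compile(
    r"(?<![0-9a-f])"
    r"[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}"
    r"(?![0-9a-f])",
    re.IGNORECASE,
)


def extract_publisher_key(text):
    """Return the key found in a bare key or a publisher-page URL, or None.

    Hypothetical helper: it checks the format only and never contacts GBIF.
    """
    match = KEY_PATTERN.search(text.strip())
    return match.group(0).lower() if match else None


if __name__ == "__main__":
    url = "https://www.gbif.org/publisher/4c0e9f60-c489-11d8-bf60-b8a03c50a862"
    print(extract_publisher_key(url))                    # key taken from the URL
    print(extract_publisher_key("4c0e9f60-c489-11d8"))   # too short, so None
```

Accepting either a bare key or a full publisher URL mirrors the two input styles the portal form allows.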

To refresh your data after first-time publication

Return to the Administration Control Panel and navigate to the Darwin Core Archive Publishing link. Click the "Create/Refresh Darwin Core Archive" button.

Special instructions for snapshots

How do I know if my snapshot already has occurrenceid values?

To determine if your snapshot collection has occurrenceid values in a Symbiota portal, either:

open an existing record in the portal (navigate to the Administration Control Panel: click My Profile, then the name of the collection in the Collection Management box, and then click Edit Existing Occurrence Records and open a catalog record). Scroll down to the Curation section of the Occurrence Editor form and look for a value in the Occurrence ID field. If this box is blank, your records do not have occurrenceid values assigned in the portal.
or download a backup copy of your data, uncompress/unzip the output ZIP file, open the "occurrences.csv" file, and then look for values in the occurrenceid field/column in this file.
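
For a large backup, scanning the occurrenceid column by eye can be tedious. The sketch below counts how many rows lack a value. It assumes the file is a comma-separated file with a header row and that the column is named "occurrenceid" in any letter case; the `count_missing_occurrence_ids` function is a hypothetical helper, and the sample data is made up for illustration.

```python
import csv
import io


def count_missing_occurrence_ids(csv_file):
    """Return (total_rows, rows_without_occurrenceid) for an open CSV file."""
    reader = csv.DictReader(csv_file)
    # Find the column case-insensitively (occurrenceID vs. occurrenceid).
    column = next(
        (name for name in (reader.fieldnames or [])
         if name.strip().lower() == "occurrenceid"),
        None,
    )
    if column is None:
        raise KeyError("no occurrenceid column found")
    total = missing = 0
    for row in reader:
        total += 1
        if not (row[column] or "").strip():
            missing += 1
    return total, missing


if __name__ == "__main__":
    # Made-up sample; for a real backup, open the file instead, for example:
    # with open("occurrences.csv", newline="", encoding="utf-8") as f: ...
    sample = io.StringIO(
        "catalogNumber,occurrenceID\n"
        "A1,9f1c2d3e-0000-4000-8000-000000000001\n"
        "A2,\n"
    )
    print(count_missing_occurrence_ids(sample))  # (2, 1)
```

A nonzero second number means at least some records still need occurrenceid values before they can be published.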

tip

If your collection was previously published to an external aggregator like GBIF, all or some of your records likely have already had occurrenceid values assigned. These values must be included with your snapshot data in your portal if you intend to republish these records using Symbiota.

Suggested workflows for populating occurrenceid:

If you are confident that your snapshot's records have never been assigned occurrenceid values, you can backfill this information using one of the following methods.

Option 1: Generate GUIDs outside of Symbiota and then bring them into the portal

In your spreadsheet to be imported, include a column/field called "occurrenceid".
Populate this column with GUIDs generated by a tool such as guidgenerator.com, then copy and paste them into your spreadsheet.
When you upload data from a spreadsheet into the portal, map your new occurrenceid values to the corresponding data import field.
- Important: If you already have data in the portal, select the option to match on your existing catalogNumber or occurrenceid values so that duplicate records are not generated upon import.
Once your records are uploaded, the new GUIDs will appear in the "occurrenceid" field/box on the Occurrence Editor form.
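
As an alternative to copying GUIDs from a website, they can be generated in bulk with a short script. This is a minimal sketch, assuming your records can be exported as CSV with an "occurrenceid" column; the `fill_occurrence_ids` helper and the sample rows are hypothetical. It uses random (version 4) UUIDs from Python's standard `uuid` module and leaves any existing value untouched, since previously assigned identifiers must not change.

```python
import csv
import io
import uuid


def fill_occurrence_ids(rows, column="occurrenceid"):
    """Give each row with a blank occurrenceid a new random UUID.

    Rows that already have a value are copied unchanged.
    """
    filled = []
    for row in rows:
        row = dict(row)  # copy so the caller's data is not modified
        if not (row.get(column) or "").strip():
            row[column] = str(uuid.uuid4())
        filled.append(row)
    return filled


if __name__ == "__main__":
    # Made-up sample standing in for a spreadsheet exported as CSV.
    source = io.StringIO("catalogNumber,occurrenceid\nA1,\nA2,\n")
    rows = fill_occurrence_ids(csv.DictReader(source))

    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=["catalogNumber", "occurrenceid"])
    writer.writeheader()
    writer.writerows(rows)
    print(out.getvalue())
```

Save the generated values back to your source records before uploading, so the same GUIDs are sent on every future import.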

Option 2: Use Symbiota-generated GUIDs

Every time you want to send data to GBIF, email help@symbiota.org to request that the Symbiota Support Hub (SSH) populate the occurrenceid field for you.
Important: Once we populate this field, you will have to remember to download a copy of your data from the portal and add the Symbiota-generated GUIDs to wherever you manage your records outside of Symbiota.
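
Copying the Symbiota-generated GUIDs back into your own records can be scripted if both sides share a catalog number. The sketch below is one possible approach under stated assumptions: the column names "catalogNumber" and "occurrenceID" are assumed and may differ in your export and your local system, and the `merge_guids` helper is hypothetical. It never overwrites a GUID that is already present locally and reports any local record it could not match.

```python
def merge_guids(local_rows, portal_rows, key="catalogNumber", guid="occurrenceID"):
    """Copy GUIDs from portal rows into local rows that share a catalog number.

    Returns (merged_rows, unmatched_keys). Existing local GUIDs are kept.
    """
    portal_guids = {
        row[key]: row[guid]
        for row in portal_rows
        if row.get(key) and row.get(guid)
    }
    merged, unmatched = [], []
    for row in local_rows:
        row = dict(row)  # copy so the caller's data is not modified
        if not row.get(guid):
            found = portal_guids.get(row.get(key))
            if found:
                row[guid] = found
            else:
                unmatched.append(row.get(key))
        merged.append(row)
    return merged, unmatched


if __name__ == "__main__":
    # Made-up rows standing in for a local export and a portal download.
    local = [
        {"catalogNumber": "A1", "occurrenceID": ""},
        {"catalogNumber": "A2", "occurrenceID": "already-assigned"},
        {"catalogNumber": "A3", "occurrenceID": ""},
    ]
    portal = [
        {"catalogNumber": "A1", "occurrenceID": "guid-from-portal-1"},
        {"catalogNumber": "A2", "occurrenceID": "guid-from-portal-2"},
    ]
    merged, unmatched = merge_guids(local, portal)
    print(merged)
    print(unmatched)  # ['A3']
```

Records listed as unmatched need to be reviewed by hand, for example where catalog numbers were entered differently in the two systems.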

warning

Regardless of the method you choose to populate occurrenceid, the most important thing is to make sure that your occurrenceid values (typically GUIDs) are retained with your canonical specimen records wherever you manage them outside of Symbiota (in a spreadsheet, MS Access, FileMaker Pro, etc.). With this in mind, we suggest choosing whichever method will be the most sustainable for your collection's internal data management practices. Please contact the SSH's Help Desk if you would like to publish snapshot data from a Symbiota portal and need further guidance.
