Setup and Configuration

This guide walks you through creating a new Sphere and securely connecting it to your Google BigQuery dataset.

Sphere Configuration

This is the primary tab for setting up your Sphere.

  • Name: A unique, human-readable name for your Sphere (e.g., Retail Sales Analytics).
  • Dataset ID: The ID of the BigQuery dataset you want to connect to (e.g., thelook_ecommerce_2). Enter only the dataset name, without the project ID prefix. You can find it in the BigQuery section of your Google Cloud Console.
  • Description: A high-level description of the dataset's purpose. This is for your team's reference.
    • Example: This sphere knows about the e-commerce orders and customers tables.
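BigQuery tools often show dataset references with a project prefix (for example `my-project.thelook_ecommerce_2`), so it is easy to paste the wrong form into the Dataset ID field. The sketch below shows one way to reduce such a reference to the bare dataset name. The function name `bare_dataset_id` and the project name are hypothetical, and the code assumes the input has no table suffix.

```python
import re

# BigQuery dataset names use letters, digits, and underscores (up to 1,024 characters).
_DATASET_NAME = re.compile(r"[A-Za-z0-9_]{1,1024}")


def bare_dataset_id(value: str) -> str:
    """Return the dataset name from 'dataset', 'project.dataset', or 'project:dataset'.

    Assumes the reference carries no table suffix.
    """
    # Standard references separate project and dataset with '.', legacy ones with ':'.
    name = re.split(r"[.:]", value.strip())[-1]
    if not _DATASET_NAME.fullmatch(name):
        raise ValueError(f"not a valid dataset name: {value!r}")
    return name


print(bare_dataset_id("my-project.thelook_ecommerce_2"))
```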

Schema Description

This is a critical field that provides the AI with context about your data's structure and business logic. The more detail you provide here, the more accurately the AI can generate SQL queries.

Best Practices for Schema Descriptions
  • Explain enum mappings: Clearly define what integer or string codes represent.
  • Define business rules: Describe any specific logic related to your data that isn't obvious from the schema alone.
  • Clarify ambiguous columns: Explain any columns with names that could be misinterpreted.

Example of a schema description:

- The `order_status` field in the `orders` table is an integer. Use 1 for 'pending', 2 for 'processing', 3 for 'dispatched', 4 for 'delivered', and 0 for 'cancelled'.
- The `traffic_source` column indicates where the user came from. 'Email' means they clicked a link in a marketing email.
- All timestamps and dates are in the UTC timezone.
- The `user_id` in the `orders` table is a foreign key that references the `id` in the `customers` table.
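If the enum codes already live in application code, generating the matching description lines from that source can help keep the two in sync. This is a minimal sketch, not a DataSphere feature. The helper `enum_line` and the `ORDER_STATUS` mapping are hypothetical names that mirror the example above.

```python
def enum_line(table: str, column: str, mapping: dict[int, str]) -> str:
    """Render one enum-mapping line in the style of the example above."""
    if not mapping:
        raise ValueError("mapping must not be empty")
    pairs = [f"{code} for '{label}'" for code, label in mapping.items()]
    if len(pairs) > 1:
        # Join as "a, b, and c" so the sentence reads naturally.
        listed = ", ".join(pairs[:-1]) + f", and {pairs[-1]}"
    else:
        listed = pairs[0]
    return f"- The `{column}` field in the `{table}` table is an integer. Use {listed}."


# Hypothetical mapping, copied from the example schema description.
ORDER_STATUS = {1: "pending", 2: "processing", 3: "dispatched", 4: "delivered", 0: "cancelled"}

print(enum_line("orders", "order_status", ORDER_STATUS))
```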

Add Key File (Service Account)

To access your BigQuery dataset, DataSphere requires a Google Cloud Service Account Key. This key grants secure, programmatic access to your Google Cloud project.

Creating and Configuring the Service Account

  1. Navigate to Google Cloud Console:
    • Open the Google Cloud Console.
    • In the navigation menu (☰), go to IAM & Admin → Service Accounts.
  2. Create a new Service Account:
    • Click + Create Service Account.
    • Give it a descriptive name, like datasphere-connector.
    • Add a short description of its purpose, e.g., "Service account for Elaniin AI DataSphere to access BigQuery."
    • Click Create and continue.
  3. Grant project-level permissions:
    • In the "Grant this service account access to project" step, open the "Select a role" dropdown, search for BigQuery Job User, and select it. This role allows the service account to run queries (jobs) within your project.
    • Click Continue.
  4. Grant dataset-level permissions:
    • Go to the BigQuery section in your Google Cloud Console.
    • In the BigQuery Explorer, find the dataset you want to connect to.
    • Click the three-dot menu next to the dataset name and select Share and then Manage permissions.
    • In the sharing panel, click Add Principal.
    • In the "New principals" field, paste the email address of the service account you created in Step 2.
    • In the "Select a role" dropdown, search for and select BigQuery Data Viewer.
    • Click Save.

Important Security Note

By granting the BigQuery Data Viewer role at the dataset level, you ensure the service account can only read data from that specific dataset, adhering to the principle of least privilege.

  5. Generate and download the JSON key:
    • Go back to the Service Accounts tab in IAM & Admin.
    • Click on the email address of the service account you created.
    • Navigate to the Keys tab.
    • Click Add Key → Create new key.
    • Select JSON as the key type and click Create.

Handle Your Key Securely

A JSON file will be downloaded to your computer. Treat this file like a password. It contains credentials that grant access to your cloud resources. Do not commit it to version control or share it publicly.
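Before uploading, it can be worth confirming that the downloaded file really is a service account key and that other users on the machine cannot read it. The following is a minimal sketch using only the Python standard library. The function name `check_key_file` is hypothetical, and the field list is a small subset of what a Google service account key contains.

```python
import json
import os
import stat

# A subset of the fields present in a Google service account key file.
REQUIRED_FIELDS = ("type", "project_id", "private_key", "client_email")


def check_key_file(path: str) -> str:
    """Validate a service account key file and return its client_email."""
    with open(path, encoding="utf-8") as fh:
        key = json.load(fh)
    missing = [field for field in REQUIRED_FIELDS if not key.get(field)]
    if missing:
        raise ValueError(f"key file is missing fields: {missing}")
    if key["type"] != "service_account":
        raise ValueError(f"unexpected credential type: {key['type']!r}")
    # Restrict the file to owner read/write (this has limited effect on Windows).
    os.chmod(path, stat.S_IRUSR | stat.S_IWUSR)
    return key["client_email"]
```

The returned `client_email` should match the service account address you added as a principal on the dataset.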

  6. Upload the key to DataSphere:
    • Return to the DataSphere configuration page in the Elaniin AI Platform.
    • Drag and drop the downloaded JSON file into the Add key file section.
    • Click Create Sphere to save your configuration.
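
If the Sphere cannot connect, the two roles granted in Steps 3 and 4 can be checked independently of DataSphere. The sketch below is one possible check, not part of the platform. `check_access` is a hypothetical helper that takes any client object exposing `query` and `list_tables`, as the `google-cloud-bigquery` Python client does, and the key path in the usage comment is a placeholder.

```python
def check_access(client, dataset_id: str) -> list[str]:
    """Exercise both roles; an exception means a permission is missing."""
    # Running any query creates a job, which needs BigQuery Job User on the project.
    list(client.query("SELECT 1").result())
    # Listing tables needs BigQuery Data Viewer on the dataset.
    return sorted(table.table_id for table in client.list_tables(dataset_id))


# Usage with the real client (assumes `pip install google-cloud-bigquery`):
#   from google.cloud import bigquery
#   client = bigquery.Client.from_service_account_json("key.json")  # placeholder path
#   print(check_access(client, "thelook_ecommerce_2"))
```

A permission error on the first call points at the project-level role, while one on the second points at the dataset-level role.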